How does marshalling work?
Hey lads, as you can see in this [LibraryImport] generated code it's calling a native function by just getting the address to a string.. But don't native libraries need strings to be null terminated? I don't see any code within GetPinnableReference() that suggests it makes a character array with a '\0' at the end
77 Replies
.NET strings are null-terminated internally already.
What?! I thought they worked by keeping their length managed
im so confused dawg
about what
about where i got the idea that .net strings just keep their length in mind in order to assume the end of a string
rather than having a null terminator character at the end
they do
like i said, thats the spec way of doing it
you can have a .net implementation that doesnt terminate its strings with null
its just that the current .NET 5+ impl spec does
for optimization reasons
that impl could change in the future, and the logic for marshalling the string would then change
you shouldnt rely on this impl detail, you should just use the bcl string marshaller
they do both
their code appears to be generated by LibraryImport
but yes
right i see, so before .NET 5 they weren't null terminated?
they have always been, afaik
specifically for interop scenarios
im not really sure
well ill just write down that net strings are already null terminated then
it's for a thesis
where can i get more info on this?
im writing a thesis that includes marshalling that's why i thought it's important to write down how marshalling works
like "oh marshalling in C# means taking a string and turning it into a null terminated byte array"
sure
it doesn't turn it in anything, it pins the string, then gets a pointer ref to the underlying char array
(note that char in .net is 16-bit, not 8)
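To make the 16-bit point concrete, here is a minimal sketch of the bytes a native function would see behind a pinned "Hi". The class and method names are made up for illustration, and the sketch builds a copy only so the bytes can be printed; pinning itself does not copy anything.

```csharp
using System;
using System.Text;

public static class Utf16LayoutDemo
{
    // Hypothetical helper: returns the UTF-16LE bytes of s followed by a
    // two-byte terminator. This is a copy made only for display purposes.
    public static byte[] BytesWithTerminator(string s)
    {
        return Encoding.Unicode.GetBytes(s + "\0");
    }

    public static void Main()
    {
        byte[] bytes = BytesWithTerminator("Hi");
        // Each char takes two bytes, and so does the terminator.
        Console.WriteLine(BitConverter.ToString(bytes)); // 48-00-69-00-00-00
    }
}
```

So a native function that expects 8-bit char* text would stop after the first byte of "Hi", since the second byte is already zero.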
in my mind "turn it into" doesnt imply new data
in this context
"turn into" implies conversion to me
but apparently everyone else understands semantics differently than me after that convo in chat lmao
i guess conversion only happens when youre calling a native function that requires ANSI encoded strings
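A rough sketch of that conversion case, using UTF-8 rather than ANSI so it behaves the same on every platform. The class and method names are hypothetical; the Marshal calls are the standard ones. Unlike pinning, this allocates a new unmanaged buffer that has to be freed.

```csharp
using System;
using System.Runtime.InteropServices;

public static class Utf8MarshalDemo
{
    // Hypothetical helper: copies s into a new null-terminated UTF-8 buffer,
    // reads it back the way native-facing code would, then frees the buffer.
    public static string RoundTrip(string s)
    {
        // Allocates unmanaged memory holding the UTF-8 bytes plus a '\0' byte.
        IntPtr native = Marshal.StringToCoTaskMemUTF8(s);
        try
        {
            // Reads bytes up to the first zero byte and decodes them as UTF-8.
            return Marshal.PtrToStringUTF8(native) ?? string.Empty;
        }
        finally
        {
            Marshal.FreeCoTaskMem(native);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("hello") == "hello"); // True
    }
}
```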
i see

Apparently chatGPT is where i got the idea that C# strings aren't null terminated by default
oh
can i get a source to this
(well unless youre writing the unsafe code thats marshalling the string)
yk that .NET strings are null terminated internally
i dont remember where i read about this, im sure you can find something when you look it up
it might be mentioned in that article about how string builders work
you could look at the source code
or that
better that than relying on chatgpt

The method doc already implies it so
i guess thats good enough
no it isnt
it is specced
A char* value produced by fixing a string instance always points to a null-terminated string. Within a fixed statement that obtains a pointer p to a string instance s, the pointer values ranging from p to p + s.Length ‑ 1 represent addresses of the characters in the string, and the pointer value p + s.Length always points to a null character (the character with value ‘\0’).
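A minimal sketch of what that paragraph guarantees, assuming a project with AllowUnsafeBlocks enabled. The class and method names are made up for illustration.

```csharp
using System;

public static class FixedStringDemo
{
    // Requires <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file.
    // Hypothetical helper: returns the char located one past the last
    // character of a non-null string while the string is pinned.
    public static unsafe char CharAtLength(string s)
    {
        fixed (char* p = s)
        {
            // p[0] .. p[s.Length - 1] are the characters of the string;
            // p[s.Length] is the terminator the spec paragraph describes.
            return p[s.Length];
        }
    }

    public static void Main()
    {
        Console.WriteLine(CharAtLength("hello") == '\0'); // True
    }
}
```

No copy is made here: p points into the string object itself for as long as the fixed block runs.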
oh really?
oops sorry about that
thats surprising tho
why would they spec it?
because it's very useful?
fair enough
in my head i thought of the spec as like a "pure interface"
sure, and part of that interface is that string instances end with \0 for the purpose of being used with interop
yeah but kind of leaking an impl into the interface
how so
the implementation could copy the string when you fix it, if it really wanted to
because null termination is not required to do anything that strings already do/want to do
sure it is
they want to be used for interop when fixed
yeah i get it
im just saying it feels like leaking an impl
im not saying i disagree or anything
it's as much leaking an implementation detail as any API guarantee that isn't directly in the signature
is int.Abs returning a positive integer "leaking an implementation detail", or is it just what that API is for
i dont follow
thats in the signature
oh
fixed/GetPinnableReference can't encode in their signature/syntax that the returned value is null terminated, they just return char*/ref char
well, not in the method signature, but in the "api" signature
but you are guaranteed that it is null terminated anyway
same as how Abs returns an int, just that int is guaranteed to be >= 0
but thats not the same as saying the spec doesnt need to have it
wdym
the spec needs to have it for fixed to behave that way
yeah but it doesnt have to be fixed that way
I'm not sure what you mean
this is unsafe code, guarantees like this arent required
yes they are
everything in the managed world can still work the exact same way without null termination being in the spec
otherwise unsafe code couldn't do meaningful work
why
what i mean is, marshalling strings could work by copying the string and adding a null at the end
why does that break doing meaningful work
you can still implement it that way in a runtime if you really want to
yeah, so its up to the impl
that was my point
I don't get it
you can achieve the exact same things without the impl that uses null terminated string
I do not see how specifying what fixed does is different than specifying what any other language feature does
of course you can change the language and still mostly do the same things
but fixing a string for unmanaged code is a different territory
you change that to copy the string
you could make &a actually allow copying a before giving you a pointer if you really wanted to
you cant change int.Abs to do different behavior
and code could still be written that does the same thing
but why would you do that
int.Abs says "returns an int >= 0" and marshalling a string says "returns null terminated string", whether it reuses the same one or not doesnt matter
I'm not sure how that equates to exposing an implementation detail
its not exactly exposing an impl detail
thats why i said "feels like"
i dont really know how to explain it other than this
it is fixing a detail in the same way any specced thing is
sure
idk
same
anyway, it is not just implied; you are guaranteed that it is null terminated
by the language, if not the runtime. I can't remember if the runtime says this anywhere
I wonder what happens if you marshal a .NET string that has an internal null character?
it will have a null character where it did in the string
and anything you gave the marshalled string to will see that and probably think that's where the string ends
And if that context expects a null terminated string, then I guess you just lost data.
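A small sketch of that data loss. The class and method names are hypothetical, and Marshal.PtrToStringUni stands in for a native function that reads up to the first null character.

```csharp
using System;
using System.Runtime.InteropServices;

public static class EmbeddedNullDemo
{
    // Hypothetical helper: copies s to unmanaged UTF-16 memory, then reads it
    // back the way a null-terminated consumer would (stop at the first '\0').
    public static string AsNativeWouldSee(string s)
    {
        IntPtr native = Marshal.StringToHGlobalUni(s);
        try
        {
            return Marshal.PtrToStringUni(native) ?? string.Empty;
        }
        finally
        {
            Marshal.FreeHGlobal(native);
        }
    }

    public static void Main()
    {
        string managed = "ab\0cd";
        Console.WriteLine(managed.Length);                   // 5
        Console.WriteLine(AsNativeWouldSee(managed).Length); // 2
    }
}
```

The managed string still knows it has five characters; anything that relies on the terminator only ever sees "ab".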
Ah, thanks for this mate