How does marshalling work?

```csharp
public static partial int MessageBoxW(nint hWnd, string text, string caption, uint type)
{
    int __retVal;
    // Pin - Pin data in preparation for calling the P/Invoke.
    fixed (void* __caption_native = &global::System.Runtime.InteropServices.Marshalling.Utf16StringMarshaller.GetPinnableReference(caption))
    fixed (void* __text_native = &global::System.Runtime.InteropServices.Marshalling.Utf16StringMarshaller.GetPinnableReference(text))
    {
        __retVal = __PInvoke(hWnd, (ushort*)__text_native, (ushort*)__caption_native, type);
    }
    // ... (rest of the generated method not shown)
}
```
Hey lads, as you can see in this [LibraryImport]-generated code, it's calling a native function by just getting the address of a string. But don't native libraries need strings to be null terminated? I don't see any code within GetPinnableReference() that suggests it makes a character array with a '\0' at the end.
77 Replies
jcotton42 · 3w ago
.NET strings are null-terminated internally already.
☠ Pointman ☠ (OP) · 3w ago
What?! I thought they worked by keeping track of their length.
sibber · 3w ago
do note that that's an implementation detail.
they do, it's an impl detail for optimizations.
☠ Pointman ☠ (OP) · 3w ago
I'm so confused dawg
sibber · 3w ago
about what?
☠ Pointman ☠ (OP) · 3w ago
about where I got the idea that .NET strings just keep track of their length to know where a string ends, rather than having a null terminator character at the end
sibber · 3w ago
they do, like I said. that's the spec way of doing it.
you can have a .NET implementation that doesn't terminate its strings with null.
it's just that the current .NET 5+ impl does, for optimization reasons.
that impl could change in the future, and the logic for marshalling the string would then change.
you shouldn't rely on this impl detail, you should just use the BCL string marshaller.
jcotton42 · 3w ago
they do both.
their code appears to be generated by LibraryImport, but yes.
☠ Pointman ☠ (OP) · 3w ago
right, I see. so before .NET 5 they weren't null terminated?
jcotton42 · 3w ago
they have always been, afaik, specifically for interop scenarios.
sibber · 3w ago
I'm not really sure
☠ Pointman ☠ (OP) · 3w ago
well, I'll just write down that .NET strings are already null terminated then.
it's for a thesis. where can I get more info on this?
sibber · 3w ago
just keep in mind that you shouldn't rely on this and don't even need to know it
☠ Pointman ☠ (OP) · 3w ago
I'm writing a thesis that includes marshalling, that's why I thought it's important to write down how marshalling works, like "oh, marshalling in C# means taking a string and turning it into a null terminated byte array"
sibber · 3w ago
sure
jcotton42 · 3w ago
it doesn't turn it into anything. it pins the string, then gets a pointer/ref to the underlying char array (note that char in .NET is 16-bit, not 8)
sibber · 3w ago
in my mind "turn it into" doesn't imply new data in this context
jcotton42 · 3w ago
"turn into" implies conversion to me
sibber · 3w ago
but apparently everyone else understands semantics differently than me, after that convo in chat lmao
☠ Pointman ☠ (OP) · 3w ago
I guess conversion only happens when you're calling a native function that requires ANSI encoded strings.
I see
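For contrast with the pin-only UTF-16 path, here is a minimal sketch of the copying case mentioned above, where the native side wants an ANSI string. It uses `Marshal.StringToHGlobalAnsi`, which allocates a new null-terminated native buffer; the class and method names are hypothetical and exist only for illustration, and this is not the code that `[LibraryImport]` itself generates.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AnsiCopyDemo
{
    // Copies s into freshly allocated native memory as a null-terminated
    // ANSI string, reads the bytes back for inspection, then frees the buffer.
    public static byte[] ToNullTerminatedAnsi(string s)
    {
        IntPtr native = Marshal.StringToHGlobalAnsi(s);
        try
        {
            // Find the terminator the marshaller appended.
            int length = 0;
            while (Marshal.ReadByte(native, length) != 0)
            {
                length++;
            }

            byte[] copy = new byte[length + 1]; // +1 keeps the terminator
            Marshal.Copy(native, copy, 0, copy.Length);
            return copy;
        }
        finally
        {
            // The native copy is not garbage collected; it must be freed manually.
            Marshal.FreeHGlobal(native);
        }
    }
}
```

Calling `AnsiCopyDemo.ToNullTerminatedAnsi("hi")` yields three bytes: the two ASCII characters followed by a zero byte. Unlike the `fixed` path, the managed string's own memory is never handed to native code here.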
☠ Pointman ☠ (OP) · 3w ago
[image attachment, no description]
☠ Pointman ☠ (OP) · 3w ago
Apparently ChatGPT is where I got the idea that C# strings aren't null terminated by default
sibber · 3w ago
he's not wrong (edit: he is wrong, see below).
from C#'s perspective that's the case: the trailing zero is purely an optimization, the language doesn't use it at all or know about it
☠ Pointman ☠ (OP) · 3w ago
oh, can I get a source for this?
sibber · 3w ago
(well, unless you're writing the unsafe code that's marshalling the string)
☠ Pointman ☠ (OP) · 3w ago
yk, that .NET strings are null terminated internally
sibber · 3w ago
I don't remember where I read about this, I'm sure you can find something when you look it up.
it might be mentioned in that article about how string builders work
Anchy · 3w ago
you could look at the source code
sibber · 3w ago
or that
Anchy · 3w ago
better that than relying on ChatGPT
☠ Pointman ☠ (OP) · 3w ago
[image attachment, no description]
☠ Pointman ☠ (OP) · 3w ago
The method doc already implies it, so I guess that's good enough
Aaron · 3w ago
no it isn't.
it is specced:
A char* value produced by fixing a string instance always points to a null-terminated string. Within a fixed statement that obtains a pointer p to a string instance s, the pointer values ranging from p to p + s.Length ‑ 1 represent addresses of the characters in the string, and the pointer value p + s.Length always points to a null character (the character with value ‘\0’).
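A minimal sketch of what the quoted spec text guarantees, assuming the project is compiled with unsafe code enabled (`AllowUnsafeBlocks`). The class and method names are made up for illustration.

```csharp
using System;

public static class FixedStringDemo
{
    // Returns the char located one past the last character of s,
    // read through a pointer obtained by fixing the string.
    public static unsafe char CharAfterEnd(string s)
    {
        fixed (char* p = s)
        {
            // p[0] .. p[s.Length - 1] are the string's characters;
            // per the spec text quoted above, p[s.Length] is '\0'.
            return p[s.Length];
        }
    }
}
```

`FixedStringDemo.CharAfterEnd("hello")` returns `'\0'`, even though `"hello".Length` is 5 and the terminator is never visible through the normal string API.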
sibber · 3w ago
oh really? oops, sorry about that.
that's surprising tho, why would they spec it?
Aaron · 3w ago
because it's very useful?
sibber · 3w ago
fair enough.
in my head I thought of the spec as like a "pure interface"
Aaron · 3w ago
sure, and part of that interface is that string instances end with \0 for the purpose of being used with interop
sibber · 3w ago
yeah, but that's kind of leaking an impl into the interface
Aaron · 3w ago
how so?
the implementation could copy the string when you fix it, if it really wanted to
sibber · 3w ago
because null termination is not required for anything that strings already do or want to do
Aaron · 3w ago
sure it is, they want to be used for interop when fixed
sibber · 3w ago
yeah, I get it. I'm just saying it feels like leaking an impl, I'm not saying I disagree or anything
Aaron · 3w ago
it's as much leaking an implementation detail as any API guarantee that isn't directly in the signature.
is int.Abs returning a non-negative integer "leaking an implementation detail", or is it just what that API is for?
sibber · 3w ago
I don't follow, that's in the signature.
oh
Aaron · 3w ago
fixed/GetPinnableReference can't encode in their signature/syntax that the returned value is null terminated, they just return char*/ref char
sibber · 3w ago
well, not in the method signature, but in the "api" signature
Aaron · 3w ago
but you are guaranteed that it is null terminated anyway.
same as how Abs returns an int, just that int is guaranteed to be >= 0
sibber · 3w ago
but that's not the same as saying the spec doesn't need to have it
Aaron · 3w ago
wdym?
the spec needs to have it for fixed to behave that way
sibber · 3w ago
yeah, but it doesn't have to be fixed that way
Aaron · 3w ago
I'm not sure what you mean
sibber · 3w ago
this is unsafe code, guarantees like this aren't required
Aaron · 3w ago
yes they are
sibber · 3w ago
everything in the managed world can still work the exact same way without null termination being in the spec
Aaron · 3w ago
otherwise unsafe code couldn't do meaningful work
sibber · 3w ago
why?
what I mean is, marshalling strings could work by copying the string and adding a null at the end. why does that break doing meaningful work?
Aaron · 3w ago
you can still implement it that way in a runtime if you really want to
sibber · 3w ago
yeah, so it's up to the impl. that was my point
Aaron · 3w ago
I don't get it
sibber · 3w ago
you can achieve the exact same things without the impl that uses null terminated strings
Aaron · 3w ago
I do not see how specifying what fixed does is different than specifying what any other language feature does.
of course you can change the language and still mostly do the same things
sibber · 3w ago
but fixing a string for unmanaged code is a different territory.
you can change that to copy the string
Aaron · 3w ago
you could make &a actually allow copying a before giving you a pointer if you really wanted to
sibber · 3w ago
you can't change int.Abs to have different behavior
Aaron · 3w ago
and code could still be written that does the same thing.
but why would you do that?
sibber · 3w ago
int.Abs says "returns an int >= 0" and marshalling a string says "returns null terminated string", whether it reuses the same one or not doesn't matter
Aaron · 3w ago
I'm not sure how that equates to exposing an implementation detail
sibber · 3w ago
it's not exactly exposing an impl detail, that's why I said "feels like".
I don't really know how to explain it other than this
Aaron · 3w ago
it is fixing a detail in the same way any specced thing is
sibber · 3w ago
sure
Aaron · 3w ago
idk
sibber · 3w ago
same
Aaron · 3w ago
anyway, it is not just implied.
you are guaranteed that it is null terminated by the language, if not the runtime.
I can't remember if the runtime says this anywhere
MarkPflug · 3w ago
I wonder what happens if you marshal a .NET string that has an internal null character?
Aaron · 3w ago
it will have a null character where it did in the string, and anything you gave the marshalled string to will see that and probably think that's where the string ends
MarkPflug · 3w ago
And if that context expects a null terminated string, then I guess you just lost data.
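The truncation described above can be sketched without calling any native library. The `string(char*)` constructor reads UTF-16 characters until the first null, so it stands in here for a native consumer of null-terminated strings. This assumes unsafe code is enabled, and the class and method names are hypothetical.

```csharp
using System;

public static class EmbeddedNullDemo
{
    // Reads s back the way a consumer of null-terminated UTF-16 would:
    // new string(char*) stops at the first '\0' it encounters.
    public static unsafe string AsSeenByNullTerminatedReader(string s)
    {
        fixed (char* p = s)
        {
            return new string(p);
        }
    }
}
```

For the input `"ab\0cd"`, whose managed `Length` is 5, the method returns `"ab"`: everything after the embedded null is invisible to a reader that relies on the terminator.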
☠ Pointman ☠ (OP) · 3w ago
Ah, thanks for this mate
