❔ How to test for invalid chars before XML serialisation?

Inside the XML serialiser, System.Xml.XmlCharType.IsCharData(char ch) is called. But XmlCharType is internal, so I can't run that check myself without reflection tricks to bypass its visibility. Is there some way I can detect invalid chars myself without cloning that code? Ideally I'd like to do a pre-process pass over the text to handle bad chars (or use an exception handler to post-process and retry on exception). I'm forced to accept free-text user input with no opportunity to reject it, so I can't just throw and produce no XML at all here. At the very least I should be able to detect and handle the junk record.
4 Replies
Aart Bluestoke 8mo ago
The specific issue here was a char(2) (U+0002) embedded in a string, which crashed the file creation. From checking the source code, only the following UTF-16 code units will crash the XML serialiser, so I could implement the check myself if I have to:
000000000000xxxx except 1001, 1010, 1101 (U+0000 to U+000F, except tab, LF and CR)
11011xxxxxxxxxxx (U+D800 to U+DFFF, the surrogates)
000000000001xxxx (U+0010 to U+001F)
111111111111111x (U+FFFE and U+FFFF)
But that function already has well-optimised access to an in-memory lookup table ...
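For the pre-process pass, one option that avoids both reflection and cloning the table is the public `System.Xml.XmlConvert.IsXmlChar(char)`, which as far as I know is backed by the same lookup table. A minimal sketch follows, assuming that bad code units should be swapped for U+FFFD rather than dropped. The names `XmlText`, `IsSerializable` and `Sanitize` are made up for illustration. `IsXmlChar` reports every surrogate as invalid on its own, so the sketch keeps well-formed surrogate pairs explicitly and only replaces lone ones.

```csharp
using System.Text;
using System.Xml;

// Hypothetical helper, not part of System.Xml.
public static class XmlText
{
    // True when every code unit is a legal XML 1.0 character
    // or part of a well-formed surrogate pair.
    public static bool IsSerializable(string input)
    {
        return input == null || Sanitize(input) == input;
    }

    // Replaces code units that XML 1.0 cannot carry.
    // The U+FFFD default is an assumption; pick whatever suits the data.
    public static string Sanitize(string input, char replacement = '\uFFFD')
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (XmlConvert.IsXmlChar(c))
            {
                // Ordinary legal character (tab, LF, CR, U+0020..U+D7FF, U+E000..U+FFFD).
                sb.Append(c);
            }
            else if (i + 1 < input.Length && char.IsSurrogatePair(c, input[i + 1]))
            {
                // High + low surrogate: a legal character above U+FFFF, keep both units.
                sb.Append(c).Append(input[i + 1]);
                i++;
            }
            else
            {
                // Control character, lone surrogate, U+FFFE or U+FFFF.
                sb.Append(replacement);
            }
        }
        return sb.ToString();
    }
}
```

Usage would be something like `record.Comment = XmlText.Sanitize(record.Comment);` before serialising, with `IsSerializable` available when the junk record should be flagged rather than silently repaired (`record.Comment` is a placeholder for whichever field holds the user text).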
WEIRD FLEX 8mo ago
So are you saying this internal method would check for the problematic chars you found? How likely is it that users enter those characters? How much of an issue is speed? I thought the serializer was pretty safe; I don't believe I've ever had crashes with it.
JakenVeina 8mo ago
I'm inclined to say that you should just let the serializer handle it, but I'm guessing the APIs are poor? E.g. it just throws when encountering a bad character? What is your intention in doing this check yourself, instead of letting the serializer do it?
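If the goal is to let the framework decide and react only when it objects, the public `XmlConvert.VerifyXmlChars(string)` might be a middle ground: it returns the string when every character is legal and throws `XmlException` otherwise, so a single field can be checked without serialising the whole document. The sketch below wraps it in a try-pattern. `XmlTextCheck` and `TryVerify` are hypothetical names, and treating `null` as valid is an assumption made here (the underlying call rejects `null`).

```csharp
using System.Xml;

// Hypothetical wrapper, not part of System.Xml.
public static class XmlTextCheck
{
    // Returns true when the framework accepts the text;
    // otherwise 'error' carries the framework's own message for logging.
    public static bool TryVerify(string text, out string error)
    {
        error = null;
        if (text == null) return true; // assumption: nothing to serialise, nothing to reject

        try
        {
            XmlConvert.VerifyXmlChars(text);
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

A record that fails `TryVerify` can then be logged, repaired and retried instead of aborting the whole file.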