Strip the byte order mark from a string in C#

I recently ran into this problem after upgrading to .NET 4. Up to and including .NET 3.5, the simple answer was that

String.Trim()

removes the BOM from the start and end of a string.

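However, in .NET 4 you need to pass the character explicitly. Here text stands for your own string variable, and Trim returns a new string rather than changing the original:

text = text.Trim(new char[]{'\uFEFF'});

That gets rid of the byte order mark again. You may also want to remove the ZERO WIDTH SPACE (U+200B), which Trim() likewise stopped removing in .NET 4:

text = text.Trim(new char[]{'\uFEFF','\u200B'});

You can use the same approach to remove other unwanted characters.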
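
The Trim calls above could be wrapped in a small helper so the character list lives in one place. This is only a sketch: the class name BomStripper and the method name StripInvisible are made up for illustration and are not part of .NET.

```csharp
using System;

public static class BomStripper
{
    // The two characters that Trim() no longer removes as of .NET Framework 4.
    private static readonly char[] InvisibleChars = { '\uFEFF', '\u200B' };

    // Hypothetical helper: removes leading and trailing U+FEFF / U+200B only.
    public static string StripInvisible(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        return input.Trim(InvisibleChars);
    }
}

public static class Program
{
    public static void Main()
    {
        string raw = "\uFEFFhello\u200B";
        string clean = BomStripper.StripInvisible(raw);
        Console.WriteLine(clean.Length); // 5
    }
}
```

Note that Trim with an explicit character array removes only those characters, and only from the two ends of the string. Ordinary white space is left alone, and a BOM that sits behind a leading space stays in place unless you add the white-space characters to the array as well.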

Some further information, from the documentation for the String.Trim method:

The .NET Framework 3.5 SP1 and earlier versions maintain an internal list of white-space characters that this method trims. Starting with the .NET Framework 4, the method trims all Unicode white-space characters (that is, characters that produce a true return value when they are passed to the Char.IsWhiteSpace method). Because of this change, the Trim method in the .NET Framework 3.5 SP1 and earlier versions removes two characters, ZERO WIDTH SPACE (U+200B) and ZERO WIDTH NO-BREAK SPACE (U+FEFF), that the Trim method in the .NET Framework 4 and later versions does not remove. In addition, the Trim method in the .NET Framework 3.5 SP1 and earlier versions does not trim three Unicode white-space characters: MONGOLIAN VOWEL SEPARATOR (U+180E), NARROW NO-BREAK SPACE (U+202F), and MEDIUM MATHEMATICAL SPACE (U+205F).
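
To see how your own runtime classifies these characters, you could ask Char.IsWhiteSpace directly, since that is the test the quoted documentation says Trim() uses from .NET Framework 4 onward. A minimal sketch follows; the class name WhiteSpaceCheck and the method Describe are invented for this example. Going by the quote, U+FEFF and U+200B should report False there, which is why Trim() leaves them alone.

```csharp
using System;

public static class WhiteSpaceCheck
{
    // Formats a character as "U+XXXX -> True/False" using Char.IsWhiteSpace.
    public static string Describe(char c)
    {
        return string.Format("U+{0:X4} -> {1}", (int)c, char.IsWhiteSpace(c));
    }

    public static void Main()
    {
        // BOM, zero width space, and two of the spaces mentioned in the quote.
        char[] candidates = { '\uFEFF', '\u200B', '\u202F', '\u205F' };
        foreach (char c in candidates)
        {
            Console.WriteLine(Describe(c));
        }
    }
}
```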
