String sorting issue in C#

I suspect that in the last case “-” is treated differently because of culture-specific settings (perhaps as a “dash” rather than a “minus” as in the first strings). MSDN warns about this:

The comparison uses the current culture to obtain culture-specific
information such as casing rules and the alphabetic order of
individual characters. For example, a culture could specify that
certain combinations of characters be treated as a single character,
or uppercase and lowercase characters be compared in a particular way,
or that the sorting order of a character depends on the characters
that precede or follow it.

See also this MSDN page:

The .NET Framework uses three distinct ways of sorting: word sort,
string sort, and ordinal sort. Word sort performs a culture-sensitive
comparison of strings. Certain nonalphanumeric characters might have
special weights assigned to them; for example, the hyphen (“-”) might
have a very small weight assigned to it so that “coop” and “co-op”
appear next to each other in a sorted list. String sort is similar to
word sort, except that there are no special cases; therefore, all
nonalphanumeric symbols come before all alphanumeric characters.
Ordinal sort compares strings based on the Unicode values of each
element of the string.

So the hyphen gets special treatment in the default (word sort) mode, which is meant to make the ordering look more “natural”.
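
To see the three modes from the quoted passage side by side, here is a minimal sketch. It assumes the invariant culture (the original question used the current culture), and `SortModesDemo` and `CompareWith` are hypothetical names introduced only for illustration. The culture-sensitive results can differ between .NET versions and platforms, so no expected output is given for them; only the ordinal comparison is fixed by the code unit values.

```csharp
using System;
using System.Globalization;

public static class SortModesDemo
{
    // Hypothetical helper: compares two strings under the invariant culture
    // (an assumption; the question used the current culture).
    public static int CompareWith(string a, string b, CompareOptions options)
    {
        return CultureInfo.InvariantCulture.CompareInfo.Compare(a, b, options);
    }

    public static void Main()
    {
        // Word sort (the default): the hyphen may carry a very small weight.
        Console.WriteLine(CompareWith("a.a", "a-a", CompareOptions.None));

        // String sort: no special cases for nonalphanumeric symbols.
        Console.WriteLine(CompareWith("a.a", "a-a", CompareOptions.StringSort));

        // Ordinal sort: compares UTF-16 code units ('.' is U+002E, '-' is U+002D),
        // so this result is positive regardless of culture.
        Console.WriteLine(CompareWith("a.a", "a-a", CompareOptions.Ordinal));
    }
}
```

Note that `CompareOptions.Ordinal` cannot be combined with other options, which is why each mode gets its own call.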

You can get a “normal” ordinal sort if you ask for it explicitly:

     // Default overload: culture-sensitive word sort
     Console.WriteLine(string.Compare("a.", "a-"));                  // 1
     Console.WriteLine(string.Compare("a.a", "a-a"));                // -1

     // Explicit ordinal comparison: '.' (U+002E) sorts after '-' (U+002D)
     Console.WriteLine(string.Compare("a.", "a-", StringComparison.Ordinal));    // 1
     Console.WriteLine(string.Compare("a.a", "a-a", StringComparison.Ordinal));  // 1

To sort the original collection using ordinal comparison, use:

     items.Sort(StringComparer.Ordinal);
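
For a complete picture, here is a small self-contained sketch of that call. The sample strings are hypothetical (the original collection is not shown), and `OrdinalSortDemo` and `SortOrdinal` are names made up for this example; it assumes the collection is a `List<string>`.

```csharp
using System;
using System.Collections.Generic;

public static class OrdinalSortDemo
{
    // Hypothetical helper: returns a copy of the input sorted by ordinal comparison.
    public static List<string> SortOrdinal(IEnumerable<string> source)
    {
        var sorted = new List<string>(source);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for the original collection.
        var items = new List<string> { "a.a", "a-a", "a.", "a-" };

        // '-' (U+002D) precedes '.' (U+002E), and a prefix precedes the longer string.
        Console.WriteLine(string.Join(", ", SortOrdinal(items)));
        // prints: a-, a-a, a., a.a
    }
}
```

Because ordinal comparison ignores culture entirely, the result is the same on every machine, which is usually what you want for identifiers, keys, and file-like names rather than text shown to users.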
