Difference between string and StringBuilder in C#

A string instance is immutable. You cannot change it after it has been created. Any operation that appears to change the string instead returns a new instance:

string foo = "Foo";
// returns a new string instance instead of changing the old one
string bar = foo.Replace('o', 'a');
string baz = foo + "bar"; // ditto here

Immutable objects have some nice properties: they can be shared across threads without synchronization problems, and you can hand out your private backing fields directly without fearing that someone changes objects they shouldn’t be changing (compare arrays or mutable lists, which often need to be copied before you return them if that’s not desired). But when used carelessly, immutable strings can cause severe performance problems. That is true of nearly anything, though; if you need an example from a language that prides itself on speed of execution, look at C’s string manipulation functions.
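To make the point about backing fields concrete, here is a minimal sketch. The Person class and its members are made up for illustration; the string field is returned as-is, while the array has to be cloned to stay protected.

```csharp
using System;

// Hypothetical class, invented for this illustration only.
class Person
{
    private readonly string name;
    private readonly int[] scores;

    public Person(string name, int[] scores)
    {
        this.name = name;
        // Copy on the way in, so the caller's array cannot alias ours.
        this.scores = (int[])scores.Clone();
    }

    // Safe to hand out directly: nobody can modify a string instance.
    public string Name => name;

    // Arrays are mutable, so hand out a copy instead of the field itself.
    public int[] Scores => (int[])scores.Clone();
}

class Program
{
    static void Main()
    {
        var p = new Person("Ann", new[] { 1, 2, 3 });

        int[] copy = p.Scores;
        copy[0] = 99;                                  // changes only the copy
        Console.WriteLine(p.Scores[0]);                // 1

        Console.WriteLine(p.Name.ToUpperInvariant());  // ANN (a new string)
        Console.WriteLine(p.Name);                     // Ann (unchanged)
    }
}
```

Without the Clone() calls, the write to copy[0] would have changed the private state of the Person.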

When you need a mutable string, such as one you’re constructing piece by piece or one you change many times, use a StringBuilder, which is a buffer of characters that can be modified in place. The difference is mostly about performance: if you emulate a mutable string with normal string instances, you end up creating and discarding lots of objects unnecessarily, whereas a StringBuilder instance itself changes, removing the need for many new objects.
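As a small sketch of what “the instance itself changes” means: the StringBuilder methods used below modify the buffer and return the same instance, rather than a new object. The class and method names (BuilderDemo, Build) are just placeholders for this example.

```csharp
using System;
using System.Text;

class BuilderDemo
{
    public static string Build()
    {
        var sb = new StringBuilder("Foo");

        // Replace modifies the buffer in place and returns the same builder.
        StringBuilder same = sb.Replace('o', 'a');
        Console.WriteLine(object.ReferenceEquals(sb, same)); // True

        sb.Append("bar");    // buffer is now "Faabar"
        sb.Insert(0, "> ");  // buffer is now "> Faabar"

        // Only here is an actual string instance created.
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Build()); // > Faabar
    }
}
```

Compare this with foo.Replace('o', 'a') on a string, which leaves foo untouched and returns a new instance.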

Simple example: The following will make many programmers cringe with pain:

string s = string.Empty;
for (int i = 0; i < 1000; i++) {
  s += i.ToString() + " ";
}

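You’ll end up creating roughly 2000 strings here: one for each i.ToString() and one for each concatenation result. Nearly all of them are thrown away immediately, and each concatenation copies everything accumulated so far. The same example using StringBuilder: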

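// StringBuilder lives in the System.Text namespace
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1000; i++) {
  sb.Append(i);
  sb.Append(' ');
}
string s = sb.ToString(); // create the final string once, at the end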

This should place much less stress on the memory allocator 🙂
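If you want to see the difference on your own machine, a rough timing sketch could look like the following. The method names and the iteration count are arbitrary choices of mine, and Stopwatch gives only a crude measurement, so treat the numbers as an indication rather than a proper benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text;

class ConcatTiming
{
    // Builds the result with repeated string concatenation.
    public static string WithConcat(int n)
    {
        string s = string.Empty;
        for (int i = 0; i < n; i++) {
            s += i.ToString() + " ";
        }
        return s;
    }

    // Builds the same result with a StringBuilder.
    public static string WithBuilder(int n)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.Append(i);
            sb.Append(' ');
        }
        return sb.ToString();
    }

    static void Main()
    {
        const int n = 20000; // arbitrary; large enough for the gap to show

        var sw = Stopwatch.StartNew();
        string a = WithConcat(n);
        sw.Stop();
        long concatMs = sw.ElapsedMilliseconds;

        sw.Restart();
        string b = WithBuilder(n);
        sw.Stop();
        long builderMs = sw.ElapsedMilliseconds;

        Console.WriteLine($"concat: {concatMs} ms, builder: {builderMs} ms");
        Console.WriteLine(a == b); // both approaches produce the same text
    }
}
```

The actual timings depend on the runtime and the machine, so none are quoted here.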

It should be noted, however, that the C# compiler is reasonably smart when it comes to strings. For example, the following line

string foo = "abc" + "def" + "efg" + "hij";

will be joined by the compiler, leaving only a single string at runtime. Similarly, lines such as

string foo = a + b + c + d + e + f;

will be rewritten to

string foo = string.Concat(a, b, c, d, e, f);

so you don’t have to pay for five separate concatenations, which would be the naïve way of handling that. This won’t save you in loops like the one above (unless the compiler unrolls the loop, but I think only the JIT might actually do so, and you’d better not bet on that).
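A quick way to observe the constant folding is to compare references: a concatenation of literals is a compile-time constant, so it ends up as the same interned literal as the already-joined text. The sketch below also checks that the + chain and an explicit string.Concat call give the same value; the variable names are placeholders.

```csharp
using System;

class FoldingDemo
{
    static void Main()
    {
        // Folded at compile time into the single literal "abcdef".
        string folded = "abc" + "def";
        Console.WriteLine(object.ReferenceEquals(folded, "abcdef")); // True

        // Non-constant operands: joined at runtime in one call.
        string a = "x", b = "y", c = "z";
        string viaPlus = a + b + c;
        string viaConcat = string.Concat(a, b, c);
        Console.WriteLine(viaPlus == viaConcat); // True
    }
}
```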
