Why is modifying a string through a retrieved pointer to its data not allowed?

Why can’t we write directly to this buffer?

I’ll state the obvious point: because it’s const. And casting away const and then modifying the data is… rude.

Now, why is it const? That goes back to the days when copy-on-write was considered a good idea, so std::basic_string had to allow implementations to support it. It would be very useful to get an immutable pointer to the string (for passing to C APIs, for example) without incurring the overhead of a copy. So c_str needed to return a const pointer.

As for why it’s still const? Well… that goes to an oddball thing in the standard: the null terminator.

This is legitimate code:

std::string stupid;
const char *pointless = stupid.c_str();

pointless must be a NUL-terminated string. Specifically, it must be a pointer to a NUL character. So where does the NUL character come from? There are a couple of ways for a std::string implementation to allow this to work:

  1. Use small-string optimization, which is a common technique. In this scheme, every std::string object has an internal buffer it can use for a single NUL character.
  2. Return a pointer to static memory, containing a NUL character. Therefore, every empty std::string object will return the same pointer.
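To make option 2 concrete, here is a minimal sketch of a toy string class. ToyString and shared_nul are hypothetical names made up for illustration; this is not how any real std::string is written, just one way the idea could look:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy class for illustration only; not a real library type.
class ToyString {
public:
    ToyString() = default;  // empty: allocates nothing

    explicit ToyString(const char* s)
        : size_(std::strlen(s)), data_(new char[size_ + 1]) {
        std::memcpy(data_, s, size_ + 1);  // copies the terminator too
    }

    ~ToyString() { delete[] data_; }

    // Copying is disabled to keep the sketch short.
    ToyString(const ToyString&) = delete;
    ToyString& operator=(const ToyString&) = delete;

    // An empty string owns no buffer, so every empty string hands out
    // the address of the same read-only NUL character.
    const char* c_str() const { return data_ ? data_ : &shared_nul; }

    std::size_t size() const { return size_; }

private:
    static const char shared_nul;  // genuinely const static storage
    std::size_t size_ = 0;         // declared before data_, which depends on it
    char* data_ = nullptr;
};

const char ToyString::shared_nul = '\0';
```

Two default-constructed ToyString objects return the same address from c_str(), and that address refers to a const object. Handing out a mutable char* to it would invite writes to memory that is const for real.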

Not everyone should be forced to implement SSO. So the standards committee needed a way to keep #2 on the table. And part of that is giving you a const pointer from c_str(). And since this memory is likely real const, not fake “Please don’t modify this memory const,” giving you a mutable pointer to it is a bad idea.

Of course, you can still get such a pointer by doing &str[0], but the standard is very clear that overwriting the NUL terminator with anything other than a NUL is undefined behavior.

Now, that being said, it is perfectly valid to write through the &str[0] pointer into the array of characters it points at, so long as you stay in the half-open range [0, str.size()). You just can’t do it through the pointer returned by data or c_str (at least before C++17, which added a non-const overload of data). Yes, even though the standard in fact requires str.c_str() == &str[0] to be true.
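As a rough sketch of what a legal in-place write looks like, assuming C++11 or later (where the characters are guaranteed to be contiguous), the hypothetical helper below upper-cases a string through &s[0] and never touches index s.size():

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: modifies the string in place
// through a mutable pointer obtained from &s[0].
void upper_in_place(std::string& s) {
    if (s.empty()) return;  // nothing to write; leave the terminator alone

    char* p = &s[0];  // mutable pointer to the first character

    // Stay inside [0, s.size()); p[s.size()] is the terminator.
    for (std::size_t i = 0; i < s.size(); ++i) {
        p[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(p[i])));
    }
}
```

Calling upper_in_place on a string holding "hello" leaves it holding "HELLO", and the terminator at index s.size() is still a NUL afterwards.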

That’s standardese for you.
