Why does char* cause undefined behaviour while char[] doesn’t?

Any attempt to modify a C string literal has undefined behaviour. With char *p = "wikimedia"; the pointer refers to the literal itself, so writing through p is exactly such an attempt. A compiler may arrange for string literals to be stored in read-only memory (protected by the OS, not literally ROM unless you’re on an embedded system). But the language doesn’t require this; it’s up to you as a programmer to get it right.

A sufficiently clever compiler could have warned you that you should have declared the pointer as:

const char * p = "wikimedia";

though the declaration without the const is legal in C (for the sake of not breaking old code). But with or without a compiler warning, the const is a very good idea.

(In C++, the rules are different: a C++ string literal has type const char[N], so unlike a C string literal it really is const, and since C++11 initializing a plain char * from one is ill-formed.)

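When you initialize an array with a literal, the literal itself may still exist in a potentially read-only region of your program image, but its bytes (including the terminating null) are copied into the array. The array is an ordinary object that you own, so modifying it is well defined: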

char s[] = "wikimedia"; /* initializes the array with the bytes from the string */
char t[] = { 'w', 'i', ... 'a', 0 };  /* same thing */

Note that char u[] = *p; does not work (and neither does char u[] = p;): an array can only be initialized from a brace-enclosed initializer list, and a char array additionally from a string literal.
