What are null-terminated strings?

What are null-terminating strings?
In C, a “null-terminated string” is a tautology. A string is, by definition, a contiguous null-terminated sequence of characters (an array, or a part of an array). Other languages may address strings differently. I am only discussing C strings.

How are they different from non-null-terminated strings?
There are no non-null-terminated strings in C. A non-null-terminated array of characters is just an array of characters.

What is this null that terminates the string? Is it different from NULL?
The “null character” is a character with the integer value of zero. (Characters are, in essence, small integers). It is sometimes, especially in the context of ASCII, referred to as NUL (single L). This is distinct from NULL (double L), which is a null pointer. The null character can be written as '\0' or just 0 in the source code. The two forms are interchangeable in C (but not in C++). The former is usually preferred because it shows the intent better.
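
As a minimal sketch of the difference between the null character and a null pointer (both function names are made up for illustration):

```c
#include <stddef.h>

/* Hypothetical helper: counts characters up to, but not including,
   the first null character. This is what strlen does. */
size_t count_until_nul(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')   /* '\0' is a character whose value is zero */
        n++;
    return n;
}

/* Hypothetical helper: true only for a valid pointer to an empty string.
   NULL is a pointer that points nowhere; "" is a real one-element array
   that contains just the null character. */
int is_empty_string(const char *s)
{
    return s != NULL && s[0] == '\0';
}
```

Here count_until_nul("ab\0cd") gives 2, because everything after the first null character is not part of the string, and is_empty_string distinguishes "" (a string of length zero) from NULL (no string at all).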

Should I null-terminate my strings myself, or will the compiler do it for me?
If you are writing a string literal, you don’t need to explicitly insert a null character at the end. The compiler will do it.

const char* str1 = "a string";   // ok, \0 is inserted automatically
const char* str2 = "a string\0"; // extra \0 is not needed

The compiler will not insert a null character when you declare an array with an explicit size and initialise it with a string literal whose characters fill the array exactly, leaving no room for the terminator. (This is valid C, but an error in C++.)

char str3[5] = "hello"; // not enough space in the array for the null terminator
char str4[]  = "hello"; // ok, there is \0 at the end, the total size is 6

The compiler will also not insert a null character when you declare an array without an explicit size and initialise it with a list of individual characters rather than a string literal.

char str5[] = { 'h', 'e', 'l', 'l', 'o' };       // no null terminator
char str6[] = { 'h', 'e', 'l', 'l', 'o', '\0' }; // null terminator
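
One way to see the difference is to look at the array sizes and to search for a terminator explicitly. The following is only a sketch; has_terminator and the two size functions are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether a null character occurs
   anywhere in the first `size` bytes of `buf`. */
int has_terminator(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Array initialised from a string literal: 5 letters plus '\0'. */
size_t literal_array_size(void)
{
    char s[] = "hello";
    return sizeof s;
}

/* Array initialised from a character list: the 5 letters only. */
size_t char_list_array_size(void)
{
    char s[] = { 'h', 'e', 'l', 'l', 'o' };
    return sizeof s;
}
```

literal_array_size() returns 6 and char_list_array_size() returns 5. Note that has_terminator needs the array size as a separate argument: without a terminator there is no other way to know where the array ends.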

If you are building a string at run-time out of some data that comes from IO or from a different part of the program, you need to make sure a null terminator is inserted. For example:

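#include <stdlib.h>
#include <string.h>

char* duplicate_string(const char* src)
{
    char* dst = malloc(strlen(src) + 1); // <- reserve place for null terminator
    if (dst != NULL)
        strcpy(dst, src);                // strcpy copies the null terminator too
    return dst;                          // NULL if allocation failed; caller must free
}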

Standard library functions such as fread and POSIX functions such as read never null-terminate the buffers they fill. strncpy adds a null terminator only if the source string is shorter than the given count; otherwise it leaves the destination unterminated, so use it with care. Confusingly, strncat always adds a null terminator.
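
A common way to cope with this is to leave one byte free and write the terminator yourself. The sketch below assumes you are happy to truncate data that doesn't fit; copy_truncated and read_as_string are hypothetical helper names, not standard functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size - 1 characters of src
   and always terminates dst, truncating if necessary. */
void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;                       /* no room even for the terminator */
    strncpy(dst, src, dst_size - 1);  /* may leave dst unterminated...   */
    dst[dst_size - 1] = '\0';         /* ...so terminate it explicitly   */
}

/* Hypothetical helper: reads at most buf_size - 1 bytes from fp and
   terminates the buffer. Returns the number of bytes read. */
size_t read_as_string(FILE *fp, char *buf, size_t buf_size)
{
    size_t n;
    if (buf_size == 0)
        return 0;
    n = fread(buf, 1, buf_size - 1, fp);  /* fread adds no terminator */
    buf[n] = '\0';                        /* n <= buf_size - 1        */
    return n;
}
```

With a 4-byte destination, copy_truncated(buf, sizeof buf, "hello") leaves "hel" in buf. Keep in mind that data read from a file may itself contain zero bytes, in which case string functions will only see the part before the first one.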

Why are null-terminated strings needed?
Many functions from the standard C library, and many functions from third-party libraries, operate on strings (and all strings need to be null-terminated). If you pass a non-null-terminated character array to a function that expects a string, the behaviour is undefined: the function will typically keep reading past the end of the array until it happens to find a zero byte. So if you want to interoperate with the world around you, you need null-terminated strings. If you never use any standard-library or third-party functions that expect string arguments, you may do what you want.

How do I set up my code/data to handle null-terminated strings?
If you plan to store strings of length up to N, allocate N+1 characters for your data. The character needed for the null terminator is not included in the length of the string, but it is included in the size of the array required to store it.
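
As an illustration of the N+1 rule, here is a sketch of a fixed-size record. MAX_NAME_LEN, struct record and set_name are all hypothetical names, and rejecting over-long input (rather than truncating it) is just one possible design choice.

```c
#include <string.h>

#define MAX_NAME_LEN 8   /* hypothetical limit: longest string we store */

struct record {
    char name[MAX_NAME_LEN + 1];   /* +1 for the null terminator */
};

/* Stores name in the record. Returns 1 on success, or 0 if the
   string is longer than MAX_NAME_LEN and therefore does not fit. */
int set_name(struct record *r, const char *name)
{
    if (strlen(name) > MAX_NAME_LEN)
        return 0;
    strcpy(r->name, name);         /* safe: length was checked above */
    return 1;
}
```

An 8-character name such as "12345678" is accepted: strlen reports 8, while sizeof r.name is 9.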
