Strings in C are a unique collection of mistakes. The biggest one is the idea of null termination. Null termination is not without its advantages: because you're using a single byte to mark the end of the string, you can have strings of arbitrary length. No need to track the size and worry about whether your size variable is big enough to hold the length of the string. No complicated data structures. Just "read till you find a 0 byte, and you know you're done."
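To make "read till you find a 0 byte" concrete, here is a minimal sketch of that scan. The name count_until_nul is hypothetical, made up for this illustration; it does roughly what the standard strlen does, and it is not how any particular libc implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libc function): counts the bytes before the
   first 0 byte. If the terminator is missing, this loop walks off the end
   of the buffer, which is exactly the hazard of null termination. */
size_t count_until_nul(const char *s)
{
    assert(s != NULL);

    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Anything after the first 0 byte is invisible to this scan, so a buffer holding "ab\0cd" is, as far as string functions are concerned, just "ab".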
Of course, this is the root of a lot of evils. Malicious inputs that lack a null terminator, for example, are a common exploit. It's so dangerous that many of the str* functions have strn* versions, which allow you to pass sizes to ensure you don't overrun any buffers.
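As a rough sketch of how a size-limited copy might look, the helper below wraps strncpy. The name copy_bounded and its return convention are assumptions made for this example. Note that strncpy on its own does not write a terminator when the source is too long, so the sketch adds one explicitly. It also still assumes src itself is null terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst without writing more than
   dst_size bytes, and always leaves dst null terminated.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Copy at most dst_size - 1 bytes, leaving room for the terminator. */
    strncpy(dst, src, dst_size - 1);

    /* strncpy does not terminate on truncation, so do it by hand. */
    dst[dst_size - 1] = '\0';

    return strlen(src) < dst_size;
}
```

With a four-byte destination, copying "abc" fits and returns 1, while copying "abcdef" leaves "abc" in the destination and returns 0.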
Dmitri sends us a simple example of someone not quite fully understanding this.
strcpy( buffer, string);
strcat( buffer, "\0");
The first line here copies the contents of string into buffer. It leverages the null terminator to know when the copy can stop. Then, we use strcat, which scans buffer for the null terminator, and appends a new string at the end: the new string, in this case, being the null terminator. Since strcat also stops reading its source at the first null byte, "\0" looks like an empty string, and the call changes nothing.
The developer responsible for this is protecting against a string lacking its null terminator by using functions which absolutely require it to be null terminated.
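A small sketch can confirm that the strcat call accomplishes nothing. The function name strcat_nul_is_noop and the "hello" test value are made up for this demonstration; only strcpy, strcat, memcpy and memcmp are real library calls.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: returns 1 if strcat(buffer, "\0") left every byte of
   buffer unchanged, 0 otherwise. */
int strcat_nul_is_noop(void)
{
    char buffer[16] = {0};
    char before[16];

    strcpy(buffer, "hello");               /* copies "hello" plus its terminator */
    memcpy(before, buffer, sizeof buffer); /* snapshot of the whole buffer */

    /* strcat finds the terminator already in buffer, then appends the
       source up to its first 0 byte. "\0" starts with a 0 byte, so the
       source is an empty string and only the terminator is rewritten. */
    strcat(buffer, "\0");

    assert(strlen(buffer) == 5);
    return memcmp(before, buffer, sizeof buffer) == 0;
}
```

If buffer had actually been missing its terminator, strcat would have kept scanning past the end of the array looking for one, so the extra call cannot repair the problem it was presumably meant to guard against.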
C strings are hard in the best case, but they're a lot harder when you don't understand them.