Null and Terminated

There's plenty of room for debate about which specific poor choices in history lead to the most bugs today. Was it the billion-dollar mistake of allowing null pointers? Is it the absolute mess that is C memory management? Or is it C-style strings and all the attendant functions and buffer overruns they entail?

A developer at Jay's company had been porting some C++ code to a new platform. That developer left, and the wheel-of-you-own-this-now spun and landed on Jay. The code was messy, but mostly functional. Jay was able to get it building, running, and then added a new feature. It was during testing that Jay noticed that some fields in the UI weren't being populated.

Jay broke out a memory analyzer tool, and it popped out warnings on lines where strlcpy was being called. Now that was odd, as strlcpy is the "good" way to copy strings, with guarantees that it would never allow buffer overruns. The buffers were all correctly sized, which left Jay wondering what exactly was wrong with the calls to strlcpy.

A quick grep through the code later, and Jay knew exactly what was wrong:

#define strlcpy strncpy

The code had originally been targeting a platform which had strlcpy available, but the port was moving to a platform which did not. The previous developer, out of laziness, ignorance, carelessness, or some combination of all of those, decided that since strlcpy and strncpy take the same arguments in the same order, a macro could solve all their problems. The arguments match; the behavior does not.

If you haven't had to deal with C-strings, or just general C-style conventions, recently, it's important to note a few things. First, C doesn't actually have strings as a datatype, it just has arrays of characters. Second, an array turns into a plain pointer to its first item the moment you pass it to a function, and C doesn't do anything to enforce the length, which means you're free to access element 11 in a 10 element array, and C will let you. Finally, since "knowing how long a string is" might actually be important, the way C-strings address the problems above is that the last character in the string should be a null terminator. All the string handling functions know that if they see a null terminator, that's the end of the string, and that keeps your code from reading off the end of the array into some other block of memory, or worse, writing to that arbitrary block of memory.
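To make the "the terminator is the length" convention concrete, here's a minimal sketch. Both helper names (count_until_nul and count_until_nul_bounded) are made up for illustration; the first does roughly what strlen does, and the second shows what it takes to stay inside an array that might not be terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: walks the array until it sees '\0', much like strlen.
   It has no idea how large the underlying array is, so an unterminated
   array would send it reading off the end. */
size_t count_until_nul(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical helper: the bounded variant. It stops at the terminator or
   after `cap` bytes, whichever comes first, so it never reads past the
   array as long as `cap` is the array's real size. */
size_t count_until_nul_bounded(const char *s, size_t cap)
{
    assert(s != NULL);
    size_t n = 0;
    while (n < cap && s[n] != '\0') {
        n++;
    }
    return n;
}
```

Given char buf[10] = "hello", the first helper reports 5 even though the array holds 10 bytes: the terminator, not the array, decides the length. Hand it a 4-byte array with no terminator and only the bounded version is safe to call.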

Which brings us to the key difference between strlcpy and strncpy: the first one is "safer" and guarantees that the output buffer ends up null terminated, truncating the copy if it has to. strncpy makes no such guarantee; if the source string is at least as long as the size you pass, there isn't room in the buffer for a null terminator, and it just doesn't put one in.
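Here's a small sketch of that failure mode using nothing but standard strncpy and memchr. The helper name strncpy_leaves_terminator is invented for this example; it copies the way the macro'd calls effectively did, then checks whether a terminator landed anywhere in the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy with strncpy, passing the full buffer size just
   as a strlcpy call would, then report whether the destination contains a
   '\0' anywhere within its `size` bytes. Returns 1 if terminated, 0 if not. */
int strncpy_leaves_terminator(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, size);
    return memchr(dst, '\0', size) != NULL;
}
```

With an 8-byte destination, a source like "short" comes out terminated (strncpy pads the remainder with null bytes), but an 8-character source such as "exactly8", or anything longer, fills the whole buffer and leaves no terminator behind. Any later strlen or printf on that buffer reads straight off the end.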

In other words, with one macro, Jay's predecessor had created hundreds of buffer-overrun vulnerabilities. Jay removed the macro, properly updated the calls to safely copy strings, and the errors went away.
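The submission doesn't say what Jay's corrected calls looked like, so the following is only an assumption about one reasonable approach: a small stand-in that follows the strlcpy contract (copy at most size - 1 characters, always terminate when size is nonzero, return the length of the source). The name copy_truncating is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy on a platform that lacks it.
   Copies at most size - 1 characters and always terminates dst when
   size > 0. Returns strlen(src), so a return value >= size tells the
   caller that the copy was truncated. */
size_t copy_truncating(char *dst, const char *src, size_t size)
{
    assert(src != NULL);
    size_t src_len = strlen(src);
    if (size > 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

Copying "much too long" into an 8-byte buffer this way leaves "much to" plus a terminator in the buffer and returns 13, which is how the caller finds out something got cut off. Truncation may still be a bug in its own right, but it's a bug that stays inside the buffer.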

In any case, let's close with this quote, from the "Bugs" section of the strncpy/strcpy manpage, which is just a fun read:

If the destination string of a strcpy() is not large enough, then anything might happen. Overflowing fixed-length string buffers is a favorite cracker technique for taking complete control of the machine. Any time a program reads or copies data into a buffer, the program first needs to check that there's enough space. This may be unnecessary if you can show that overflow is impossible, but be careful: programs can get changed over time, in ways that may make the impossible possible.

Featured Comments

hedronist (nodebb) 2021-02-02

Ah yes, null pointers and too-short character buffers! That's how I started my first company back in 1983. I created a C source-level debugger. It was initially just for myself, but then a friend talked me into getting a booth at the 1983 UniForum convention in San Diego. My customers coming out of that included HP, Siemens, and Microsoft (they had recently acquired Xenix from The Santa Cruz Operation). That little debugger, called CDB, definitely helped pay the bills.