Adding strlcpy() to glibc

One of the longest-running requests for the GNU C Library (glibc) is the addition of the strlcpy() family of string functions. These functions, which have their origins in the BSD world, were written to address the longstanding security problems associated with strcpy() and its related functions. Despite years of requests, however, the glibc maintainers have never allowed these functions to be added. But that situation might just be about to change.

The problem with functions like strcpy(), of course, is that they perform no length checking on the strings passed to them. The result has been a long history of buffer-overrun issues and security problems. In response, functions like strncpy() were created, but they have problems of their own; in particular, strncpy() can create strings without the terminating null byte, encouraging other types of buffer overruns. So strlcpy() was created to ensure that all strings would be null-terminated. Problem solved, or so one might think.
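The difference in termination behavior can be sketched in a few lines of C. This is a minimal illustration, not the BSD or glibc source: sketch_strlcpy() and is_terminated() are hypothetical names chosen so that the code cannot clash with a C library that already provides strlcpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(): copies at most size - 1 bytes and
 * always terminates dst when size is nonzero. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;   /* length of src, not of the copy */
}

/* Hypothetical helper: returns 1 if buf[0..size-1] holds a null byte. */
static int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

Copying "abcdef" into a four-byte buffer with strncpy(buf, "abcdef", sizeof buf) fills all four bytes with "abcd" and leaves no terminator, so is_terminated() returns 0. The same copy through sketch_strlcpy() stores "abc" followed by a null byte.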

Back in 2000, one Christoph Hellwig posted a patch adding strlcpy() and strlcat() to glibc. The glibc maintainer at that time, Ulrich Drepper, rejected the patch in classic style:

This is horribly inefficient BSD crap. Using these function only leads to other errors. Correct string handling means that you always know how long your strings are and therefore you can you memcpy (instead of strcpy). Beside, those who are using strcat or variants deserved to be punished.

Christoph gave up after a token protest, but other users did not. Over the years, there have been many requests to add these functions to glibc, but the project's position has never changed. Fourteen years after Christoph's patch was posted, there is still no strlcpy() in glibc.

The primary complaint about strlcpy() is that it truncates silently: the only indication that the copied string was cut short is the return value (the length of the source string), which the caller must compare against the buffer size and which careless code simply ignores. So such code risks replacing a buffer-overrun error with a different problem resulting from corrupted strings. In the minds of many developers, strlcpy() just replaces one issue with another and, thus, does not solve the problem of safe string handling in C programs. They believe it should not be used, and, since it should not be used, it also should not be provided by the library.
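In the BSD implementation, the return value is the length of the source string, so truncation can be detected, but only by callers that bother to compare it against the buffer size. The sketch below shows that check; sketch_strlcpy() and copy_checked() are hypothetical names used for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(), following the BSD semantics. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Copy src into dst and report whether the whole string fit.
 * A return value >= size from the copy means dst was truncated. */
static bool copy_checked(char *dst, size_t size, const char *src)
{
    return sketch_strlcpy(dst, src, size) < size;
}
```

Code that calls the copy function and discards the result gets a terminated but possibly shortened string, which is the corrupted-string case the critics worry about.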

The argument on the other side is simple enough: like it or not, plenty of programs use strlcpy() for string manipulation. If the system library does not provide an implementation, they will provide their own, and that implementation, beyond being duplicated code, is likely to be slower and buggier than a standard library implementation. Failure to support strlcpy() does not cause those programs to be changed; instead, it just makes them use inferior alternatives.

Florian Weimer pointed out that some 60 packages in Fedora use strlcpy(); those packages, he said, will not go away. Among other problems, the implementations of strlcpy() found in those packages do not take advantage of the FORTIFY_SOURCE option provided by GCC. With fortification turned on, a number of buffer overruns can be detected at run time, causing the program to crash but avoiding a potential security hole. The recent, remotely exploitable Samba vulnerability, Florian said, was caused by an erroneous use of strlcpy() that would have been caught if fortification were in use.

This argument has gone back and forth many times over the years (and was covered here in 2012). One might think it would go on forever, except for one thing: the management of the glibc project changed significantly in early 2012. Under the new regime, the project has been more open to the addition of new features. It took a couple of years for this particular subject to come back, but, when David A. Wheeler recently asked if a strlcpy() implementation might now be accepted, glibc developer Joseph Myers responded that "it would be reasonable to consider."

Florian wasted little time in putting together a patch adding the functions to glibc. The first version ran into some criticism (it didn't behave like the BSD version), but the second iteration has been better received. Which is not to say that there is a consensus that these functions should go into the library; some developers see it as encouraging their use. But the prevalent attitude would seem to be one of resigned acceptance; as David Miller put it:

I'm not really strongly opposed to adding strlcpy to glibc. But we should be completely honest about why we are doing this, what the effects on existing strlcpy users actually is, and what we should genuinely recommend to people writing new code.

The reasons are simple enough: replace a bunch of hand-rolled strlcpy() implementations with one high-quality library implementation. The good news is that, since most programs use autotools for configuration, those programs would switch over to the glibc implementation on systems where it is available with no intervention required.
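A sketch of how that automatic switch typically looks on the C side follows. It assumes the project's configure script runs Autoconf's AC_CHECK_FUNCS([strlcpy]), which defines HAVE_STRLCPY when the library provides the function; compat_strlcpy() is a hypothetical wrapper name, and real projects often name the fallback strlcpy() directly.

```c
#include <stddef.h>
#include <string.h>

/* HAVE_STRLCPY is assumed to come from an Autoconf-generated config.h. */
#ifdef HAVE_STRLCPY
/* The C library has strlcpy(): use it, with no source changes needed. */
#define compat_strlcpy strlcpy
#else
/* Bundled fallback, used only where the library lacks strlcpy(). */
static size_t compat_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
#endif
```

Callers use compat_strlcpy() everywhere; rebuilding against a glibc that ships strlcpy() would then pick up the library version without touching the call sites.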

So the strlcpy() issue may finally be put to rest. Of course, that does not solve the bigger problem: what the glibc developers would recommend for safe, fast, and simple string handling for C programs. The C language does not lend itself to providing all three of "safe," "simple," and "fast" in the same package. In many cases, the right answer is to use a different language anyway. But there will be a lot of C code out there for a long time, so there will be a lot of string-handling bugs as well, regardless of whether strlcpy() is used.

