A suspicious pattern in open-source software

One bug recently found by John using tis-interpreter on a widely used open-source library involved the comparison of strings with memcmp . The unexpected condition was that memcmp was, in one case, called with a pointer to a buffer shorter than the length passed as third argument, breaking one of the two symmetrical pre-conditions in the function’s ACSL contract:

/*@ requires \valid_read(((char*)s1)+(0..n - 1)); @ requires \valid_read(((char*)s2)+(0..n - 1)); @ … @*/ int memcmp (const void *s1, const void *s2, size_t n);

A reason that may have made this use of memcmp look okay to the developer is that the buffers being passed to it always differed before the end of the buffers were reached. That is, a memcmp implementation based on the following loop would not have caused any out-of-bounds read access:

for (size_t i=0; i<n; i++) if (((char*)s1)[i] != ((char*)s2)[i]) return …;

The first question raised was whether the pattern memcmp("a", "bc", 3) was problematic according to the letter of the C standard. If it was, the second question was whether the busy maintainer of one of the Open Source packages that make the Internet tick should be bothered with a bug report.

Reading between the lines of the C11 standard

I would like to be able to say that memcmp ’s ACSL contract was the product of careful deliberation, but unfortunately this is not the case: many standard function contracts were written quickly in order to get most of the standard library covered, and have not been tested by time. Anyway, upon proofreading the relevant clause in the C11 standard, my feeling was that the ACSL formalization was, in this particular case, right, and that it was undefined behavior to pass as memcmp argument a buffer that wasn’t fully valid, even if the implementation sort-of needs to read the buffer’s characters in order for the purpose of finding the first mismatch.

7.24.4:1 The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared. 7.24.4.1:2 The memcmp function compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2.310)

…

310) The contents of “holes” used as padding for purposes of alignment within structure objects are indeterminate. Strings shorter than their allocated space and unions may also cause problems in comparison.

My feeling is that the above clause 7.24.2.1:2 is faithfully formalized as the ACSL pre-conditions quoted at the beginning of the post. This feeling come from the contrast with the memchr function:

7.24.5.1:2 The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s. The implementation shall behave as if it reads the characters sequentially and stops as soon as a matching character is found.

The sentence that I highlighted above is what the C standardization committee writes when they want to indicate that the function will not read—or will behave as if it did not read, anyway—beyond the match.



Parenthetical anecdote: not only this sentence is absent from the specification of memcmp , but there is an entire standardization committee, the POSIX one, who seeing the sentence absent from the specification of memchr in the C99 standard, thought that this absence meant that memchr was only allowed to receive fully valid buffers and was thus unacceptable, clarified the situation in their own standard and invited the C11 committee to incorporate the change. So that’s actually two standardization committees close to the matter at hand who think that if a sentence like the one above is not part of a function’s specification, the function must only be called with fully valid input buffers.



This meant that the memcmp formal specification that caused the warning to be emitted did not need to be fixed. Remained the question of how to best explain to developers to avoid an invocation that, on the face of it, looks relatively harmless.

A problem that matters in practice

Here is the implementation of memcmp in Glibc. There are two distinct optimizations for long buffers, one that applies when both buffers start at the same offset modulo the word size, memcmp_common_alignment , and one that applies when they don’t, memcmp_not_common_alignment .

The function memcmp_common_alignment is relatively well-behaved: it reads from the two buffers aligned word by aligned word, and thus reads the entire words that contain differing bytes. If the caller passed buffers that aren’t valid after the differing byte, this amounts to reading out of bounds, but this sort of out-of-bounds access is not detected by the typical MMU, which works at the scale of the page. This optimization is extremely common in standard library implementations—you would find strlen doing the same thing, in a context where there is no ambiguity about the illegality of reading past the null character. This does not constitute a convincing argument for the doubting developer.

The “not_common_alignment” case, however, tells a different story. To illustrate, the C program summarized below is made of two parts: first is Glibc’s implementation, lightly instrumented to show some of the addresses being accessed. Second is an input vector: t1 and t2 are declared as arrays of long long in order to force them to be aligned, and memcmp is passed t1 and (char*)t2+1 as arguments in order to force the “not_common_alignment” codepath to be followed inside memcmp ’s implementation.

… /* memcmp_not_common_alignment -- Compare blocks at SRCP1 and SRCP2 with LEN `op_t' objects (not LEN bytes!). SRCP2 should be aligned for memory operations on `op_t', but SRCP1 *should be unaligned*. */ static int memcmp_not_common_alignment (srcp1, srcp2, len) long int srcp1; long int srcp2; size_t len; { op_t a0, a1, a2, a3; … a0 = ((op_t *) srcp1)[0]; printf("just read %zu bytes from %p

", sizeof(op_t), (void*)&((op_t *) srcp1)[0]); … a1 = ((op_t *) srcp1)[1]; printf("just read %zu bytes from %p

", sizeof(op_t), (void*)&((op_t *) srcp1)[1]); … a2 = ((op_t *) srcp1)[2]; printf("just read %zu bytes from %p

", sizeof(op_t), (void*)&((op_t *) srcp1)[2]); … a3 = ((op_t *) srcp1)[3]; printf("just read %zu bytes from %p

", sizeof(op_t), (void*)&((op_t *) srcp1)[3]); … } int my_memcmp (s1, s2, len) const __ptr_t s1; const __ptr_t s2; size_t len; { … long int srcp1 = (long int) s1; long int srcp2 = (long int) s2; … if (srcp1 % OPSIZ == 0) res = memcmp_common_alignment (srcp1, srcp2, len / OPSIZ); else res = memcmp_not_common_alignment (srcp1, srcp2, len / OPSIZ); … } long long t1[] = { 0xFF07060504030201, 0x100f0e0d0c0b0a09 }; long long padding[2] = { 1 }; long long t2[] = { 0x0706050403020100, 0x0f0e0d0c0b0a0908, 0x1716151413121110, 0x1f1e1d1c1b1a1918 }; int main(void) { printf(" t1:%p

" " pointer 'one past' t1:%p

" " padding:%p

" " t2:%p

" " pointer 'one past' t2:%p

", (void*)t1, (void*)(&t1+1), (void*)padding, (void*)t2, (void*)(&t2+1)); /* t1's validity is 16 bytes. One may think it's ok to compare it with the sequence of bytes at (char*)t2+1, since they differ in the 8th byte. */ return my_memcmp(t1, (char*)t2+1, 31); }

Compiled on the I32LP64 little-endian machine I am typing this on, the program shows:

$ clang memcmp.c && ./a.out t1:0x10733f020 pointer 'one past' t1:0x10733f030 padding:0x10733f030 t2:0x10733f040 pointer 'one past' t2:0x10733f060 just read 8 bytes from 0x10733f028 just read 8 bytes from 0x10733f030

In other words, when passed the carefully (mis-)aligned buffers t1 and (char*)t2+1 , although these buffers differ in the 8th byte, Glibc’s implementation of memcmp reads 8 bytes beyond the end of t1 . By making the 16th byte differ instead of the 8th one, it is also possible to make Glibc’s implementation of memcmp read 16 bytes beyond the end of t1 .

There is no particular reason why these 8 or 16 bytes should not be in the next, unmapped memory page. In fact Alexander Cherepanov modified the above example to make sure it crashed by accessing into an unmapped page:

/* prepare a copy of t1 located at the end of a page: */ size_t pagesize = sysconf(_SC_PAGESIZE); char *p3 = mmap(NULL, pagesize * 2, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); if (p == MAP_FAILED) return -1; if (mprotect(p + pagesize, pagesize, PROT_NONE) != 0) return -2; memcpy(p3 + (pagesize - 16), t1, 16); printf(" p3:%p

" " pointer 'one past' p3:%p

", (void*)p3, (void*)(p3 + pagesize)); /* expect a crash soon: */ fflush(stdout); /* call memcmp with the prepared copy: */ printf("comparison:%d

", my_memcmp(p3 + (pagesize - 16), (char*)t2+1, 31));

Conclusion

In conclusion, yes, some implementations of memcmp will crash when invoked with buffers that aren’t valid for the full length, even if they differ early. The circumstances are rare (probably the reason this bug was still there to be found in a library that had already been tested with all the available techniques) but outside the programmer’s control. The pattern described in this post should be reported as a bug when found.

Acknowledgements

In addition to John Regehr finding an instance of the bug and being embarrassed about reporting it, and Alexander Cherepanov insisting on the difference between word accesses that remain in the memory page or that reach out of it, this post also benefited from remarks from Rich Felker, Julien Cretin, Jed Davis, and Aaron Jacobs. Discussions around the issue have lead to a series of bug reports in Glibc’s memchr, strncat and strnlen implementations.