Fuzzing DNS zone parsers

2019-07-11 13:00

In my never-ending quest to improve the quality of my C codebases, I've been using AFL to fuzz statzone, the zone parser I use to generate monthly statistics on StatDNS. It helped me to find and fix a NULL pointer dereference.

I initially used the .arpa zone file as input, but then remembered that OpenDNSSEC bundles a special zone for testing purposes, containing a lot of seldom-used resource record types, and decided to use that one too.

Out of curiosity, I decided to try fuzzing other DNS zone parsers. I started with validns 0.8, and within seconds the fuzzer found multiple NULL pointer dereferences.

The first occurrence happens in the name2findable_name() function, and can be triggered with the following input:

arpa 86400 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2019021500 1800 900 604800 86400
arpa. 86400 IN RRSIG SOA 8 1 86400 20190228000000 20190214230000 49906 arpa. Qot7qHAA2QhNmAz3oJUIGmxGJrKnWsIzEvZ92R+LV03K7YTFozio2U7Z534RZBhc0UJvlF1YenrbM6ugmF0z55CJD9JY7cFicalFPOkIuWslSl62vuIWHLwN5sA7VZ0ooVN2ptQpPHDa3W/9OPJRF0YqjBBBwD7IiL7V560rbXM=

With the above input, the following call to strlen(3) in rr.c results in a NULL pointer dereference because 's' ends up being NULL:

static unsigned char *name2findable_name(char *s)
{
	int l = strlen(s);

The second occurrence happens in the nsec_validate_pass2() function, and can be triggered with the following input:

arpa. 86400 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2019021500 1800 900 604800 86400
arpa. 86400 IN NSEC a

With the above input, the following call to strcasecmp(3) in rr.c results in a NULL pointer dereference because 'rr->next_domain' ends up being NULL:

if (strcasecmp(rr->next_domain, zone_apex) == 0) {
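Both validns crashes have the same shape: a C string function is handed a pointer that a failed parse left as NULL. As a rough illustration, here is a minimal sketch of NULL-tolerant wrappers. The helpers safe_strlen() and names_equal() are hypothetical names of my own and are not part of validns; a real parser would more likely reject the malformed record earlier and report a parse error.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not validns code: length of s, or 0 when s is NULL. */
size_t safe_strlen(const char *s)
{
    return s != NULL ? strlen(s) : 0;
}

/*
 * Hypothetical helper, not validns code: case-insensitive equality that
 * treats a missing (NULL) name as "not equal" instead of dereferencing it.
 * Returns 1 when both names are present and equal, 0 otherwise.
 */
int names_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return 0;
    return strcasecmp(a, b) == 0;
}
```

With wrappers like these, a record whose name failed to parse simply compares as different from the zone apex, and the caller can decide how to report it.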

Given those encouraging results, I went on to fuzz BIND, NSD and Knot zone parsers, using their zone validation tools named-checkzone, nsd-checkzone, and kzonecheck respectively.

While the fuzzers didn't produce any crashes for BIND and Knot after running for 3 days and 11 hours, they did produce some valid ones for NSD, so I decided to keep fuzzing nsd-checkzone and stop the other fuzzers.

I let AFL complete one cycle, and as I didn't need the box for anything else at this time, I decided to let it run for a few more days. I ended the process after 16 days and 19 hours, completing 2 cycles with 167 unique crashes.

After sorting and analyzing the crashes, I had two valid issues to report.

The first one is an out-of-bounds read caused by improper validation of array index, in the rdata_maximum_wireformat_size() function, in rdata.c.
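To illustrate this class of bug in general terms, here is a minimal sketch of a table lookup that validates its index first. It is not the NSD code: the table max_size_by_kind, its values, and the function lookup_max_size() are all hypothetical, and the only assumption is that the index is derived from untrusted zone data.

```c
#include <stddef.h>

/* Hypothetical table: one size per field kind (illustrative values only). */
static const size_t max_size_by_kind[] = { 1, 2, 4, 16 };

#define KIND_COUNT (sizeof(max_size_by_kind) / sizeof(max_size_by_kind[0]))

/*
 * Look up the size for `kind`, assumed to come from untrusted input.
 * Returns 0 and stores the size in *out on success. Returns -1 and leaves
 * *out untouched when `kind` is outside the table, so the caller can reject
 * the record instead of reading past the end of the array.
 */
int lookup_max_size(size_t kind, size_t *out)
{
    if (kind >= KIND_COUNT)
        return -1;
    *out = max_size_by_kind[kind];
    return 0;
}
```

Deriving KIND_COUNT from the array itself keeps the check in sync when entries are added to the table.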

The second one is a stack-based buffer overflow in the dname_concatenate() function in dname.c, which got assigned CVE-2019-13207.

=================================================================
==7395==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffcd6a9763f at pc 0x0000004dadbc bp 0x7ffcd6a97510 sp 0x7ffcd6a96cc0
WRITE of size 8 at 0x7ffcd6a9763f thread T0
    #0 0x4dadbb in __asan_memcpy (/home/fcambus/nsd/nsd-checkzone+0x4dadbb)
    #1 0x534251 in dname_concatenate /home/fcambus/nsd/dname.c:464:2
    #2 0x69e61f in yyparse /home/fcambus/nsd/./zparser.y:1024:12
    #3 0x689fd1 in zonec_read /home/fcambus/nsd/zonec.c:1623:2
    #4 0x6aedd1 in check_zone /home/fcambus/nsd/nsd-checkzone.c:61:11
    #5 0x6aea07 in main /home/fcambus/nsd/nsd-checkzone.c:127:2
    #6 0x7fa60ece6b96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #7 0x41c1d9 in _start (/home/fcambus/nsd/nsd-checkzone+0x41c1d9)

Address 0x7ffcd6a9763f is located in stack of thread T0 at offset 287 in frame
    #0 0x533f8f in dname_concatenate /home/fcambus/nsd/dname.c:458

  This frame has 1 object(s):
    [32, 287) 'temp' (line 459) <== Memory access at offset 287 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/fcambus/nsd/nsd-checkzone+0x4dadbb) in __asan_memcpy
Shadow bytes around the buggy address:
  0x10001ad4ae70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001ad4ae80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001ad4ae90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001ad4aea0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x10001ad4aeb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10001ad4aec0: 00 00 00 00 00 00 00[07]f3 f3 f3 f3 f3 f3 f3 f3
  0x10001ad4aed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001ad4aee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001ad4aef0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x10001ad4af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10001ad4af10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==7395==ABORTING
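The trace shows a memcpy writing past the end of 'temp', a 255-byte stack buffer (offsets 32 to 287). As an illustration of how this kind of overflow is usually avoided, here is a minimal sketch of a concatenation that checks the combined length before copying. It is not NSD's dname_concatenate() and not the actual fix: concat_bounded() and WIRE_NAME_MAX are hypothetical names, and the sketch treats names as plain byte strings, ignoring label structure.

```c
#include <stddef.h>
#include <string.h>

/* Size of the destination buffer; 255 matches the 'temp' object in the trace. */
#define WIRE_NAME_MAX 255

/*
 * Hypothetical helper, not NSD code: copy `left` followed by `right` into
 * `out`, which must hold WIRE_NAME_MAX bytes. Returns 0 and stores the
 * combined length in *out_len on success. Returns -1 without writing
 * anything when the result would not fit.
 */
int concat_bounded(unsigned char out[WIRE_NAME_MAX],
                   const unsigned char *left, size_t left_len,
                   const unsigned char *right, size_t right_len,
                   size_t *out_len)
{
    /* Compare against the remaining space so the addition cannot wrap. */
    if (left_len > WIRE_NAME_MAX || right_len > WIRE_NAME_MAX - left_len)
        return -1;

    memcpy(out, left, left_len);
    memcpy(out + left_len, right, right_len);
    *out_len = left_len + right_len;
    return 0;
}
```

The important part is that the length check happens before either memcpy, so an oversized input is rejected rather than truncated or written out of bounds.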

Both issues have been fixed and will be part of the NSD 4.2.2 release.

I also fuzzed ldns using ldns-read-zone for 12 days and 7 hours, but all the crashes it produced turned out to be assertion failures.

It's been an interesting journey so far, and while finding issues is still relatively easy, sorting crashes and distinguishing between valid ones, duplicates, and false positives takes a lot of time. Nonetheless, reading third-party source code and analyzing why a program crashes is both very instructive and rewarding.

For the time being, I plan to continue fuzzing stuff and will write more about my findings.