Fuzzing as a Test Case Generator

2019-12-28

To learn more about both Lua and Zig, I’ve started writing a Lua 5.1 implementation in Zig, beginning with the lexer (i.e. separating a source file into tokens). Lua’s lexer, however, does not have any test cases, and the lexing API is not easily separated from the Lua parser. So, when writing test cases for a new, compatible implementation, it’s hard to be certain you’ve covered all the edge cases. This is especially true for Lua, since any 8-bit value is technically a valid character in .lua source files (embedded NULs, other control characters, you name it).

After writing some obvious test cases based on my reading of the Lua lexer’s source code, I decided to try using fuzz testing to find any edge cases that I wasn’t accounting for, and ended up with both some surprising and not-so-surprising results.

The setup

libFuzzer uses various techniques to generate a random set of inputs that maximizes the code coverage of the fuzz-tested code (in this case, the Lua lexer). After fuzzing for a while, I then took those generated inputs and ran them back through the Lua lexer, generating corresponding output files that consist of a list of the lexed tokens and the resulting error, if any. For example:

Input: local hello = "world" → Output: local <name> = <string> <eof>

Input: local hello = "world → Output: local <name> = [string "fuzz"]:1: unfinished string near '"world'

With these pairs of input/output files, I can simply make my lexer implementation generate similar output, and then compare that with Lua’s for each input file. Any discrepancy (different tokens, different errors, errors at different locations, etc.) is then an opportunity to figure out what’s happening, write a minimal reproduction test case, and fix it. Once all of the discrepancies are ironed out, we can be reasonably sure that our implementation is compatible with the reference implementation.

See squeek502/fuzzing-lua for the full Lua lexer fuzzing implementation.

The not-so-surprising results

The surprising results

The above changes could have been caught without fuzzing (given enough scrutiny/time), but there was one additional edge case that likely would not have been caught without fuzz testing, due to how rarely it affects normal source code. It turns out that the Lua 5.1 lexer has a bug in its check_next function: when it looks ahead at the next character and checks that it is within some set of expected characters, it accidentally accepts '\0'/NUL characters as well (due to its unchecked use of strchr; this has been fixed in Lua 5.2+). Luckily, check_next is only used in a few places in the Lua 5.1 lexer:

- When checking for the second/third . characters in concat ( .. ) and ellipsis ( ... ) tokens
- When checking for e / E exponent markers in number tokens
- When checking for - / + to denote exponent signed-ness in number tokens

This means that Lua’s lexer will tokenize the following as shown (where \0 is the NUL character):

- .\0 will lex to ..
- ..\0 will lex to ...
- 1.5\0-1 will lex to 1.5 (internally, the lexer’s state will ‘think’ the token is 1.5e-1, but the finished token’s string will be treated as NUL-terminated when converting from string to double)

This behavior can be verified by doing the following:

$ printf 'print("con".\0"cat")' > concat.lua
$ lua51 concat.lua
concat
$ lua53 concat.lua
lua53: concat.lua:1: ')' expected near '.'

Because .lua source files rarely actually have embedded NULs, especially outside of string literals, very few people have likely ever run into this particular edge case. But if absolute compatibility with a reference implementation is a goal, then such edge cases have to be taken into account. That’s not a goal for my project, but it’s still illustrative of the depth of the test cases that fuzzing can bubble up, and it has allowed me to make check_next-bug compatibility an option in my implementation.

Links