To use the code provided in this chapter, write

>>> from fuzzingbook.Grammars import <identifier>

and then make use of the following features.

This chapter introduces grammars as a simple means to specify input languages, and shows how to use them to test programs with syntactically valid inputs. A grammar is defined as a mapping of nonterminal symbols to lists of alternative expansions, as in the following example:

>>> US_PHONE_GRAMMAR = {
>>>     "<start>": ["<phone-number>"],
>>>     "<phone-number>": ["(<area>)<exchange>-<line>"],
>>>     "<area>": ["<lead-digit><digit><digit>"],
>>>     "<exchange>": ["<lead-digit><digit><digit>"],
>>>     "<line>": ["<digit><digit><digit><digit>"],
>>>     "<lead-digit>": ["2", "3", "4", "5", "6", "7", "8", "9"],
>>>     "<digit>": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
>>> }
>>>
>>> assert is_valid_grammar(US_PHONE_GRAMMAR)
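One property a well-formed grammar needs is that every nonterminal used in an expansion also has its own entry in the mapping. The following is a minimal sketch of that single check, written without the fuzzingbook package; the helper undefined_nonterminals() and the two toy grammars are hypothetical names introduced here for illustration, and is_valid_grammar() may check more than this.

```python
import re

# Matches nonterminals such as <digit> or <phone-number>.
NONTERMINAL = re.compile(r"<[^<> ]+>")

def undefined_nonterminals(grammar):
    """Return the nonterminals that are used in some expansion
    but have no entry of their own in the grammar (hypothetical helper)."""
    used = set()
    for expansions in grammar.values():
        for expansion in expansions:
            used.update(NONTERMINAL.findall(expansion))
    return used - set(grammar)

# A toy grammar in which every used nonterminal is defined.
TINY_GRAMMAR = {
    "<start>": ["<digit><digit>"],
    "<digit>": ["0", "1"],
}

# A toy grammar that uses <letter> without defining it.
BROKEN_GRAMMAR = {
    "<start>": ["<digit><letter>"],
    "<digit>": ["0", "1"],
}

print(undefined_nonterminals(TINY_GRAMMAR))
print(undefined_nonterminals(BROKEN_GRAMMAR))
```

An empty result means that expansion can never get stuck on a symbol that has no rule.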

Nonterminal symbols are enclosed in angle brackets (say, <digit>). To generate an input string from a grammar, a producer starts with the start symbol (<start>) and randomly chooses one of the expansions for this symbol. It continues the process until all nonterminal symbols are expanded. The function simple_grammar_fuzzer() does just that:

>>> [simple_grammar_fuzzer(US_PHONE_GRAMMAR) for i in range(5)]
['(692)449-5179',
 '(519)230-7422',
 '(613)761-0853',
 '(979)881-3858',
 '(810)914-5475']
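The expansion process described above can be sketched in a few lines of plain Python. This is not the chapter's implementation of simple_grammar_fuzzer(): the function expand_randomly(), its max_steps parameter, and the choice to always expand the leftmost nonterminal are assumptions made to keep the example short and self-contained.

```python
import random
import re

# Matches nonterminals such as <digit> or <phone-number>.
NONTERMINAL = re.compile(r"<[^<> ]+>")

PHONE_GRAMMAR = {
    "<start>": ["<phone-number>"],
    "<phone-number>": ["(<area>)<exchange>-<line>"],
    "<area>": ["<lead-digit><digit><digit>"],
    "<exchange>": ["<lead-digit><digit><digit>"],
    "<line>": ["<digit><digit><digit><digit>"],
    "<lead-digit>": ["2", "3", "4", "5", "6", "7", "8", "9"],
    "<digit>": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
}

def expand_randomly(grammar, start="<start>", max_steps=1000):
    """Expand `start` until no nonterminal is left (hypothetical helper).

    Each step replaces the leftmost nonterminal with one of its
    expansions, chosen uniformly at random. `max_steps` is an assumed
    safety limit against grammars that keep growing.
    """
    term = start
    steps = 0
    while (match := NONTERMINAL.search(term)) is not None:
        if steps >= max_steps:
            raise RuntimeError("expansion did not finish within max_steps")
        expansion = random.choice(grammar[match.group(0)])
        term = term[:match.start()] + expansion + term[match.end():]
        steps += 1
    return term

print(expand_randomly(PHONE_GRAMMAR))
```

Because this phone grammar is not recursive, every run terminates after a fixed number of steps. For a recursive grammar, the step limit is what keeps the loop from running indefinitely.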

In practice, though, instead of simple_grammar_fuzzer(), you should use the GrammarFuzzer class or one of its coverage-based, probabilistic, or generator-based derivatives; these are more efficient, protect against infinite growth, and provide several additional features.

This chapter also introduces a grammar toolbox with several helper functions that ease the writing of grammars, such as using shortcut notations for character classes and repetitions, or extending grammars