

#!/usr/bin/python

from pyparsing import *



# Define the grammar we will use

digits = "0123456789"

colon = Literal(':')

semi = Literal(';')

period = Literal('.')

comma = Literal(',')

lparen = Literal('{')  # despite the name, this matches an opening brace

rparen = Literal('}')  # and this matches the closing brace

number = Word(digits)

hexint = Word(hexnums,exact=2)

text = Word(alphas)



# Define host configuration specific grammar

host_keyword = Literal('host')

hardware_keyword = Literal('hardware')

ethernet_keyword = Literal('ethernet')

address_keyword = Literal('fixed-address')

mac = Combine(hexint + colon + hexint + colon + hexint + colon + hexint + colon + hexint + colon + hexint).setResultsName("mac_address")

ip = Combine(number + period + number + period + number + period + number)

ips = delimitedList(ip).setResultsName("ip_addresses")

hostname = Combine(text + period + text + period + text).setResultsName("hostname")

ethernet_statement = hardware_keyword + ethernet_keyword + mac + semi



ipaddress_statement = address_keyword + ips + semi

x = host_keyword + hostname + lparen + Optional(ethernet_statement) + Optional(ipaddress_statement) + rparen



# Here is some sample data for us to parse



host_declaration = """

host a.foo.bar {

hardware ethernet 00:11:22:33:44:55;

fixed-address 192.168.100.10, 192.168.200.50;

}



host b.foo.bar {

hardware ethernet 00:0f:12:34:56:78;

fixed-address 192.168.100.20;

}



host c.foo.bar { hardware ethernet 00:0e:12:34:50:70; fixed-address 192.168.100.40; }

"""



# Do the parsing



results = x.scanString(host_declaration)



# Print out the stuff we're interested in. scanString yields
# (tokens, start, end) tuples, so the named results live on result[0].

for result in results:

    print(result[0].hostname, result[0].mac_address, result[0].ip_addresses)





a.foo.bar 00:11:22:33:44:55 ['192.168.100.10', '192.168.200.50']

b.foo.bar 00:0f:12:34:56:78 ['192.168.100.20']

c.foo.bar 00:0e:12:34:50:70 ['192.168.100.40']



Yesterday, while browsing the table of contents of the May 2008 issue of Python Magazine, I came across a reference to the pyparsing module, a Python module for writing recursive descent parsers using familiar Python syntax. O'Reilly's Python DevCenter has an excellent introduction to using this module entitled Building Recursive Descent Parsers with Python. Well worth a read.

It just so happens that I have a number of projects which are stalled because writing code to parse complexly structured data is not my strong point. I enjoy parsing text line by line as much as the next guy, but I find this recursive stuff tedious.

The ISC DHCP configuration file is, in my opinion, a good example of parsing complexity. Its configuration directives can contain many optional directives, can be nested, and can be all on a single line or broken up over multiple lines. Writing the parser using pyparsing makes this much simpler.

The script above is a simple example of using pyparsing to parse a few host definitions; simple as it is, it is quite flexible. Its output is the three lines that follow the script.
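
For comparison, here is roughly what the hand-written version of "this recursive stuff" can look like using only the standard library. It is a minimal sketch, not ISC's actual grammar and not how pyparsing works internally: the names tokenize and parse_block are made up for illustration, the sample wraps two host declarations in a group block purely to show nesting, and comments and quoted strings are not handled.

```python
import re

# Braces, semicolons and commas are tokens on their own; everything else is
# split on whitespace. Simplification: no comments, no quoted strings.
TOKEN_RE = re.compile(r"[{};,]|[^\s{};,]+")


def tokenize(text):
    """Split configuration text into a flat list of tokens."""
    return TOKEN_RE.findall(text)


def parse_block(tokens, pos=0, nested=False):
    """Parse statements until the matching '}' (or end of input at top level).

    Returns (statements, next_position). A simple statement is a list of
    words; a block is a (header_words, nested_statements) tuple.
    """
    statements, words = [], []
    while pos < len(tokens):
        tok = tokens[pos]
        pos += 1
        if tok == "{":
            # Recurse: the words collected so far form the block header.
            body, pos = parse_block(tokens, pos, nested=True)
            statements.append((words, body))
            words = []
        elif tok == ";":
            statements.append(words)
            words = []
        elif tok == "}":
            if not nested or words:
                raise ValueError("unexpected '}' at token %d" % (pos - 1))
            return statements, pos
        elif tok != ",":
            # Commas only separate values, so they are dropped.
            words.append(tok)
    if nested or words:
        raise ValueError("unexpected end of input")
    return statements, pos


sample = """
group {
  host a.foo.bar {
    hardware ethernet 00:11:22:33:44:55;
    fixed-address 192.168.100.10, 192.168.200.50;
  }
  host c.foo.bar { hardware ethernet 00:0e:12:34:50:70; fixed-address 192.168.100.40; }
}
"""

tree, _ = parse_block(tokenize(sample))
for header, body in tree[0][1]:
    print(header, body)
```

Whitespace and line breaks disappear in tokenize, which is why the one-line and the multi-line host declarations come out in the same shape. Everything else (tracking the position, matching braces, reporting errors) is bookkeeping that has to be written and maintained by hand here.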