xkcd 1313: Regex Golf

Peter Norvig

January 2014

revised November 2015

I ♡ xkcd! It reliably provides top-rate insights, humor, or both. I was thrilled when I got to introduce Randall Munroe for a talk in 2007. But in xkcd #1313,

I found that the hover text, "/bu|[rn]t|[coy]e|[mtg]a|j|iso|n[hl]|[ae]d|lev|sh|[lnd]i|[po]o|ls/ matches the last names of elected US presidents but not their opponents", contains a confusing contradiction. I'm old enough to remember that Jimmy Carter won one term and lost a second. No regular expression could both match and not match "Carter".
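To make the contradiction concrete, here is a minimal sketch that runs the hover-text regex against a few lowercased names. The tiny `winners_sample` and `losers_sample` sets are hypothetical stand-ins chosen for illustration, not the full election data; the point is only that a name appearing on both sides cannot be matched and not matched at once.

```python
import re

# The regex from the xkcd hover text, copied verbatim.
xkcd = re.compile(r'bu|[rn]t|[coy]e|[mtg]a|j|iso|n[hl]|[ae]d|lev|sh|[lnd]i|[po]o|ls')

# Hypothetical mini-samples (illustration only, not the full lists):
# Carter won in 1976 and lost in 1980, so he belongs to both sets.
winners_sample = {'carter', 'reagan'}
losers_sample = {'carter', 'mondale'}

# Names that are both winners and losers; any such name makes the task impossible.
both = winners_sample & losers_sample

# Which names does the regex match anywhere in the string?
matched = {name: bool(xkcd.search(name))
           for name in sorted(winners_sample | losers_sample)}

print(both)     # names in conflict
print(matched)  # regex verdict for each name
```

Here the regex matches "carter" (via `[rn]t`), which is right for Carter the winner and wrong for Carter the loser. One way out, assuming we are free to redefine the game slightly, is to drop from the losers anyone who also appears among the winners.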

But this got me thinking: can I come up with an algorithm to match or beat Randall's regex golf scores? The game is on.

I started by finding a listing of presidential elections, giving me these winners and losers: