What is Menhir?

Menhir is an LR(1) parser generator for the OCaml programming language. That is, Menhir compiles LR(1) grammar specifications down to OCaml code. Menhir was designed and implemented by François Pottier and Yann Régis-Gianas.

Menhir is 90% compatible with ocamlyacc. Legacy ocamlyacc grammar specifications are accepted and compiled by Menhir. The resulting parsers run and produce correct parse trees. However, parsers that explicitly invoke functions in module Parsing behave slightly incorrectly. For instance, the functions that provide access to positions return a dummy position when invoked by a Menhir parser. Porting a grammar specification from ocamlyacc to Menhir requires replacing all calls to module Parsing with new Menhir-specific keywords.
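As an illustration of the porting step described above, the fragment below shows one production written twice: first in ocamlyacc style, with positions obtained through module Parsing, then in Menhir style, with the position keywords $symbolstartpos and $endpos. It is a sketch only: the token PLUS, the nonterminal expr, and the constructor Add are hypothetical names chosen for the example, and the two versions are alternatives, not meant to appear in the same file.

```ocaml
(* ocamlyacc style: positions come from module Parsing, which returns
   dummy positions when the parser is generated by Menhir. *)
expr:
  expr PLUS expr
    { Add (Parsing.symbol_start_pos (), Parsing.symbol_end_pos (), $1, $3) }

(* Menhir style: the same production, using Menhir's position keywords.
   Add is a hypothetical constructor taking two positions and two operands. *)
expr:
| e1 = expr PLUS e2 = expr
    { Add ($symbolstartpos, $endpos, e1, e2) }
```

Similarly, a call such as Parsing.rhs_start_pos 1 would become $startpos($1), or $startpos(e1) when the symbol is named.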

Why prefer Menhir to ocamlyacc?

Menhir allows the definition of a nonterminal symbol to be parameterized by other (terminal or nonterminal) symbols. Furthermore, it offers a library of standard parameterized definitions, including options, sequences, and lists. It offers some support for EBNF syntax, via the ?, +, and * modifiers.

ocamlyacc only accepts LALR(1) grammars. Menhir accepts LR(1) grammars, thus avoiding certain artificial conflicts.

Menhir's %inline keyword helps avoid or resolve some LR(1) conflicts without artificial modification of the grammar.

Menhir explains conflicts in terms of the grammar, not just in terms of the automaton. Menhir's explanations are believed to be understandable by mere humans.

Menhir supports incremental parsing (in --table mode only). This means that the state of the parser can be saved at any point (at no cost) and that parsing can later be resumed from a saved state.

Menhir offers an interpreter that helps debug grammars interactively.

Menhir allows grammar specifications to be split over multiple files. It also allows several grammars to share a single set of tokens.

Menhir produces reentrant parsers.

Menhir is able to produce parsers that are parameterized by OCaml modules.

ocamlyacc requires semantic values to be referred to via keywords: $1, $2, and so on. Menhir allows semantic values to be explicitly named.
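Several of the features listed above can be seen together in a small grammar. The following is a minimal sketch, not taken from the Menhir distribution: the token names, the nonterminals main, expr, and binop, and the user-defined parameterized nonterminal parens(X) are all assumptions made for the example. It uses separated_list from the standard library, a parameterized definition, named semantic values in place of $1 and $3, and %inline so that the precedence declarations on PLUS and TIMES still apply to binary expressions.

```ocaml
%token <int> INT
%token PLUS TIMES LPAREN RPAREN COMMA EOF
%left PLUS
%left TIMES
%start <int list> main
%%

(* A comma-separated list of expressions, built with the standard
   library's separated_list. *)
main:
| es = separated_list(COMMA, expr) EOF
    { es }

(* A user-defined nonterminal parameterized by the symbol X. *)
parens(X):
| LPAREN x = X RPAREN
    { x }

(* Semantic values are named (i, e, e1, op, e2) rather than numbered. *)
expr:
| i = INT
    { i }
| e = parens(expr)
    { e }
| e1 = expr op = binop e2 = expr
    { op e1 e2 }

(* Because binop is inlined into expr, PLUS and TIMES appear directly in
   the productions of expr and the %left declarations resolve conflicts. *)
%inline binop:
| PLUS  { ( + ) }
| TIMES { ( * ) }
```

Without %inline, the production for binary expressions would contain no token, so the precedence declarations would have nothing to attach to and the conflicts would remain.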

There are other differences, documented in Menhir's reference manual.

Documentation

The reference manual is available in HTML and PDF formats.

The MenhirLib.Convert API offers facilities for converting back and forth between the traditional parser API (which assumes that the lexical analyzer is produced using ocamllex) and a revised API (which makes no such assumption).

The incremental API is defined by MenhirLib.IncrementalEngine (and is also explained in the reference manual).
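A driver built on the incremental API might look like the sketch below. It assumes a grammar compiled with menhir --table into a module named Parser with start symbol main, and an ocamllex-generated function Lexer.token; these three names are hypothetical. The functions lexer_lexbuf_to_supplier and loop_handle and the submodule Incremental belong to the API that Menhir generates in --table mode.

```ocaml
(* Parser and Lexer are hypothetical modules: Parser is assumed to be
   generated by `menhir --table` from a grammar whose start symbol is
   `main`, and Lexer.token is assumed to be produced by ocamllex. *)
module I = Parser.MenhirInterpreter

(* Runs the parser over [lexbuf]. Returns [Some v] when the input is
   accepted with semantic value [v], and [None] on a syntax error. *)
let parse (lexbuf : Lexing.lexbuf) =
  (* A supplier hands the parser one token, with its positions, per call. *)
  let supplier = I.lexer_lexbuf_to_supplier Lexer.token lexbuf in
  (* The initial checkpoint: the parser state before any input is read. *)
  let start = Parser.Incremental.main lexbuf.Lexing.lex_curr_p in
  I.loop_handle
    (fun value -> Some value)   (* called when the input is accepted *)
    (fun _checkpoint -> None)   (* called when a syntax error is detected *)
    supplier start
```

Here loop_handle drives the parser to completion. Code that needs to save and resume parser states would instead keep hold of intermediate checkpoint values and advance them step by step, as described in the reference manual.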

Downloading

Source code releases of Menhir are available for download. Compiling and installing requires GNU make and OCaml (version 4.02 or later). Bug reports and suggestions are welcome. A list of recent changes is also available.

Menhir is also available through opam. Once you have installed opam, just type opam install menhir.

Mailing list

There is a mailing list for announcements of new releases and for discussion of problems, bugs, feature requests, and so on. Only subscribers can post.