Eli Bendersky has mentioned some problems of parsing VHDL. I was quite pleased though finding the Postlexer idea being discussed in the following paper which provides a more thorough treatment of VHDL parsing issues.

They call it “Auxiliary Terminal Generator” which is a more precise notation of what I call a “Postlexer”. In that case a Postlexer resolves ambiguities caused by context sensitivity. The solution is to introduce auxiliary terminal symbols in between two context free passes – the one for lexical analysis and parsing.

What I distaste about their approach is that they coupled the grammar with the Postlexer using parsing actions. Apparently generations of computing scientists enjoyed squeezing as much information as possible into their grammars instead of acknowledging the obvious: a separate parsing phase is often required for non-trivial grammars and since the problems are specific one can leave it entirely to a powerful general purpose language and the programmer to solve them. As a consequence the parser isn’t portable across languages but at least the grammars are.

Generally speaking one has to consider three functions now instead of two:

The Parser Sandwich

Lexer: Parsetable String -> ; Tokenstream Postlexer: Tokenstream -> ; Tokenstream Parser: Parsetable Tokenstream -> ; Tree

Lexer and Parser shall be universally applicable for any language whereas the Postlexer is always specific and can be omitted if the whole language can be described in a context free way.