A simple cryptography method can produce the unusual language-like features of a mysterious manuscript from the Middle Ages. The finding suggests that the famous Voynich manuscript may be an elaborate hoax, not a secret language to be decoded.

The manuscript has baffled cryptographers since book dealer Wilfrid Voynich found it in an Italian monastery in 1912. It contains hundreds of pages of fine calfskin parchment, which scientists have dated to the first half of the 15th century. Pages of indecipherable text are accompanied by illustrations of exotic unidentified plants, naked nymphs, plant-based pharmaceuticals, astrological diagrams and other material that no one has been able to identify.

Academics continue to hotly debate whether the manuscript is an elaborate hoax designed to fool 15th-century book collectors, or a detailed secret code that remains unbroken. Advocates for a meaningful code argue that the text shows similarities to texts written in natural languages. For instance, the distributions of words and syllables follow a linguistic pattern called Zipf’s law.


But Gordon Rugg of Keele University in the UK, who has spent more than a decade studying the text, argues that even such apparently natural features would be easy to fake with a few simple techniques.

The fact that the manuscript remains undeciphered after decades of research implies that if there was a code in it, then it was either “anachronistically sophisticated” or “based on some radically different underlying approach from any known code”, writes Rugg in a new paper.

Meaningless gibberish?

Rugg previously developed a simple card-based system that allowed him to replicate the syllable structures of “Voynichese”. He first drew a grid with several syllables comprising what appear to be the roots, prefixes and suffixes of Voynichese words.

He then overlaid a card with three holes cut out of it on the grid, revealing a set of three syllables in the underlying table. By moving the card across the grid, Rugg could come up with different combinations of syllables and produce new words.
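The grid-and-card procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Rugg's actual table or card: the syllables in `TABLE` are invented, the hole positions in `CARD_OFFSETS` are arbitrary, and letting the card wrap around the bottom of the grid is a simplification made here.

```python
import random

# Hypothetical syllable grid: each row holds (prefix, root, suffix).
# These strings are invented for illustration, not transcribed Voynichese.
TABLE = [
    ("qo", "ke", "dy"),
    ("o", "te", "y"),
    ("ch", "ol", "in"),
    ("sh", "ka", "ey"),
    ("", "da", "iin"),
    ("y", "che", "ol"),
]

# Hypothetical card: one hole per column, each at a fixed row offset
# relative to where the card is placed on the grid.
CARD_OFFSETS = (0, 1, 2)


def read_word(table, offsets, row):
    """Return the word seen through the card's three holes at position `row`.

    Assumption: the card wraps around the bottom of the grid.
    """
    n = len(table)
    return "".join(table[(row + off) % n][col] for col, off in enumerate(offsets))


def generate(table, offsets, count, seed=0):
    """Place the card at `count` random rows and collect the resulting words."""
    rng = random.Random(seed)
    return [read_word(table, offsets, rng.randrange(len(table))) for _ in range(count)]


if __name__ == "__main__":
    print(" ".join(generate(TABLE, CARD_OFFSETS, 8)))
```

With this toy grid a single card yields only six distinct words, one per row. A larger grid, or several cards with different hole positions, would give far more variety while keeping the same prefix, root and suffix structure.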

Now, he has used this method to generate a series of gibberish words that follow Zipf’s law, producing a word-frequency distribution similar to that of real natural-language texts. Rugg says this result shows that just because the Voynich text looks a lot like a language doesn’t mean it is one.
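Zipf’s law says that a word’s frequency is roughly inversely proportional to its rank in the frequency table, so a plot of log frequency against log rank falls close to a straight line with a slope near -1. One rough way to check a list of words, whether real or generated, is sketched below. The function name `zipf_slope` and the plain least-squares fit are choices made here for illustration. They are not taken from Rugg’s paper, and a careful analysis would use a more robust estimator.

```python
import math
from collections import Counter


def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank).

    A value near -1 is consistent with Zipf's law. This is a rough
    diagnostic only, not a formal statistical test.
    """
    counts = sorted(Counter(words).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic corpus whose counts are exactly 60 / rank.
    corpus = []
    for rank, word in enumerate(["a", "b", "c", "d", "e", "f"], start=1):
        corpus.extend([word] * (60 // rank))
    print(zipf_slope(corpus))
```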

“We have known for years that the syllables are not random. What I’m saying is there are ways of producing gibberish which are not random in a statistical sense,” he says. “It’s a bit like rolling loaded dice. If you roll dice that are subtly loaded, they would come up with a six more often than you would expect, but not every time.”

Debate rages

Marcelo Montemurro at the University of Manchester in the UK has argued that the manuscript does contain meaningful text, and has completed a statistical analysis comparing its structure to that of classic works in several languages. He disagrees with Rugg’s suggestion of a hoax, arguing that the manuscript has too many layers of complexity for a simple hoaxer to produce.

“It is not impossible that these tables can generate Zipf’s law, in the same way that it is not impossible to win the lottery 10 times. It is still very unlikely,” he says. “Bringing in all of these narratives to explain something makes it sound so far-fetched. They are writing a thriller, not a scientific paper.”

In a separate analysis, Montemurro found statistical similarities between the botanical and pharmaceutical sections of the manuscript, both in the art and in the indecipherable words.

“That means whoever made the hoax was aware of these subtle layers of structure that are very difficult to find just by looking at the text,” he says. “We cannot say for certain whether it is a hoax, or hides a message. But we can say, whoever wants to propose that it is a hoax needs to explain how all of this can arise spontaneously without the author planning all these things.”

Rugg begs to differ, arguing that his method shows it would be simple to generate complex-seeming text. “We don’t need to say it has to be a code because a hoax is too difficult. If a hoax is feasible, I think the burden of proof shifts onto people saying, ‘No, this is a code’,” he says.

Although the manuscript has been a subject of scrutiny for a century, Rugg says that Voynich researchers still have some tricks up their sleeves. They could, for example, study correlations between shifts in the frequency of different syllables within the text, he says.

Alternatively, they could look for erasures and corrections, comparing how frequently those happened against other handwritten documents from the time. “There are testable hypotheses that somebody can go out and try,” Rugg says. But, he says, he won’t be the one to do it. “I don’t think there will ever be a resolution that everybody will be happy with.”

Reference: Cryptologia, DOI: 10.1080/01611194.2016.1206753