Comparison of the Voynich manuscript with other information-carrying sequences. A) Information in word distribution as a function of scale for the Voynich manuscript compared with five other language and symbolic sequences (F: Fortran; C: Chinese; V: Voynich; E: English; L: Latin; Y: yeast DNA). The number of words in every sequence was equal to that of the Voynich text; if the original sequence was longer, the additional words were not considered. B) Scale of maximal information for the sequences considered in A. Credit: doi:10.1371/journal.pone.0066344.g001

(Phys.org) Theoretical physicist Marcelo Montemurro and colleague Damián H. Zanette have published a paper in the journal PLOS ONE claiming that the Voynich text is likely not a hoax, as some have suggested. The two researchers, along with others at the University of Manchester in the U.K., analyzed a digital copy of the text and say that computer-assisted analyses of the "book" suggest it does harbor meaning, though what that meaning might be is still a mystery.

The Voynich text is a book made up of 104 folios; each page has graphemes (arrays of characters) and drawings on it. It first came to light in 1912, when Wilfrid Voynich claimed to have found it in an Italian monastery. The graphemes suggest words made up of characters that do not appear in any other known language. Since its discovery, various researchers have sought to determine whether the text is written in an unknown language or is instead a book created by someone as a hoax. Adding to the mystery are the drawings of plants on most of the pages, none of which are known to exist in nature. Carbon dating of the text suggests it was created sometime in the 1400s, but that doesn't prove that the writing on the parchment was done during that period, leaving some to suggest it was Voynich himself who created the characters and drawings. To date, no one has been able to prove whether the text has meaning or is simply pages of gibberish. To learn more, Montemurro and his team turned to advanced computer analysis.

To analyze the text, researchers assign modern-language letters to its characters, which allows algorithms to be applied. In this case, the team looked at global patterns of "words" that appear throughout the text, an approach they describe as a novel way to view its semantics. One measure of such patterns, known as "entropy," allows researchers to compare documents with one another by computer. The method yields a single number that describes the complexity of a text. The Voynich text received a score of 805, compared with 728 for text samples written in English and 580 for those written in Chinese. A comparison of the Voynich score with those of yeast DNA samples (25) and a program written in Fortran (285) suggests the Voynich text is more complicated than simple gibberish.
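To give a rough sense of what an entropy-style measurement on "words" looks like, here is a minimal Python sketch. It is not the authors' procedure and will not reproduce the scores quoted above; the function names `word_entropy` and `section_information`, and the idea of cutting the text into equal-sized sections, are illustrative assumptions. The first function computes the Shannon entropy of the word-frequency distribution; the second estimates how much knowing a word tells you about which section of the text it came from.

```python
import math
from collections import Counter


def word_entropy(words):
    """Shannon entropy, in bits per word, of the word-frequency distribution."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def section_information(words, section_size):
    """Mutual information, in bits, between a word and the section it sits in.

    The text is cut into consecutive sections of `section_size` words
    (a hypothetical parameter); leftover words at the end are dropped.
    """
    n_sections = len(words) // section_size
    if n_sections < 1:
        raise ValueError("text is shorter than one section")
    words = words[: n_sections * section_size]
    total = len(words)

    word_counts = Counter(words)
    # Count how often each word appears in each section.
    joint = Counter((w, i // section_size) for i, w in enumerate(words))
    p_section = section_size / total  # all sections have the same size

    info = 0.0
    for (w, _section), c in joint.items():
        p_joint = c / total
        p_word = word_counts[w] / total
        info += p_joint * math.log2(p_joint / (p_word * p_section))
    return info
```

A text whose words are spread evenly across sections scores near zero on the second measure, while a text in which certain words cluster in certain sections scores higher. Repeating the calculation for several values of `section_size` gives a crude picture of how that clustering changes with scale.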

The team notes that the text also conforms to Zipf's law, which states that in real languages the frequency of a word is inversely proportional to its rank in a frequency table. Taken together, the researchers conclude, these findings indicate that the Voynich text most likely contains real information and thus is not a hoax.
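Zipf's law can be checked in a few lines of code: rank the words by how often they occur, then see whether frequency falls off roughly as one over rank. The Python sketch below is only an illustration, not the researchers' analysis; the function name `zipf_slope` is hypothetical, and fitting a straight line by ordinary least squares on a log-log scale is a simple but crude choice.

```python
import math
from collections import Counter


def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank).

    A slope close to -1 means frequency is roughly inversely
    proportional to rank, as Zipf's law describes.
    """
    counts = sorted(Counter(words).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words")

    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy input: five made-up "words" with counts 60, 30, 20, 15, 12,
# i.e. exactly 60 / rank.
toy = []
for rank, token in enumerate(["w1", "w2", "w3", "w4", "w5"], start=1):
    toy.extend([token] * (60 // rank))
slope = zipf_slope(toy)
```

On real text the slope is only approximately -1 and depends on how the text is split into words, so a result like this is suggestive rather than conclusive.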


More information: Montemurro MA, Zanette DH (2013) Keywords and Co-Occurrence Patterns in the Voynich Manuscript: An Information-Theoretic Analysis. PLoS ONE 8(6): e66344. doi:10.1371/journal.pone.0066344

Abstract

The Voynich manuscript has remained so far as a mystery for linguists and cryptologists. While the text written on medieval parchment -using an unknown script system- shows basic statistical patterns that bear resemblance to those from real languages, there are features that suggested to some researchers that the manuscript was a forgery intended as a hoax. Here we analyse the long-range structure of the manuscript using methods from information theory. We show that the Voynich manuscript presents a complex organization in the distribution of words that is compatible with those found in real language sequences. We are also able to extract some of the most significant semantic word-networks in the text. These results together with some previously known statistical features of the Voynich manuscript, give support to the presence of a genuine message inside the book.

Journal information: PLoS ONE

© 2013 Phys.org