Text Analysis - the Writing System

Introduction

The main mystery of the Voynich MS is clearly its unknown writing. This topic is addressed from three different aspects, on three (sets of) pages:

A look at the writing system, describing its main properties, and similarities and differences with other known writing systems;

Transliteration of the text;

Statistical analysis of the text of the MS (further subdivided into five areas).

This page addresses the first part, the analysis of the writing system. Following are the main topics of this page:

Main text writing

Almost the entire Voynich MS is written in a script that is not found in any other surviving (old) document. The text of the MS has been written mostly in a line-by-line manner, obviously from top to bottom and from left to right. The majority of this text is written in short paragraphs, which are often separated from each other by a larger line spacing. Herbal pages typically have two or three such paragraphs, which tend to occupy mostly the upper half of the page, clearly avoiding the herb drawing. Pages in the so-called biological section tend to have much more text, filling the entire page, sometimes with just three paragraphs, but occasionally also more. The text tends to have a straight left margin, and is only roughly right-justified, except for the last line of each paragraph which tends to be shorter.

The text consists of groups of characters separated by spaces, and these groups seem to form words. The same words tend to appear throughout the MS, with a frequency distribution that is quite normal for a meaningful text (1).
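A word frequency distribution of this kind is commonly checked by ranking the words by how often they occur and comparing the result with Zipf's law, under which frequency falls off roughly in inverse proportion to rank. The following is a minimal sketch of such a check, not the analysis behind the statement above. The function names are illustrative, and the sample tokens are placeholders rather than a real transliteration of any page.

```python
from collections import Counter
import math


def rank_frequency(words):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically for a stable result.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def zipf_slope(table):
    """Least-squares slope of log(count) against log(rank).

    A slope near -1 is what Zipf's law predicts for natural-language text.
    Needs at least two distinct words.
    """
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(count) for _, _, count in table]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Placeholder tokens (an assumption for illustration, not real MS text).
sample = "daiin chedy daiin ol chedy daiin qokeedy ol shedy daiin".split()
table = rank_frequency(sample)
for rank, word, count in table:
    print(rank, word, count)
print("slope:", round(zipf_slope(table), 2))
```

On a sample this small the slope means very little; the check only becomes informative on a full transliteration with thousands of word tokens.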

In some places, single 'words' are written near elements of drawings. These have come to be called 'labels'. There is a clear suggestion that these words provide the name of the object in question. Many of the label words also occur in the running text, though when they do, this is rarely in the immediate vicinity of where the label occurs.

The following types of labels may be found in the Voynich MS:

There are also places in the MS, for example in the cosmological section, where single words appear as elements in the overall design, but not necessarily near an identifiable object.

The term 'titles' was introduced by John Grove. This term is related to the layout of the last lines of some paragraphs. Normally, these last lines are left-justified and do not reach the right margin. Three alternative formats are used occasionally:

The last line is short and centred



The last line is short and right-justified



The last line is left-justified, but has additional words that are right-justified



The last example is strictly speaking what John Grove called titles, but all three cases are of interest. I have counted 17 pages that include centred end lines, 11 pages with right-justified end lines and 5 pages that use 'titles'.

Almost all astronomical, astrological and cosmological pages have circular drawings, some with text written in normal paragraphs, but all with text that has been integrated in the drawings. Frequently, text is written along the circumference of these circles, and occasionally also along radial lines. The following figure shows an example of text along the circumferences and along radii of such a circular drawing.

In some circular designs and occasionally also in the margins of some pages, sequences of single characters or short words may be found. These are usually referred to as key-like sequences. They have been included in the table at the bottom of this page, with the keyword 'SEQ'.

The circular diagram on f57v deserves special mention here, as there appear to be several sequences of words and characters integrated in this figure. Since there is no reason to call these 'additional' or 'extraneous', these alone are not included in the table.

Non-sequential writing

In a few places it appears as if the first characters of lines were written in a vertical column first, possibly in order to create a straight left margin. The remainder of the text was then added later. The following example is one paragraph on f88v. Especially in the last two lines of this fragment, there is a strong suggestion that the initial characters were written first, and the remainder of the line was written later.

In addition, there are several places in the MS where writing seems to have been added 'afterwards' or where the text was perhaps not written in a strictly top-down line-by-line manner. One clear example is f105r, where one may observe a text break and a use of different inks before and after it, and also a partial line added above the last paragraph in the darker ink. There are more examples of this type of change in the MS.

While the earlier owners of the Voynich MS (Barschius, Kircher) may possibly have believed that the writing represented a language unknown to them, nowadays we know that there is no other old MS that uses the same writing. Ever since it was brought to light by Voynich, people have compared the script with examples of writing. A summary of this is provided in section 4 of D'Imperio (1978) (2), and this may serve as the basis for the following.

A number of characters are very similar to Latin characters, and look like 'a', 'c', 'i' (undotted), 'm', 'n' and 'o' ( a e i iin in o );

Others look like numerals, such as 2, 4, 8 and 9 ( r l q d y ). Of '4' both an early Arabic and a modern variety exist;

Some characters look like abbreviations found in Latin medieval manuscripts. They are briefly discussed below;

The remaining set has been compared with alchemical symbols or early renaissance ciphers. These are also briefly addressed below.

In medieval manuscripts numerous more or less standard abbreviations and ligatures were in use. A collection of such abbreviations may be found in the dictionary of Cappelli (1912) (3), which lists, in alphabetical order, a large number of abbreviations found in Latin and Italian manuscripts. While browsing through this dictionary, one will immediately recognise the similarity between some of its entries and the writing in the Voynich MS. D'Imperio (see note 2) shows a few cases in Fig.17. Some of these are also included below.

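| Voynich character(s) | Latin abbreviation for |
|---|---|
| c | cum, con |
| ch | ra, ci, cri |
| Co | co, quo |
| Ca | ca |
| Cy | cus |
| s | cun, con, ... |
| m | -nd-, -nt- |
| g | eius |
| y | con, cum, -us, ... |
| n | ter, in-, ... |
| in | -um |
| Sh | termi |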

A particular set of characters in the Voynich MS is usually called 'gallows' characters. They ascend above the majority of other characters. These are not typically found in medieval manuscripts, also not as abbreviation signs. There are four of these:

They also occur in combination with the character ch, in which case they have been called 'pedestalled gallows':

While these characters are not typical abbreviations, one illustration in Cappelli (1912) (4) provides a striking comparison:

Alchemical symbols

Figure 42 in D'Imperio, taken from Gessmann (1922) (5), shows some similarities between Voynich MS characters and a few alchemical symbols. There are only a few examples, and it is not certain whether this is coincidental or not, since I am not aware that these alchemical symbols were in use in the early 15th century.

14th and 15th Century cipher

Potentially of great interest is the comparison between the characters in the Voynich MS and the symbols found in early renaissance cipher systems. D'Imperio (Fig.39) shows examples from a cipher of Parma (1379), a Venetian cipher (1411) and the Code of Urbino (1440). While these do not show exactly the same characters as the Voynich MS, there are some striking similarities, and the author of the MS may well have been inspired by these, or similar, examples.

Of particular interest is the codex of Tranchedino (MS Vindobonensis 2398 in Vienna), which has been issued in a facsimile edition (6). This MS lists numerous, rather similar sets of ciphers to be used with different correspondents, and they are dated from 1450 to 1496. None of these ciphers uses the gallows characters mentioned above, but the common sequence qo (see also the various illustrations of Voynich MS text above) appears several times, typically representing a single character. Other typical features of the ciphers in Tranchedino are that they include nulls (characters that are introduced but have no meaning), and that double characters are usually represented by a single code character.
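The two features mentioned last, nulls and single symbols standing for double characters, can be illustrated with a toy substitution cipher. This is only a sketch of the general technique. The key below is invented for the example and is not taken from Tranchedino or any other historical cipher, and digits stand in for what would have been invented symbols.

```python
import random

# Hypothetical toy key (an assumption for illustration only).
LETTER_SYMBOLS = {"a": "1", "l": "2", "t": "3", "e": "4", "r": "5"}
DOUBLE_SYMBOLS = {"ll": "7", "tt": "8"}  # one symbol for a doubled letter
NULLS = ["0", "9"]                       # symbols that carry no meaning


def encipher(plaintext, null_rate=0.3, rng=None):
    """Substitute letters, merge known doubles, and sprinkle in nulls."""
    if rng is None:
        rng = random.Random()
    out = []
    i = 0
    while i < len(plaintext):
        pair = plaintext[i:i + 2]
        if pair in DOUBLE_SYMBOLS:
            # A doubled letter is written with a single cipher symbol.
            out.append(DOUBLE_SYMBOLS[pair])
            i += 2
        else:
            out.append(LETTER_SYMBOLS[plaintext[i]])
            i += 1
        # Occasionally insert a null to disturb the symbol statistics.
        if rng.random() < null_rate:
            out.append(rng.choice(NULLS))
    return "".join(out)


def decipher(ciphertext):
    """Drop the nulls and map every remaining symbol back to plaintext."""
    reverse = {symbol: letter for letter, symbol in LETTER_SYMBOLS.items()}
    reverse.update({symbol: pair for pair, symbol in DOUBLE_SYMBOLS.items()})
    return "".join(reverse[s] for s in ciphertext if s not in NULLS)


secret = encipher("lettera", rng=random.Random(0))
print(secret, "->", decipher(secret))
```

Both features change the visible statistics of the enciphered text: nulls add symbols that correspond to nothing in the plaintext, and the double-letter symbols remove repeated characters.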

Among the few known examples of manuscripts from the 15th century that have been written in code are two manuscripts by the humanist Giovanni Fontana (ca. 1395 - ca. 1455). These are also available in facsimile edition. More information about Fontana and his cipher may be found on a web page by Philip Neal (7).

Initials

The initial characters of many paragraphs are larger than usual, and are sometimes embellished with additional curls or dots. Not all characters in the MS can appear in a paragraph-initial position, and indeed these characters form only a very small subset.

In addition to this, there are two initials highlighted in red on the first page of the MS (f1r), which are not standard characters in the Voynich MS alphabet. These have been the subject of some speculation, and many years ago I was struck by how one of them appears almost identical to the old Aries symbol in Greek astronomical manuscripts. More recently, they have been observed in Spanish manuscripts - see for example a blog page by J.K. Petersen. These symbols have been used for centuries as emphasis or paragraph markers, and the strong suggestion is that the Voynich MS scribe was familiar with these.

Rare characters

The text of the MS includes numerous rare characters. Many of these appear only once in the MS, and could be considered aberrant forms of standard characters. Others appear several times, and are concentrated in some pages of the MS. One of these has been called the 'picnic table' (x). Numerous other examples may be found in the next page about transliteration of the text.

General

Until as recently as 2018, a still largely unexplored area of the Voynich MS was a paleographic analysis of the writing. Since the script is unique, a general classification by comparison with existing scripts cannot be made very easily, but a few comments have been recorded. They fall into two categories.

The first is whether the type of script can be placed in a geographical or temporal context. In this respect, the herbalist Sergio Toresella stated in 1995 (8) that the handwriting of the MS was a humanist hand, which belongs to 15th century Italy. While some doubts have been expressed over the years by non-experts, this assessment was recently confirmed by another handwriting expert (9). An interesting point with respect to the humanist hand is that it was not at all popular in the time before printing: it was an elitist hand primarily used for classical texts, but not for medical texts or by scribes (see note 9).

The second question is whether the text is all by a single scribe, or whether one may identify different scribes, or at least different hands. Several experts have stated that the handwriting appears to be uniform throughout the MS. Panofsky made a short remark to this effect (10), and a more specific statement from A.H. Carter (1946) is quoted in section 3.1 of D'Imperio (see note 2), who even goes further and says:

Because the same ink and the same kind of penstrokes appear in the illustrations and because the text forms an integral and unified part of many of the illustrations, it appears probable that the same person wrote the text and drew the illustrations.

Toresella also stated (see note 8) that the entire MS appears to be in the same hand.

A well-known contrary opinion was stated in the 1970's by Prescott Currier, who was the first to point out (11) a variation in handwriting style. He also correlated these with textual statistics, which are described on another page (12). He called the main hands '1' and '2', and the textual variations languages 'A' and 'B', and proposed that the 'A' text was written mostly in hand '1' and the 'B' text mostly in hand '2'. Currier furthermore tentatively identified several more hands, which he called 3, 4, 5, X and Y.

The following 'cuts' from various pages of the Voynich MS attempt to illustrate the different handwriting styles used in the different sections of the manuscript as identified by Currier. His classification of languages into A or B has been reflected in the font used in the captions: red and bold for A-language, blue and italic for B-language, and neutral when no identification was given.

Each image represents a similar-sized section of a page.

f2r, Herbal; f26r, Herbal; f70r2, Cosmological; f79v, Biological; f86v5, Text-only; f88v, Pharmaceutical; f116r, Recipes

At first sight, only the Herbal-B page seems different, and it would appear to my inexpert eyes as if Currier's correlation between hands and languages, which he based mostly on the herbal section, may not be valid for the other sections.

It is worth noting here, that the features of the last lines of paragraphs described under the term 'titles' above, mostly appear in both 'styles' identified by Currier. Centred and right-justified lines appear in pages of all hands, while titles in the strict meaning of the word are only found on pages in language A and Currier hand 1 (13).

Recent developments - Lisa Fagin Davis

More recently, the medievalist Lisa Fagin Davis has taken up the task of analysing the handwriting in the MS. Having studied the entire MS, she has come to the conclusion that five different hands can be identified. These results were initially announced on various social media, and on 24 January in a presentation at the 2020 Annual Meeting of the Bibliographical Society of America. They were also mentioned in a video interview published in April 2020 (14). They were first published completely in May 2020 (15). She does not agree with all aspects of Currier's handwriting analysis but confirms Currier's hands 1 and 2, and adds hands 3, 4 and 5.

An important difference between these two sets of identifications is that Lisa Fagin Davis has provided an identification for all pages, while Currier misses many. Furthermore, in the newer case we have a clearly-defined and justified explanation of the criteria that were applied for the identification, from a recognised authority in the field.

What both identifications have in common, is that the split between the different hands is essentially along bifolios. In the new identifications, the exceptions are clearly identified as three cases only:

Of the bifolio with fol. 57 and fol. 66, only 57v is in a different hand.

Of the single large sheet in quire 14, the two sides are in different hands.

Of the bifolio with fol. 104 and fol. 115, only the first 12 lines of fol. 115r are in a different hand. It is a new and so far unique feature that different hands are observed on a single page.

Currier also had several exceptions in this area, in particular related to his 'lesser hands' 3, 4, 5, X and Y.

The differences between the two hand identifications are summarised in the following table.

| Folio(s) | Currier | Lisa Fagin Davis |
|---|---|---|
| f57v, f89, f90, f99-f102 | - | 1 |
| f115r (1st 12 lines) | - | 2 |
| f58, f65, f107-f116r | - | 3 |
| f67-f73 | - | 4 |
| f66 | - | 5 |
| f41, f48, f57r | 2 | 5 |
| fRos (obverse) | 3 | 2 |
| fRos (main) | 3 | 4 |
| f87, f88, f93, f96 | 4 | 1 |
| f94, f95 | 5 | 3 |
| f103, f104, f106 | X | 3 |
| f105 | Y | 3 |
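For anyone who wants to work with these differences programmatically, the table can be held as a small list of records. The sketch below is only one possible representation. The row data are copied from the table above (reading the bare '96' as f96), and the function names are illustrative. `None` marks folios for which Currier gave no identification.

```python
from collections import defaultdict

# (folios, Currier hand, Davis hand), copied from the table above.
# None means that Currier gave no identification for these folios.
HANDS = [
    ("f57v, f89, f90, f99-f102", None, "1"),
    ("f115r (1st 12 lines)", None, "2"),
    ("f58, f65, f107-f116r", None, "3"),
    ("f67-f73", None, "4"),
    ("f66", None, "5"),
    ("f41, f48, f57r", "2", "5"),
    ("fRos (obverse)", "3", "2"),
    ("fRos (main)", "3", "4"),
    ("f87, f88, f93, f96", "4", "1"),
    ("f94, f95", "5", "3"),
    ("f103, f104, f106", "X", "3"),
    ("f105", "Y", "3"),
]


def newly_identified(rows):
    """Folio groups that only the newer analysis assigns to a hand."""
    return [(folios, davis) for folios, currier, davis in rows
            if currier is None]


def relabelled(rows):
    """Folio groups where both analyses give a hand and the labels differ."""
    return [(folios, currier, davis) for folios, currier, davis in rows
            if currier is not None and currier != davis]


def currier_labels_by_davis_hand(rows):
    """For each Davis hand, the Currier labels that it absorbs in this table."""
    merged = defaultdict(set)
    for _, currier, davis in rows:
        if currier is not None:
            merged[davis].add(currier)
    return {davis: sorted(labels) for davis, labels in merged.items()}


print(len(newly_identified(HANDS)), "folio groups newly identified")
print(len(relabelled(HANDS)), "folio groups with a different label")
print(currier_labels_by_davis_hand(HANDS))
```

Grouping by the newer hand shows, for example, that the folios Currier assigned to his hands 5, X and Y all fall under hand 3 in the newer identification.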

One unusual feature of the writing of the Voynich MS is that it appears to have no corrections. The first mention of this (that I am aware of) is in a letter preserved in the Beinecke Library, from Anne Nill to Theodore Petersen, dated 19 Feb. 1953. She writes:

I remember I talked too much, but did I really say "the ms. does not include a single erasure or correction"; whatever I said, this is my present opinion: in all my experience of manuscripts I have never come across one in which corrections and erasures are so unobtrusive as they are in this ms. if it contains any. I have looked through it again - of course not every word, and have nothing to add to the one or two probable corrections I recorded c. 1936 when I worked with photostats.



In this connection I must add that we still have a few Voynich estate mss. (requiring additional research which I never found time for, before I attempt to sell them) which include one beautifully written text on fine vellum and some well and some poorly written manuscripts; and in none of them is there any difficulty in detecting corrections, erasures, deletions or transpositions.

This apparent lack of corrections has occasionally been taken as evidence that the scribe could not understand what he wrote, or even that the text is meaningless.

Ever since high-resolution images of the MS have been publicly available, closer scrutiny by many people has revealed a few cases where it appears that the text has been emended. Following are those cases I am aware of, which show that such corrections are few and minor.

f16r, f20v, f24v, f39r, f39r, f42r, f50v, f79r, f80r, f83r, f102v2, f112r

This general term is used to indicate several different types of additional writing that may be found in the MS. Only very few barely legible phrases in the normal (non-Voynich) alphabet may be observed in the Voynich MS. These are listed below, including the numbering of folios and quires.

Folio and Quire numbers

Quire marks are found most commonly (but not always) on the verso of the last folio in each quire, in the lower right corner, while folio numbers have been added in the upper right corner of the folios, when the pages were folded in. Both appear to have been added after completion of the MS. They are discussed in some more detail on a previous page.

On several of the cosmological pages the characters 'a', 'b' and 'c' have been written in the top corners: fol. 67r1, 2, fol. 68r1, 2, 3, fol. 70r1, 2. These are written in pencil and were obviously added by a later owner of the MS. This may date from as late as the Jesuits of the Collegium Romanum, or even Wilfrid Voynich.

Month names have been written in a later hand in the central drawings of each of the zodiac pages: fol. 70v2,1, fol. 71, fol. 72 and fol. 73.

The language, which is certainly a Romance language or dialect, has been much debated, including suggestions of Spanish, Occitan and French. The most convincing argument is presented at this web page and thus the language would appear to be Northern French. This is further confirmed by the appearance of very similar month names on an astrolabe that originates from Northern France (16). Following is a table of the readings of the month names in the Voynich MS zodiac pages:

| Folio | Sign | Month name |
|---|---|---|
| f70v2 | Pisces | mars |
| f70v1 | Aries | aberil |
| f71r | Aries | aberil |
| f71v | Taurus | may |
| f72r1 | Taurus | may |
| f72r2 | Gemini | jong |
| f72r3 | Cancer | iollet |
| f72v3 | Leo | augst |
| f72v2 | Virgo | septe(m)b(r) |
| f72v1 | Libra | octe(m)bre |
| f73r | Scorpius | nove(m)bre |
| f73v | Sagittarius | decebre |

On some of the earlier herbal pages in the beginning of the MS we find individual letters, and in one case the word 'rot', in or near the leaves, flowers and stems of plants. These appear to be colour annotations, and were already discussed on another page, in the context of the origin of the MS. A detailed list of these annotations has been added to the table at the bottom of this page with the key word 'COL'.

Other 'plain text' writing

On f1r we find the faded or erased ex libris of Jacobus de Tepenec. This is discussed in more detail on a dedicated page.

In addition, in the right margin we find faded or erased character tables, which are presumably a decryption attempt by a later owner.

Mixed writings in or near margins

On f17r we find a largely unreadable comment near the top margin. Nick Pelling was the first to observe that this also includes one or two words in the Voynich script, barely visible in normal light, but clear under UV illumination. They seem to read: oteeeon oiil but in particular the second word is not clear. See also Pelling (2006) p. 164, where he reads oteeeol aim (17).

In the left margins of f49v and f76r we find detached sequences of characters, in the former case accompanied by the Arabic numerals 1-5.

On f66r we find both characters and short words in the left margin. Near the bottom of the same page we find some apparently German words, which have been partially amended, near the dead body of a man or woman, and some other objects. This was first interpreted by R. Salomon as 'der Mussdel'. I would propose the reading 'Musmel'. Neither is generally accepted. There is also some writing in the Voynich script, this time clearly visible.

The text on f116v is a short paragraph of text including what appears to be German, Latin and two words in the Voynich script. An additional line is slightly offset above this, in the top margin. It is perhaps the most debated text in the entire MS.

Other

On f66v and f86v3 we find strange, unreadable yet similar scribbles of which it is hard to say whether they represent readable text or not. They are usually referred to as chicken scratches. These two have been added to the table below with the key word 'SCR'. More chicken scratches may be observed above the roots of fol. 43r, but these really look like scratches, so are not counted.

The following table combines all types of additional or extraneous writing into a single overview. This includes observations brought together by many different observers. As already indicated above, the sequence(s) of characters integrated in the circular design of f57v is not included here.

| Fol. | Type | Lang | What / where | Comment |
|---|---|---|---|---|
| 1r | SEQ | V+L | Three vertical sequences of single characters in the right margin, faded or erased | Possibly a decryption attempt by a later owner |
| 1r | WRI | L | The ex libris of Jacobus de Tepenec, followed by the number 19 (TBC), faded or erased, in the bottom margin | Discussed on a dedicated page |
| 1v | COL | L | A single 'g' under the paint of the second leftmost green leaf | The suggestion is that this indicates that it should be painted green. |
| 1v | COL | L | A less clear single letter, possibly a 'J', inside the lowest yellow leaf right of the stem | It is not clear that this is ink under the paint. It is hard to identify this as a colour indication. |
| 2r | COL | V | Voynich text ( ios an on ?) under green paint of bottom right leaf (middle petal) | The suggestion is that this could also be a painting instruction |
| 2v | COL | L | Cursive 'fo' or 'fa' in dark ink just above the first word of the second paragraph. | Similar to 'fo' (meaning folio) annotations on all pages of the alchemical herbal Firenze MS 106. |
| 4r | LET | L | A capital F in the rightmost flower | |
| 4r | COL | L | The word 'rot' (German for red) written vertically in the stem of the plant (near the bottom) | |
| 7r | COL | L | Characters probably forming the word 'rot' (German for red) under the paint in the left half of the root. | Could be an additional colour annotation similar to f4r |
| 9v | COL | L | Several characters in the top left flower under the blue paint: 'por' in the top petal, 'p' in the lower left petal and 'r' in the lower right petal | Readings are tentative. These may also be colour annotations. |
| 9v | COL | L | An unclear letter or scribble between the top two petals of the top left flower | Meaning not clear. |
| 9v | COL | L | A single 'g' to the right of the top right flower | Seems to be another colour annotation. |
| 11v | MAR | Nr | The nr. 88 (or Voynichese dd ?) close to the edge of the page, at the height of the second last line. | Seems to refer to the Voynich character sequence dd found on that line. |
| 17r | MAR | L+V | Small writing in the top margin of the page. | Under UV illumination, one or two words in Voynich writing, largely faded, can be seen at the end of this fragment. |
| 20r | COL | L | A 'p' (or possibly an 'r') at the top of the root system | Could be another colour annotation. |
| 28v | SCR | ? | Some apparent symbols in the middle of the flower | If this is writing, the script has not yet been identified. |
| 29r | COL | L | An 'r' in the higher part of the root system | Could be another colour annotation. |
| 32r | COL | L | A 'p' and what looks like a 'v' or an 'r' in the bottom right flower. | There could be another character after the 'p'. These could be more colour annotations. |
| 39v | LET | L | A capital B in the white space between the two green parts of the bottom right leaf. | |
| 49v | SEQ | V+L | A vertical sequence of single characters in Voynich script in the left margin, aligned with and just before each line of writing. In addition, the numbers 1 to 5 to the left of the top five characters. | Both may have been added after the main text was added. |
| 57v | OTH | ? | A symbol in the lower right corner of the page | Tentative reading: 17 (-mus) |
| 66r | SEQ | V | Sequences of words and characters in the left margin of the page | |
| 66r | SEQ | L+V | Writing near the lower left corner of the page, with a reclining person and some other small objects | The well-known 'musdel' reference |
| 66v | SCR | ? | Unintelligible scribbles left of the root of the plant | Similar to those on f86v3 |
| 67r2 | FOL | L | A pencilled 'b' in the upper right corner of the page | |
| 68r1 | FOL | L | A pencilled 'a' in the upper left corner of the page | |
| 68r2 | FOL | L | A pencilled 'b' in the upper left corner of the page | |
| 68r3 | FOL | L | A pencilled 'c' in the upper left corner of the page | |
| 70r1 | FOL | L | A pencilled 'a' in the upper left corner of the page | |
| 70r2 | FOL | L | A pencilled 'b' in the upper left corner of the page | |
| 70v2 | MON | L | The word 'mars' near the Pisces emblem | |
| 70v1 | MON | L | The word 'aberil' near the Aries emblem | |
| 71r | MON | L | The word 'aberil' near the Aries emblem | |
| 71v | MON | L | The word 'may' near the Taurus emblem | |
| 72r1 | MON | L | The word 'may' near the Taurus emblem | |
| 72r2 | MON | L | The word 'jong' near the Gemini emblem | This is the strongest evidence that the language is northern French |
| 72r3 | MON | L | The word 'iollet' near the Cancer emblem | |
| 72v3 | MON | L | The word 'augst' near the Leo emblem | |
| 72v2 | MON | L | The word 'septe(m)b(r)' near the Virgo emblem | |
| 72v1 | MON | L | The word 'octe(m)bre' near the Libra emblem | |
| 73r | MON | L | The word 'nove(m)bre' near the Scorpius emblem | |
| 73v | MON | L | The word 'decembre' near the Sagittarius emblem | |
| 76r | SEQ | V+L | A vertical sequence of single characters in Voynich script in the left margin, aligned with some of the lines of writing. | |
| 86v3 | SCR | ? | Unintelligible scribbles in the middle of the page | Similar to those on f66v |
| 99v | COL | V? | What looks like Voynichese qo in the fourth root of the third row | If this is writing, it is very small |
| 116v | MAR | L+V | Recipe or spell, in a mixture of pseudo-Latin and German, with two words in Voynich script. | Additional marginal drawings. |

Notes