An old-school encyclopedic dictionary such as Webster’s Second from 1934 was the fact-filled Internet of its day. There were color plates for flags and orchids, photographs of naval vessels, anatomical diagrams, census figures, and a gazetteer. There were entries for Eiffel Tower and Bridge of Sighs (including the poem with that title) and a “Pronouncing Biographical Dictionary” for notables living and dead, from King Ethelwulf to Postmaster General James Aloysius Farley. At 3,350 pages, it was probably the largest mass-produced book in the history of American publishing, and its heft underscored its function: it was an answer book.

It was also overfull with typographical and symbolic complexities. Double vertical lines before a headword indicated a “Foreign Word.” Labels such as Obs. or Rare would appear before a definition if there were several senses, or after a definition for single-sense entries. Other labels—such as incorrect, improper, and erroneous, or humorous, jocular, and ludicrous—could be difficult to distinguish from each other. Entries for odd variants or extremely rare words were set in tiny “pearl” size at the bottom of each page. And many definitions were written almost like essays.

When it came time to revise the Second, editor Philip Gove famously streamlined the apparatus of the dictionary. He reduced the number of labels and required that those that remained always appear in exactly the same part of an entry. He integrated the “pearl” section into the main A–Z sequence. He was innovative and rigorous in his use of punctuation. For example, he borrowed from the field of symbolic logic the boldface colon that appeared before every definition, which he called the “symbolic colon.” Its description in the Explanatory Notes of the Third sounds like that of a tag in a markup language such as XML: “It indicates that the supporting orientation immediately after the main entry is over and thus facilitates a visual jumping from word to definition.” And he instituted the single-statement defining style, making each definition a single grammatical unit that could (at least theoretically and ideally) replace the headword in a sentence.

The explanation of defining in his immense “Instructions to Editors” for the Third (a set of three-ring binders, known in-house as the “black books,” that contained the complete style guide for the dictionary) makes his debt to formal logic explicit:

The first major machine-readable dictionary was created in the 1970s, as part of a project funded by the National Science Foundation to develop natural-language processing for intelligence and educational uses. The dictionary chosen was the Gove-edited Seventh Collegiate, the first abridgment of Webster’s Third. Robert Amsler, a member of the team that did the work at System Development Corporation (SDC also developed an early mainframe computer for DARPA and is considered the first software company), told me that he found the text of Collegiate definitions to be a nearly exact subset of that in Webster’s Third. This might seem logical and unremarkable to most people, but in practice abridgments almost always end up differing significantly from their sources. In this case, Amsler said, “it would have been possible to generate the Collegiate from the Unabridged by putting in brackets and having a computer program pick out the bracketed text to create the Collegiate.”
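To make Amsler's remark concrete, here is a minimal Python sketch of the idea: mark the retained passages with brackets and let a program pick them out. The square-bracket convention, the function name `extract_bracketed`, and the sample text are all invented for illustration; none of this reflects how the SDC team actually worked.

```python
import re

def extract_bracketed(text: str) -> str:
    """Return only the passages enclosed in square brackets, joined by spaces.

    Assumes brackets are not nested (a simplification for this sketch).
    """
    parts = re.findall(r"\[([^\[\]]*)\]", text)
    return " ".join(p.strip() for p in parts if p.strip())

# Invented sample text, not a real dictionary definition.
unabridged = (
    "[a small horse] of any of several breeds, "
    "[usually under 14.2 hands] at the withers"
)

# The "abridgment" is simply whatever was bracketed in the longer text.
print(extract_bracketed(unabridged))
```

The point of the sketch is only that an abridgment which is a strict subset of its source can be produced mechanically, which is what made Amsler's observation notable.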

Not until 1989, almost thirty years after the publication of the Third, would Merriam-Webster begin digitizing the dictionary itself. Scans of historic dictionaries can be valuable for scholars but don’t have the search or indexing functions that allow for updating entries as needed and isolating online lookups for everyday use. And optical character recognition (OCR) technology introduces too many errors to be efficient; even a 99.9% accuracy rate would result in almost 100,000 typos in the text of Webster’s Third. As a consequence, the entire text had to be electronically keyboarded, one character at a time. For increased accuracy, each page was keyed by two different typists in order to identify and correct discrepancies. Some of the results were proofread by Merriam-Webster editors, and further corrections were made to the resulting digital file as they were found.
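The double-keying check described above can be sketched in a few lines of Python. This is only an illustration of the principle, using the standard library's `difflib`; the function name `find_discrepancies` and the sample strings are hypothetical, and nothing here describes the tools that were actually used.

```python
from difflib import SequenceMatcher

def find_discrepancies(first: str, second: str):
    """Compare two independently keyed versions of the same text.

    Returns a list of (position_in_first, text_in_first, text_in_second)
    for every stretch where the two versions disagree.
    """
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    return [
        (i1, first[i1:i2], second[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Invented example: the second typist hit "1" instead of "l".
typist_a = "symbolic colon"
typist_b = "symbolic co1on"

for position, a_text, b_text in find_discrepancies(typist_a, typist_b):
    print(f"position {position}: {a_text!r} vs {b_text!r}")
```

Two typists are unlikely to make the same mistake in the same place, so any disagreement flags a spot for a human to check, while agreement is taken as probably correct.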

But to make the book fully functional and searchable, a system of tags had to be devised. The task of analyzing the text and identifying a structure of data elements—essentially, of translating Gove’s style structure into SGML—fell to editor John Morse. He marveled at the rigor and logic of the result: a book with 2,662 pages, 465,000 entries, and some 60 million individual characters whose structure was reducible to eighty immutable rules expressed by tags. The flowchart he created expresses what a definition is:

Structure was paramount to Gove. He was a linguist who used the logic of a programmer, and in the 1950s and ’60s he seems to have been thinking about the dictionary with the extreme rigor of a software engineer. Though he could never have imagined search as we know it today, he would have been among the first to intuit its uses for lexicographers. So as work on the Third was winding down he took another step to address a kind of question that only a computer could easily answer: he set the typing staff the new task of creating a 3”x5” slip for virtually every word that appeared in boldface in the dictionary typed backward, each letter followed by a space (and spelled normally, without the extra spaces, below its backward spelling).

Hundreds of thousands of such slips were eventually produced. They occupied a card catalog on the editorial floor at Merriam-Webster until the mid-1990s, when they were transferred to cardboard file boxes and moved to shelves that fill a wall in the building’s basement. The yellowed label on the end of each box is marked “Backward Index.”

The Backward Index evidently turned out to be a useful tool in the pre-electronic age. For example, it could help identify a set of related terms that should ideally be defined in similar ways, including open compounds (Highland pony, Shetland pony, Welsh pony), closed compounds (blocklike, clocklike, rocklike, socklike, chalklike), and morphologically related terms (phytopathological, ethological, lithological, ornithological). Thus, looking up all the diseases that end in –itis or all the doctrines and theories that end in –ism was now possible. Since rhymes depend on word endings, initial research for a rhyming dictionary also made use of the Index, where sequences such as seepy, steepy, weepy, sweepy and dorty, forty, shorty, snorty, porty, sporty, rorty, torty show up regularly. And editor Madeline Novak recalled that, while drafting the section on plurals for Webster’s Standard American Style Guide in the mid-1980s, she identified ending patterns she wanted to cover, used the Backward Index to find examples with those patterns, and then checked the backing in the citation files to find actual plural usage.
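What the slips accomplished by hand is easy to imitate in software. The following Python sketch sorts words by their reversed spelling, so that words sharing an ending sit next to each other, and then looks up a suffix with a binary search. The function names `build_backward_index` and `words_ending_in` are invented for this illustration, and the tiny word list is just a handful of the examples mentioned above.

```python
from bisect import bisect_left

def build_backward_index(words):
    """Sort words by their reversed spelling, as the slips were filed."""
    return sorted((word[::-1], word) for word in set(words))

def words_ending_in(index, suffix):
    """Return all indexed words with the given ending, in backward order."""
    key = suffix[::-1]
    # A one-element tuple sorts just before any (key..., word) pair,
    # so this finds the first slip whose backward spelling starts with key.
    start = bisect_left(index, (key,))
    results = []
    for reversed_word, word in index[start:]:
        if not reversed_word.startswith(key):
            break
        results.append(word)
    return results

# A few of the words mentioned in the text.
sample = [
    "blocklike", "clocklike", "rocklike", "socklike", "chalklike",
    "ethological", "lithological", "ornithological", "phytopathological",
]
index = build_backward_index(sample)

print(words_ending_in(index, "like"))
print(words_ending_in(index, "ological"))
```

Because the list is sorted by backward spelling, every word with a given ending forms one contiguous run, which is exactly why related compounds and rhymes turned up side by side in the file drawers.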

The Backward Index was also useful for those “I-know-there’s-a-word-for-that” moments; a colleague trying to remember the name of a particular phobia found it by looking up the listings at aibohp-. Answering unpredictable questions from the public turned out to be another use of the Index. Responding to the many letters (now mostly e-mails) that we receive every week sometimes presupposes a freakish omniscience of the dictionary. In the pre-digital era, how else could we have ascertained that there are some 500 words in the dictionary that end in –ology? that the third English word ending in –shion, after cushion and fashion, is fushion (a rare variant of foison)? that publicly is the only adverb that now more commonly ends in –cly than in –cally? that there is a third word in English ending in –gry? or that there is such a word as phytolithological?

In fact, phytolithological isn’t entered in Webster’s Third. Using the Index, I noticed that some of the slips are white and some are buff. The buff slips seemed to be much older; they were typed using a capital initial letter (that is, the last letter of the word), and most did not include the word typed forward. It soon became clear that the buff slips derived instead from Webster’s Second, and many of them contained words (phytolithological, predestinary, matutinary, etc.) that would be dropped from the Third. I had always assumed that the Index was Gove’s creation, as a perfect reflection of his orderly character, but it turns out that he was building upon the work of his predecessors. Gove had evidently run across the old file or been alerted to its existence by older editors, and deemed it worthy of a comprehensive update. (In 1976, after he had retired, the Index was again updated with new words from the Third’s addenda.) There are few lexicographers with messy minds.

There are about 315,000 slips in the Backward Index, in 129 file boxes. Words ending in e alone take up 23 boxes. Most of the boldface forms from the Third and some portion of those from the Second are there, and there are duplicates for many words entered in both editions. But boldface idioms and phrases are omitted, as are all capitalized nouns; if the latter were originally included in the earlier set of slips (the Second included many such nouns), they may have been removed at Gove’s direction when he updated the Index (the Third has none). Entries from the tiny-print pearl section of the Second are also omitted. Any additional criteria for inclusion and omission would be difficult to determine without lengthy and dusty investigation.

Remarkably, I have found no company records or memos about the work of the typing pool, or indeed about the official purpose of the Index. And the memories of veteran editors are hazy. I have consulted with staff from the ’50s, ’60s, and ’70s, and no one recalls any specifics about the Index. It’s not mentioned in any of the books or articles about Gove and Webster’s Third. The one point repeated to me by all of the editors and clerical staff who knew and used it is that Gove did not want to lay off staff immediately after the publication of the Third and so devised the index project as make-work during the latter stages of proofreading for Webster’s Third, when editors were not marking enough new citations to keep the typists busy. We may never know more than that.

Most of us are nearly incapable of spelling backward in our language. As a colleague in our electronic publishing department remarked after hearing a description of the Index, “These people wanted a computer.”

Special thanks to Robert Amsler at the National Museum of Language and to Ben Zimmer at Visual Thesaurus for connecting the dots.

Thanks to my colleagues John Morse, Madeline Novak, Michael Belanger, Syd Seale, and E. Ward Gilman for their recollections and ideas.