Language Tree Sheds Light on First Sentence Structures

Most linguistics scholars today agree that many of the world’s modern languages share a common forebear. Indeed, this notion lines up with the sudden appearance of behaviorally modern humans, as geneticists and archaeologists refer to them, some 50,000 years ago. Where experts disagree, however, is on how language has developed since then.

While many linguists posit that the development of sentence structure has been a relatively haphazard process, an unlikely collaboration between a Stanford linguist and a Nobel Prize-winning physicist offers evidence of a more linear evolution of language. Merritt Ruhlen, a lecturer in Anthropology at Stanford, and his longtime collaborator Murray Gell-Mann, a founder and Distinguished Professor of the Santa Fe Institute, have mapped the evolution of word order in a paper titled "The Origin and Evolution of Word Order," published in the Proceedings of the National Academy of Sciences (PNAS).

Drawing on a sample of 2,135 of the world’s known languages, they have created a phylogenetic Language Tree, and through it show that sentence structure in the first modern languages followed a subject-object-verb (SOV) order, still evident in many modern languages such as Japanese. With groundbreaking evidence that points to a continuous evolution of language, the scholars provide new insight into how our ancestors may have communicated.

To appreciate the significance of these findings, it helps first to see word order in action. In a typical English sentence, the verb follows the subject and the object follows the verb. Thus the sentence “The man reads the book” is described as having an SVO (subject-verb-object) order. In Japanese, however, the verb follows the object, which itself follows the subject, producing an SOV word order. For this reason, a word-for-word rendering of the Japanese for “The man reads the book” would look something like “The man the book reads”.
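The contrast can be sketched in a few lines of Python. This is an illustration only: the `arrange` helper and the `sentence` dictionary are names introduced here, not anything taken from the study. The snippet simply lays out the same three constituents in whatever order a word-order label prescribes.

```python
def arrange(constituents, order):
    """Join the constituents in the order given by a label such as 'SVO'."""
    return " ".join(constituents[role] for role in order)

# Hypothetical constituents for the example sentence used in the article.
sentence = {"S": "the man", "V": "reads", "O": "the book"}

english_like = arrange(sentence, "SVO")   # subject, verb, object
japanese_like = arrange(sentence, "SOV")  # subject, object, verb
```

Here `english_like` holds the familiar English arrangement, while `japanese_like` holds the verb-final arrangement described above.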

The work of Ruhlen and Gell-Mann demonstrates that such differences did not occur by chance. Using their Language Tree, the authors show not only that SOV was the original word order but, moreover, that word order evolution has since been a largely unidirectional process whenever it has occurred.

Tying Sentence Structure To Migration Patterns

Modern humans first appeared 200,000 years ago, Ruhlen explains, but their social, technological and linguistic capabilities largely resembled those of their predecessors, the Neanderthals. “It was only about 50,000 years ago that major changes occurred by which these humans became not just anatomically modern, but behaviorally modern.” Along with innovations such as art, fishing and the creation of tools from materials other than rock came the development of fully modern language.

“We can trace the source of all modern languages today to this period,” says Ruhlen. “And it is our argument that the languages of this period were characterized by an SOV word order.” Changes that saw the emergence of other word orders only truly began about 20,000 years ago, says the scholar, as a result of human global migration out of Africa. These migrations allowed the original SOV word order to change into the other five possible word orders at different times in different places.

The tree was constructed using cognates, or words that have common etymological origins. “Cognates are words that are born together,” explains Ruhlen. “For example, the word for hand in both Spanish and Italian is mano, derived from Latin.” The scholars noted the occurrence of each type of word order in the tongues of each language family, finding that while all possible word orders (SOV, SVO, OSV, OVS, VSO and VOS) appeared, over half of the world’s languages are today SOV.
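The count of six possible word orders is simple arithmetic: three constituents can be arranged in 3! = 6 ways. As a quick check, assuming nothing beyond Python's standard library (the variable name `word_orders` is chosen here for illustration):

```python
from itertools import permutations

# Every arrangement of subject (S), object (O) and verb (V).
word_orders = sorted("".join(p) for p in permutations("SOV"))
```

The resulting list contains exactly the six orders named above, in alphabetical order.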

Deciphering how word order has evolved, the researchers found that the direction of change in almost all cases moved away from SOV. In almost no cases has language ever evolved backwards, except in instances of what Ruhlen calls “borrowing.” This, they argue, indicates that the evolution of language has not been a random process.

Similarities Across Disparate Language Groups

That the direction of change has almost always been away from SOV, except for rare cases of diffusion, is best illustrated by the linguistic evolution on New Guinea, where Ruhlen finds that for over 30,000 years of human habitation, almost all 700 of the island’s languages have maintained an SOV word order. The only exceptions, the scholars find, occur in coastal regions, where SVO word order has in some languages replaced SOV as a result of diffusion by contact with Austronesian languages.

“If language word order really does change back and forth all the time for no reason, word order on New Guinea should be all over the place,” says Ruhlen. “But it’s not.” Taking Indo-European languages as an example, Ruhlen demonstrates how, despite the occurrence of three word order possibilities in the branch’s modern languages, all are derived from an original SOV word order.

SOV, he shows, is common in the languages of the Indian subcontinent, while SVO appears in European languages such as English and Spanish. VSO, meanwhile, is rarer, appearing only in Celtic. As he traces these languages up to the nearest out-group, Anatolian, he finds that Anatolian is strictly SOV, which indicates that Indo-European too was originally SOV and that SVO and VSO are innovations within the Indo-European family.
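The reasoning here resembles a simple out-group comparison: if the state found in the out-group also occurs inside the group, that state is taken to be the ancestral one. The sketch below is a toy version of that heuristic, not the authors' actual procedure. The function name and the data layout are assumptions made for illustration, and the word orders are those given in the paragraph above.

```python
def ancestral_by_outgroup(ingroup_orders, outgroup_order):
    """Toy out-group comparison: return the out-group's word order if it also
    occurs within the group; otherwise return None (no conclusion drawn)."""
    return outgroup_order if outgroup_order in ingroup_orders else None

# Word orders as described in the text (illustrative grouping).
indo_european = {
    "Indian subcontinent": "SOV",
    "English": "SVO",
    "Spanish": "SVO",
    "Celtic": "VSO",
}
anatolian = "SOV"  # the out-group, described as strictly SOV

observed = set(indo_european.values())
ancestral = ancestral_by_outgroup(observed, anatolian)
innovations = sorted(observed - {ancestral})
```

Under these inputs, `ancestral` is SOV and `innovations` lists the two orders that the article describes as later developments within the family.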

“Going up the tree, you find that SOV appears at the earlier stages of all of these language families.” Similarly, the Afro-Asiatic family shows occurrences of the same three word order possibilities. “But as you work up the tree, you see the branches become strictly SOV,” says Ruhlen. For example, North Afro-Asiatic (VSO) and Chadic (SVO) languages both derive from Chado-Afro-Asiatic (SVO), which along with Cushitic (SOV) is derived from the Erythraic (SOV) language branch. This branch, along with the Omotic (SOV) branch, was derived from the earlier Afro-Asiatic language (SOV).
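The Afro-Asiatic branching just described can be written down as a small tree and checked edge by edge. The snippet below is only a sketch: the dictionary layout and variable names are assumptions, while the branch names and word orders are copied from the paragraph above. It lists every parent-to-child step where the word order changes and checks whether any step returns to SOV.

```python
# Word order of each branch, exactly as stated in the text.
order = {
    "Afro-Asiatic": "SOV",
    "Omotic": "SOV",
    "Erythraic": "SOV",
    "Cushitic": "SOV",
    "Chado-Afro-Asiatic": "SVO",
    "North Afro-Asiatic": "VSO",
    "Chadic": "SVO",
}

# Child -> parent links, exactly as stated in the text.
parent = {
    "Omotic": "Afro-Asiatic",
    "Erythraic": "Afro-Asiatic",
    "Cushitic": "Erythraic",
    "Chado-Afro-Asiatic": "Erythraic",
    "North Afro-Asiatic": "Chado-Afro-Asiatic",
    "Chadic": "Chado-Afro-Asiatic",
}

# Collect (old order, new order, branch) for every edge where the order changes.
changes = [
    (order[p], order[c], c) for c, p in parent.items() if order[p] != order[c]
]

# Any change whose result is SOV would count as "evolving backwards".
returns_to_sov = [change for change in changes if change[1] == "SOV"]
```

On this small tree the only changes are SOV to SVO and SVO to VSO, and `returns_to_sov` is empty, which matches the one-directional pattern Ruhlen describes.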

Mapping Human Development

Ruhlen and Gell-Mann drew inspiration for their work from the late celebrated Stanford linguist Joseph Greenberg, whom Ruhlen cites as a major influence and mentor over their 30 years as colleagues. In a 1963 publication that remains one of the most cited works dealing with language classification, Greenberg showed that the languages of Africa, over 1,000 in number, could be classified into just four language families.

Also in 1963, Greenberg published an article in which he surveyed “the order of meaningful elements” in language, and it was in this paper that he first discussed word order. The paper by Gell-Mann and Ruhlen is an extension of this work. Gell-Mann and Ruhlen first met at a workshop on the evolution of language that Gell-Mann had organized at the Santa Fe Institute in 1987. Ten years later, they organized another workshop on language classification, where they presented their paper on the origin and evolution of word order. Fourteen years after that, their groundbreaking scholarship on the evolution of word order was finally published.

Today, Ruhlen – who serves as a lecturer in Stanford’s Anthropological Sciences program – is seeing another of his projects taken up by other departments. The database used to study the evolution of word order in the present paper is now being analyzed in Marcus Feldman’s lab in the biology department to study the phylogenetic evolution of other traits such as consonants, vowels, pronouns and other aspects of word order.

“This project is being worked on very heavily right now. In theory, there is no connection between the phylogenetic evolution of consonants, vowels, pronouns, word order, etc.,” says Ruhlen. “But it turns out they are often highly alike. If a correlation is proven, this will not only provide further proof of Greenberg’s theory but will show that phylogenetic trees are the most accurate way to map human development.”

By Kareem Yasin