Linguists know a huge amount about the historical changes that have shaped the English we speak today, but there are still plenty of questions to be answered. In some cases, new tools that linguists stole from biologists are letting us ask questions that we haven't been able to address before.

A paper in Nature this week shows that randomness has an important influence on how language changes over time—in much the same way as random genetic mutation plays a central role in biological evolution. And by borrowing tools from biology, the researchers point to some examples of historical change in English that are best explained by random processes.

Random drift or biased brains?

The parallels between biological evolution and cultural evolution are not always exact, but there are some pretty robust similarities. Just as new mutations appear in genomes, new forms appear in language. As with genes, some of those new forms become more prevalent over time. If a mutated gene is beneficial, natural selection tends to make it more common; if a new linguistic form is preferred for some reason, cultural selection makes it more popular.

But genes can sometimes also become widespread just through random chance, a phenomenon called genetic drift. Does the same apply to linguistic forms? Biologists Mitchell Newberry and Joshua Plotkin teamed up with linguists Christopher Ahern and Robin Clark to work out how techniques from biology could be adapted to linguistics.

As test cases, they used well-known examples of changes in English. First, they looked at past-tense verbs. In most cases, we form the simple past tense by adding -ed onto the end of a word, like type → typed or like → liked. But there are some irregular cases, like write → wrote and sleep → slept. And then there are the in-betweens: do you say sneaked or snuck? Spilled or spilt?

The researchers searched a linguistic corpus, which is a gigantic database of real-world language use, to track verbs that have two possible past-tense forms. They found 36 of them, which collectively popped up more than 700,000 times in the corpus. For each of the verbs, they tested statistically whether the pattern of change over the last 200 years looked more like random chance or like the result of selection by people who were biased toward one form or the other.
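To get a feel for the difference between the two explanations, here is a minimal Python sketch of a toy population model in the spirit of the ones biologists use for genetic drift. It is an illustration only, not the researchers' actual statistical test: the function name, the parameters, and the numbers are all made up for the example. Each "generation," every speaker picks a past-tense form by copying the previous generation. A bias of 0 means pure drift, and a positive bias means speakers favor the new form.

```python
import random


def simulate_form_frequency(start_freq, population, generations, bias=0.0, seed=None):
    """Track the share of speakers using a new form over time.

    Hypothetical toy model: bias=0.0 is pure random drift; bias > 0 means
    the new form is copied slightly more often (selection). Assumes bias > -1.
    """
    rng = random.Random(seed)
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Weight the new form by (1 + bias) relative to the old form.
        weighted = freq * (1 + bias)
        p_new = weighted / (weighted + (1 - freq))
        # Each speaker independently adopts the new form with probability p_new.
        count = sum(1 for _ in range(population) if rng.random() < p_new)
        freq = count / population
        history.append(freq)
    return history


if __name__ == "__main__":
    drift = simulate_form_frequency(0.5, population=200, generations=100, seed=1)
    selected = simulate_form_frequency(0.5, population=200, generations=100, bias=0.05, seed=1)
    print("drift only, final share:", drift[-1])
    print("with bias,  final share:", selected[-1])
```

Run with different seeds, the drift-only version wanders up in some runs and down in others, even though no one prefers either form. That is why a rise in a form's popularity is not, on its own, evidence that speakers favored it.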

For most of the 36 verbs, there was no clear preference for either the regular or irregular form; the fluctuation could be explained by randomness. The remaining six showed evidence of selection. You might expect that the regular form would become more popular over time, and a lot of linguistic theory would agree with you—but surprisingly, this was the case for only two of the six words: “smelled” and “weaved.”

The other four words all showed evidence of the irregular forms being selected (“snuck,” “dove,” “lit,” and “woke”). It’s possible, the researchers write, that rhyming words could explain this: when they looked at the rise of “dove” as the preferred form, they found that it “coincide[d] with a marked increase in the use of the irregular verb drive/drove in the corpus, associated with the invention of cars in the 20th century.” So people became more used to one irregular form and then transferred that pattern onto a rhyming verb.

Do you even English?

Another case the researchers explored was the rise of “do-support” in English. Where once an English speaker could have used a word order like “Drink you wine?” or “I drink not,” present-day English requires a “do” in there: “Do you drink wine?” and “I do not drink.” The change took a good few centuries to complete, so the researchers used a corpus of historical English that covers the last 900 years.

What they found was also a surprise: do-support showed signs of random drift in some kinds of sentences but selection in others. The corpus data suggested that do-support in questions became more frequent through random chance—but once it was frequent in questions, people copied the syntactic pattern into other types of sentences, which then showed evidence of selection.

The particular findings from English create a fascinating test case for the paper’s larger point, which is that this method can be used to better understand the processes that contribute to language change over time. And just as this paper has turned up some delightful oddities in the history of English, applying the method to other languages should find other interesting nuggets—and help to nudge our understanding forward at the same time.

Nature, 2017. DOI: 10.1038/nature24455