Riddle me this: what does a stubbed toe have to do with the call of a raven or the origins of human speech? Perhaps more than you’d expect!

If you’re hopelessly clumsy or easily surprised, you’re probably very familiar with those visceral words that come so naturally when life puts unexpected obstacles in your path, like “ow,” “ouch,” “oops,” “oh,” “wow,” “ewww,” or “yuck!” These words—well, are they even real words, or just vocal cries for help?—are known to linguists as interjections.

Alas! Interjections, particularly primary emotive interjections, have historically often been seen as the sad, unwanted step-child in the study of language. As spontaneous emotional outbursts, these unchancy communicative creatures are often dismissed as non-productive, standalone utterances—nasty, brutish, and short. For some linguists, they contribute little of interest to a language’s sentences and syntax (the bread and butter of modern linguistics).

They’ve been described rather harshly by Latin grammarians (as well as more contemporary researchers) as “non-words” and by nineteenth-century linguists as para-linguistic… or barely even linguistic at all. In 1869 Gesch reputedly said “interjection is the negation of language.” According to Sapir, interjections were “never more, at best, than a decorative edging to the ample, complex fabric [of language]” and Müller in 1861 said that “language begins where interjections end.” By signaling our more basic human emotional and mental states, they often seem more like vocal gestures than real language.

Language is the cognitive faculty that separates humans from other animals, but interjections have often been equated with the primitive cries of animals. We might all know that a dog goes woof and a cat goes meow (even if we still don’t know what a fox says), so what does a human say? Are interjections then the instinctual sounds that a human can’t help but make when dealing with all their feelings? As another philological wag, Horne Tooke, noted:

The dominion of speech is erected upon the downfall of interjections. […] The neighing of a horse, the lowing of a cow, the barking of a dog, the purring of a cat, sneezing, coughing, groaning, shrieking, and every other involuntary convulsion with oral sound, have almost as good a title to be called parts of speech, as interjections have. Voluntary interjections are only employed where the suddenness and vehemence of some affection or passion […] makes [men] for a moment forget the use of speech.

So why bother with interjections at all, if people are just stumbling about in the linguistic dark, sounding their barbaric human yawps? What can they tell us about language, if they themselves have historically been regarded as not language?

Well, Müller and friends might have been onto something. Say interjections are supposed to be close to what humans might typically sound like when no one’s watching. What if language really did begin where interjections end? Could emotive interjections and animal cries offer some insight into the mysterious origins of human speech? Why did humans even begin talking when, for example, gesturing might work just as well for communication (as shown by the existence of many sign languages)? Could humans have evolved the language faculty by mimicking the cries of animals, themselves included? And if humans could do this, what’s preventing animals, particularly other primates like chimpanzees, from doing it too?

All these questions (and more!) that revolve around the origins of human speech are not so simple to answer, much less agree on, so, not surprisingly, the puzzle has been dubbed by some “the hardest problem in science.”

Now if there’s one thing linguists avoid more than discussing interjections, it’s pure speculation about the murky origins of how language began, according to Adam Kendon. No good can come of this! said the Société de Linguistique de Paris when it formed in 1865. In its by-laws, since it was near impossible to validate the many speculative theories on offer, some wilder than others, it expressly banned any of this loose talk about the beginnings of language:

The Society of Linguistics has as its object the study of languages, and of legends, traditions, customs and documents which could clarify ethnographic science. All other subjects are rigorously forbidden… The Society will accept no communication dealing with either the origin of language or the creation of a universal language.

In more modern times, it was Noam Chomsky’s distinctly non-Darwinian though influential theories on the “surprisingly perfect” innate language faculty, as if designed from the drafting table of “a divine architect,” that stuck a knife into this problem. It effectively bypassed any serious discussion of the origins of human speech for some time, by deftly ignoring it. Chomsky theorized that perhaps there was a “Great Leap” to language “that was effectively instantaneous, in a single individual, who was instantly endowed with intellectual capacities far superior to those of others, transmitted to offspring and coming to predominate.” The end. For this much relief, much thanks. The rest is, if not silence, at least all speculation, allowing linguists to go on to solve other puzzles somewhat unburdened by the noise of actual human culture and communication.

Many theories have been broached, including Müller’s comically labelled “bow-wow” (language evolved by humans mimicking animals) and “pooh-pooh” (language evolved from interjections) theories, but few theories can really be validated. Along with similar early theories such as the “yo-he-ho” theory (language came out of collaborative, rhythmic labour) they’ve likewise been discounted as overly simplistic in scope (as well as ridiculous in name). It’s true that going so far back to the beginnings of “fossil poetry” as Emerson described language means we have little recorded evidence to go on of how early humans or even Neanderthals might have first mumbled to each other, the technology of writing only going back about 5,400 years. But in each of these possible answers there lies a kernel of truth or insight that may allow us a better understanding of the forces behind the language instinct, what sets humans apart from other animals, whether genetic, cognitive, social or culturally motivated. As such it seems an exercise worth considering.

And it all might start with some kind of grunt, just like an interjection.

The well-travelled language fancier might now be spluttering madly, because if “ouch” is such a natural response cry for humans, why do other languages use a variety of other speech sounds for the same thing? Why, for instance, do the French say “aïe!” and “ouille!” while the Japanese say “itai!”? For that matter, why do roosters crow “cocorico!” in French instead of the sillier “cock-a-doodle-doo!” of English? At the same time, many languages might use the interjection “eh” but in semantically different ways (looking at you, Canadians), while “huh” has been so widely used with essentially the same meaning that it’s been proposed as a universal word.

These onomatopoeic words are clearly symbolic, conventionalized signals and not the actual sounds that humans might instinctively make in certain emotional states. In essence, they are not quite “honest” signals. A real, involuntary laugh can be a world apart from the symbolic “ha ha!” just as a good cry or a loud cough might sound completely different from their phonetic interjections. Laughing, crying, coughing, screaming—these are all sounds we can make that reliably communicate our emotional or mental states to others without being in the realm of language. These sounds, along with body language, might be difficult to control and so are crucially difficult to “fake,” just as a cat or a dog might purr or bark. Meanwhile, words are easy to manipulate. So for us to believe interjections, we have to believe and trust that a speaker is telling the truth.

A baby crying might signal something urgent, such as pain, discomfort or even danger to its mother. For animals, that cry might be treated as an honest and reliable signal, which can be trusted, taken exactly for what it is every time. For example, Kenyan vervet monkeys have a relatively advanced set of alarms that not only alert others to a danger but can even warn them of the kind of predator, such as a snake or an eagle. A snake call might have all the monkeys looking down on the ground for the danger, as opposed to the air. While this might seem like a nascent language in the making, alarm calls are never made when there isn’t an actual predator nearby, so those calls are not the same as a word signifying a concept.

Amotz Zahavi suggests that all signals evolve to become reliable in this way, with the remarkable exception of human speech. Why did these calls become symbolic, stylized, and conventional signals for humans, decoupled and displaced from their referents, in a way they did not for other animals such as our close relatives, the chimpanzees or animals with a larger acoustic range, such as birds?

There are certainly anatomical and neurological bases for how human speech evolved to be more efficient at conveying information than, say, gesture. But to put it bluntly, human language also developed the critical ability to play fast and loose with the truth—to lie. Language developed such things as displaced concepts (such as talking about something that isn’t there), metaphors (evocative analogies between two concepts), institutional facts (things that are only facts because everyone conspires to believe in them, even if it involves group denial of brute facts) and the like. As Chris Knight puts it “birth, sex and death are facts anyway, irrespective of what people think or believe. These, then, are brute facts. Phenomena such as legitimacy, marriage and inheritance, however, are facts only if people believe in them.” Somehow, humans are predisposed to produce these conventional signals and have them be decoded and ultimately trusted by others, despite the fact that it’s easy to fake or cheat.

Deception, such as tricking others out of a food source, is not unknown in animals. But even if certain species develop effective communication systems that have some similarities with an early evolving language, animal languages never do arise. Why?

For some researchers, an intriguing possibility is that the complexities and richness of language arose out of a social motivation to cooperate within the community. Conventional signals that are easy to fake are only worthwhile if you can trust the speaker, because you belong to a group that shares the same goals and intentions and presumably wants to convey truthful information (even if that information is literally false, such as in a metaphor!). In a symbolic culture, some signals may have developed through ritual and even shared playfulness, imagination and games, enhancing group membership and trust.

By contrast, according to Kendon, animals like chimpanzees are naturally competitive rather than cooperative. When they appear to contribute to joint activities, it really is each individual animal taking advantage of work done by others (such as building a pile of boxes to be the first to get to a bunch of bananas). Chimpanzees who deceive others (such as running in the wrong direction to a known food source) may in turn be deceived by exactly the same kind of signals, because they are predisposed to accept their signals as reliable. Cooperative signals for the benefit of other chimpanzees not only would not happen, but would not be easily understood if they did, much less be trusted. It is their very competitive social lives, in which signals are valued only if they are reliable and trustworthy (such as body language), that block them from developing any language faculty further, simply because they don’t need it.

The social lives of animals may be as rich as those of humans, but the evolution of human speech uniquely allows us not just a more complex way to deceive, but a way to share richer, evocative stories, and build social cooperation. That’s surely worth an “ouch,” “aïe,” or “itai!”