When we talk to one another, we take turns. This simple rule seems to apply to all human conversation, whether the speakers are English city-dwellers or Namibian hunter-gatherers. One person speaks at a time and, barring the occasional interruption, we wait for our partner to finish before grabbing the conch. Timing is everything: cutting someone off is rude; leaving pregnant pauses is awkward. You need to leave a Goldilocks gap—something just right.

There are variations, certainly. New Yorkers are reputedly fond of “simultaneous speech” while Nordic cultures apparently love long, lingering pauses. But when Tanya Stivers analysed turn-taking across varied cultures, she found more similarities than differences. As I wrote in 2009:

“Stivers [collected] video recordings of conversations in ten different languages from five continents – from English to Korean, and from Tzeltal (a Mayan language spoken in Mexico) to Yélî Dnye (a language of just 4,000 speakers used in Papua New Guinea). She found that… in all ten cultures, speakers shoot for as little silence as possible without speaking over each other, and the majority of answers follow questions after virtually no delay or overlap. The average delays certainly varied from language to language, but [the] extremes were only a quarter of a second off from the international average.”

The universal nature of turn-taking fascinated Asif Ghazanfar, a psychologist at Princeton University who studies monkey behaviour. “Taking turns acts as the foundation for more sophisticated forms of communication. You can’t share information if you’re constantly chattering over each other,” he says. “So how does that evolve?”

Our close relatives—the other great apes—provide few clues. They don’t actually vocalise very much and when they do, there’s no evidence that they take turns. So, Ghazanfar turned to another primate—the common marmoset, a tiny monkey that looks not unlike Back to the Future’s Doc Brown.

Although marmosets aren’t especially sophisticated communicators, they do regularly call to one another. Together with Daniel Takahashi and Darshana Narayanan, Ghazanfar placed 27 pairs of common marmosets in opposite corners of a room, separated by an opaque curtain. Both monkeys called out, and although the pace of their exchanges was much slower than a human conversation, the team saw similarities in their rhythms.

For a start, they rarely interrupted one another. Each one waited for about 5 to 6 seconds after its partner finished before sounding off itself. The partners also ‘conversed’ with a steady rhythm, technically known as coupled oscillation. Both monkeys left a predictable interval between their calls, and their vocals slotted neatly into the silences created by their partner. And to confirm that they really are coordinated, the team showed that if one partner sped up or slowed down, the other followed suit.

“That’s what we do in conversation all the time,” says Ghazanfar. “If you speak to someone who’s speaking fast, you’ll start doing it too. We’re reporting the same for marmosets.”

Of course, there’s more to human turn-taking than that. We use sophisticated tricks to work out when it’s our turn to speak. We pay attention to grammar, meaning, inflection, body language and eye contact, and there’s no evidence that the marmosets are doing any of that. But nonetheless, the results are very similar—a coordinated vocal see-saw.

The marmosets also behaved in the same way whether they were paired with familiar cagemates or complete strangers. That’s another feature they share with humans, and it sets them apart from, say, duetting birds, which only coordinate their vocals under very specific circumstances. “That’s not the case here,” says Ghazanfar. “One marmoset could have a conversation with any other marmoset.”

Common marmosets. Credit: Dario Sanchez

But Margaret Wilson, a psychologist from the University of California, Santa Cruz who studies turn-taking in both humans and animals, is not convinced. “The paper doesn’t demonstrate turn-taking in any interesting sense,” she says. “I think the authors have failed to appreciate just how weird human turn-taking is.” Wilson explains the weirdness beautifully, so I’m going to yield the floor without interruptions:

“When humans take turns, there is a cyclic structure to the extremely short gaps between speakers’ utterances. A between-turn gap of, say, 200 milliseconds is more likely to be broken by the second speaker at certain regular intervals (say, odd multiples of 50 ms) than during the “troughs” between those intervals. That is, short silences are not of arbitrary length, but reflect a cyclic passing back and forth of who has the “right” to speak next. The troughs represent moments when the right to speak has shifted back to the original speaker, hence the second speaker inhibits speech during those fractions of a second. And this is happening at the order of tens of milliseconds. This “structured silence” can only be explained by extremely tight coupling of some oscillatory mechanism in the brains of the two speakers.”

And there’s no hint of that complexity in the marmosets’ exchanges. The cycles in their conversations span their actual calls rather than just the gaps between them. That just means their timing’s not completely random. And with long silences between turns, they could just have been calling and then responding to their partner’s call, as many other animals do.

But if there’s one thing Wilson agrees with, it’s that the question’s worth asking. “Turn-taking is fundamental to human conversation, so the question of whether it occurs in other social animals is extremely interesting,” she says.

Consider that humans talk so much more than other apes. Many scientists have suggested that this vocal sophistication is rooted in manual gestures—the arm and hand movements that chimps and gorillas use a lot. These became increasingly complex and eventually, the brain circuits for gestures got glommed onto vocals. But Ghazanfar isn’t a fan of this idea. “They have to come up with some magical thing that switches from manual to speech,” he says.

If marmosets take turns, that points to a different hypothesis. Like humans, they are cooperative breeders. Males and females work together to raise their young, typically as a monogamous pair and often with help from older siblings. “The idea is that this strategy of cooperative breeding specifically makes them more friendly,” says Ghazanfar. Turn-taking may be a symptom of this temperament—a by-product of breeding habits that make for a generally more cooperative primate.

Maybe the same thing happened during human evolution? “There could have been a tweak in the way we raise our offspring, which led to more prosocial behaviour,” says Ghazanfar. “And once you have that general prosociality, you may be more inclined to make more contact with other members of the species.”

It’s an interesting scenario, and one that has parallels in domestic dogs. There’s a popular idea that dogs evolved from wolves that were drawn to human settlements, perhaps to scavenge off our garbage. Individuals with more docile temperaments were best-suited to these forays, and gradually became better at reading human cues and gestures. You select for a certain temperament, and many other traits get yanked along for the evolutionary ride.

Of course, this is still conjecture, and the common marmosets are just one (contested) data point. The next step would be to check for turn-taking in other cooperatively breeding primates, such as tamarins, or related species that don’t share the same reproductive habits.