Original audio clip comes from vocabulary.com and features a voice repeating one word – but which one do you hear?

This article is more than 2 years old

A short audio clip of a computer-generated voice has become the most divisive subject on the internet since the gold/blue dress controversy of 2015.

The audio “illusion”, which first appeared on Reddit, seems to be saying one word – but whether that word is “Yanny” or “Laurel” is the source of furious disagreement.

Cloe Feldman (@CloeCouture) What do you hear?! Yanny or Laurel pic.twitter.com/jvHhCbMc8I

Professor David Alais from the University of Sydney’s school of psychology says the Yanny/Laurel sound is an example of a “perceptually ambiguous stimulus” such as the Necker cube or the face/vase illusion.

“They can be seen in two ways, and often the mind flips back and forth between the two interpretations. This happens because the brain can’t decide on a definitive interpretation,” Alais says.

“If there is little ambiguity, the brain locks on to a single perceptual interpretation. Here, the Yanny/Laurel sound is meant to be ambiguous because each sound has a similar timing and energy content – so in principle it’s confusable.

“All of this goes to highlight just how much the brain is an active interpreter of sensory input, and thus that the external world is less objective than we like to believe.”

Alais says that for him, and presumably many others, it’s “100% Yanny” without any ambiguity.

That lack of ambiguity, he says, is probably down to two reasons. The first is his age: at 52, his ears lack high-frequency sensitivity, a natural result of ageing. The second is a difference in pronunciation between the North American-accented, computer-generated “Yanny” and “Laurel” and how the words would naturally be spoken in Australian or British English.

This argument is further supported by the assistant professor of audition and cognitive neuroscience Lars Riecke at Maastricht University. Speaking to the Verge, Riecke suggests the “secret is frequency … but some of it is also the mechanics of your ears, and what you’re expecting to hear”.

“Most sounds – including L and Y, which are among the ones at issue here – are made up of several frequencies at once ... frequencies of the Y might have been made artificially higher, and the frequencies that make the L sound might have been dropped.”

Prof Hugh McDermott from Melbourne’s Bionics Institute suggests that while the frequency response of the device you are listening on does have an impact, there are “a lot of different factors playing into it”.

“When the brain is uncertain of something, it uses surrounding cues to help you make the right decision,” he said.



“If you heard a conversation happening around you regarding ‘Laurel’ you wouldn’t have heard ‘Yanny’.



“Personal history can also give an unconscious preference for one or another. You could know many people named ‘Laurel’ and none called ‘Yanny’.”



McDermott also thinks visual cues may have played a part. “You would have noticed it had both the names appearing on the screen with no other context or information. This forces the brain to make a choice between those two alternatives.

“It is a compelling illusion and you can hear both those sounds either way.”



In National Geographic, Brad Story, from the University of Arizona’s speech acoustics and physiology lab, claimed the original recording was “Laurel”, but said that because the audio clip isn’t clear, it leaves room for confusion and varying interpretations.



Story has experimented by recording his own voice pronouncing both words and found similar sound patterns for “Yanny” and “Laurel”.



Online commentators have added their own theories as to why people are hearing different words in the clip – and pointed out that what you hear varies with the frequency, the amplitude and the type of speakers used to play it back.



Steve Pomeroy (@xxv) Ok, so if you pitch-shift it you can hear different things:



down 30%: https://t.co/F5WCUZQJlq

down 20%: https://t.co/CLhY5tvnC1

up 20%: https://t.co/zAc7HomuCS

up 30% https://t.co/JdNUILOvFW

up 40% https://t.co/8VTkjXo3L1 https://t.co/suSw6AmLtn
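One simple way to get an effect like this is to resample a clip and play it back at the original rate, which scales every frequency in it by the same factor. The Python sketch below is an illustration only: it is not necessarily the method Pomeroy used, it runs on a synthetic 1,000Hz test tone rather than the real recording, and the function names, sample rate and tone frequency are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 16_000  # samples per second; an assumed value for this sketch


def make_tone(freq_hz, seconds=1.0):
    """Return a pure sine tone as a list of samples (a stand-in for the clip)."""
    count = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(count)]


def pitch_shift(samples, factor):
    """Resample by linear interpolation.

    Stepping through the samples `factor` times faster scales every frequency
    by `factor` on playback: 0.7 is "down 30%", 1.2 is "up 20%". The duration
    changes too, which dedicated pitch-shifting tools avoid.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    out_count = int((len(samples) - 1) / factor)
    shifted = []
    for i in range(out_count):
        position = i * factor
        index = int(position)
        fraction = position - index
        # Blend the two neighbouring samples around the fractional position.
        shifted.append(samples[index] * (1 - fraction) + samples[index + 1] * fraction)
    return shifted


def estimate_pitch(samples):
    """Estimate the frequency of a single tone by counting zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings * SAMPLE_RATE / (2 * len(samples))


if __name__ == "__main__":
    tone = make_tone(1000)
    for factor in (0.7, 0.8, 1.2, 1.3, 1.4):
        shifted = pitch_shift(tone, factor)
        print(f"x{factor}: about {estimate_pitch(shifted):.0f} Hz")
```

Because the whole spectrum moves together, shifting down pushes energy that sat in the higher range into the range where, according to the commenters, “Laurel” is heard, and shifting up does the reverse.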

According to the Twitter user Earth Vessel Quotes, the amount of bass produced by the playback device can have a significant impact.

Earth Vessel Quotes (@earthvessquotes) you can hear both when you adjust the bass levels: pic.twitter.com/22boppUJS1

Lower frequencies increase your chances of hearing the word “Laurel” while higher ones are more likely to sound like “Yanny”.

One user wrote on Reddit: “If you turn the volume very low, there will be practically no bass and you will hear Yanny. Turn the volume up and play it on some speakers that have actual bass response (AKA not your phone) and you will hear Laurel.”

A video posted by another Twitter user, Alex Saad, backs this theory by showing the sound mix morphing from “Yanny” into “Laurel” while toggling through different frequencies.

Alex Saad (@XeSaad) Despite objective proof I still think it’s #Laurel pic.twitter.com/RcJpZZncRC
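The frequency effect these users describe can be illustrated with a pair of very simple filters. The Python sketch below is a toy built on stated assumptions, not an analysis of the actual clip: it mixes a low tone and a high tone (300Hz and 3,000Hz, values chosen purely for illustration), then applies a basic one-pole low-pass filter, loosely mimicking bass-heavy playback, and the complementary high-pass filter, loosely mimicking a small speaker with little bass. All names and numbers in it are hypothetical.

```python
import math

SAMPLE_RATE = 16_000  # samples per second; an assumed value for this sketch


def tone_mix(low_hz, high_hz, seconds=0.25):
    """Return two equal-strength sine tones added together."""
    count = int(SAMPLE_RATE * seconds)
    return [
        math.sin(2 * math.pi * low_hz * i / SAMPLE_RATE)
        + math.sin(2 * math.pi * high_hz * i / SAMPLE_RATE)
        for i in range(count)
    ]


def low_pass(samples, cutoff_hz):
    """One-pole low-pass filter: keeps bass, attenuates treble."""
    rc = 1 / (2 * math.pi * cutoff_hz)
    dt = 1 / SAMPLE_RATE
    alpha = dt / (rc + dt)
    filtered, state = [], 0.0
    for sample in samples:
        state += alpha * (sample - state)
        filtered.append(state)
    return filtered


def high_pass(samples, cutoff_hz):
    """Complementary high-pass: whatever the low-pass filter removed."""
    return [x - low for x, low in zip(samples, low_pass(samples, cutoff_hz))]


def tone_level(samples, freq_hz):
    """Approximate amplitude of one frequency, by correlating with sine and cosine."""
    cos_sum = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
                  for i, s in enumerate(samples))
    sin_sum = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
                  for i, s in enumerate(samples))
    return 2 * math.hypot(cos_sum, sin_sum) / len(samples)


if __name__ == "__main__":
    mix = tone_mix(300, 3000)
    for label, signal in (("bass-heavy", low_pass(mix, 1000)),
                          ("bass-light", high_pass(mix, 1000))):
        print(label,
              f"300Hz level {tone_level(signal, 300):.2f},",
              f"3,000Hz level {tone_level(signal, 3000):.2f}")
```

In this toy signal the low-passed version is dominated by the 300Hz tone and the high-passed version by the 3,000Hz tone, the same direction of effect the commenters report when they adjust the bass.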

Others have speculated that the difference may be down to the age of the listener, or individual physiology. As you get older, your hearing range begins to deteriorate, making certain high frequencies hard or impossible to hear. This process can begin from the age of 25.

