Metal heads, jazz purists and folkies may have more in common musically than you imagined. A new study sheds light on the shared ways in which humans perceive music.

What do we really hear when we listen to music? Researchers from Sweden’s KTH Royal Institute of Technology have attempted to close in on the answer by boiling our perception of music down to nine basic elements – or what they call “perceptual features”.

Their findings could help improve computational models that the music industry uses for predicting the individual tastes of listeners.

So-called music information retrieval (MIR) models combine audio signal processing measurements with analysis of musical elements, which are usually drawn from concepts of music theory and music perception, such as beat strength, rhythmic regularity, meter and mode. The models also include analysis of musical genre (for example, punk, dance, experimental), emotion (sad, happy, tender) and other contextual qualities.

But a big limitation arises from how inconsistently music is perceived by listeners with different backgrounds and varying familiarity with music, not to mention their individual biases and cultural references.

The reliance on music theory could be one of the weaknesses in the ability of MIR programs to capture any commonalities in perception. Not everyone understands music theory, yet people know what they like. So what is the basis on which these preferences are formed?

The researchers at KTH – Anders Friberg, Anton Hedblad, Marco Fabiani and Anders Elowsson – argued that when people listen to music, their brains may rely on an “intermediate analysis layer” where more basic features of the music are naturally perceived.

Getting a group of people to agree on anything having to do with music is – as any DJ can attest – not easily done. But by focusing on nine key features of music, the team was able to find some commonalities in perception that could prove useful.

They conducted an experiment in which 20 people listened to 100 ringtones and 110 snippets of film music and then rated what they heard in terms of each of the nine so-called “perceptual features”:

Speed – slow or fast

Rhythmic clarity – a pulse that’s firm or flowing

Rhythmic complexity – simple patterns or more complex ones

Articulation – the duration of tones, staccato or legato

Dynamics – relates to the estimated effort of the players

Modality – major or minor key

Overall pitch – overall pitch height of the music, high or low

Harmonic complexity – the progression of harmonics

Brightness – dark or light

Friberg says the nine perceptual features reflect how non-musicians try to understand what they are hearing. For example, instead of rating tempo, which refers to the number of notes in a given time measure, the researchers chose to use the less complicated concept of speed – which non-musicians associate with movement.

The idea was to see whether the ratings they got from the subjects matched up in any significant way.

The subjects used a Likert scale to evaluate each feature, scoring the music, for example, somewhere between slow and fast, or soft and loud.

“If there was a high agreement among any of them, this would then indicate that a given feature corresponds to something relating more closely to the real music perception going on in our brains,” he says.
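
To give a sense of what "high agreement" can mean in numbers, here is a minimal sketch of one common consistency measure, Cronbach's alpha, applied to a single feature. The article does not say which statistic the KTH team used, so the choice of measure is an assumption, as are the function name `cronbach_alpha`, the 1-to-9 scale and the made-up "speed" ratings.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Consistency of listeners' ratings for one perceptual feature.

    ratings: list of rows, one row per music excerpt,
             one column per listener (hypothetical layout).
    """
    n_raters = len(ratings[0])
    # Variance of each listener's ratings across all excerpts
    rater_vars = [variance([row[j] for row in ratings]) for j in range(n_raters)]
    # Variance of the summed rating each excerpt received
    total_var = variance([sum(row) for row in ratings])
    return (n_raters / (n_raters - 1)) * (1 - sum(rater_vars) / total_var)


# Invented "speed" ratings (1 = slow, 9 = fast): 5 excerpts, 4 listeners
speed = [
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 5, 4, 5],
    [7, 6, 7, 7],
    [9, 8, 9, 9],
]

alpha = cronbach_alpha(speed)
print(f"alpha for speed: {alpha:.2f}")
```

A value close to 1 means the listeners ranked the excerpts in much the same way. A value near 0 would suggest that the feature means different things to different people.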

Friberg says that for the most part, the nine features did generate common agreement among the participants.

“The nine perceptual features work, and the test subjects’ own references and cultural differences do not matter.

“We could take 20 new volunteers, and the result would be the same,” he says.

While Friberg says the study is by no means the final word on music information retrieval, it does offer a path toward a better understanding of human perception, which could result in better and simpler computational models.

“A common description in terms of different features would then be able to describe some aspects of the music that most listeners have in common,” he says. “Thus, we avoid the huge individual differences in preferred music, what is considered good or bad.”