In a move guaranteed to send audiophiles recoiling back into their sonically pristine caves, two doctoral students at ETH Zurich have come up with an interesting way to embed information into music. What sounds crazy about this is that they’re hiding data firmly in the audible spectrum, from 9.8 kHz to 10 kHz. The question is, does it actually sound crazy? Not to our ears: playback remains surprisingly OK.

You can listen to a clip with and without the data on ETH’s site and see for yourself. As a brief example, here’s twelve seconds of the audio presenting two versions of the same clip. The first riff has no data, and the second riff has the encoded data.

You can probably convince yourself that there’s a difference, but it’s negligible. Even if we use a janky bandpass filter over the 8 kHz to 10 kHz range to make the differences stand out, it’s not easy to differentiate what you’re hearing.

After many years of performing live music and dabbling in the recording studio, I’d describe the data-encoded clip as having a tinny feedback or a weird reverb effect. However, you wouldn’t notice this in a track playing on the grocery store’s speaker.

Why Use Audible Frequencies?

Why in the world would you want to use an audible frequency to transmit information? The easy answer is that there are already audible transmitters and receivers everywhere. Specifically, cell phones. According to the researchers, this works better than ultrasonic transmission for two reasons: cell phone microphones have low sensitivity at high frequencies, and ultrasonic signals attenuate faster than audible ones.

By encoding data into the audible range of music, coffee shops could broadcast their WiFi passwords inside their Sia-heavy playlists. (Why is it always Sia?) Cell phones could then detect the password and automatically connect.

Why it Sounds Fine: OFDM and Masking Frequencies

Processing steps used

The original paper goes into more detail, but the system doesn’t wreck the music because it uses the music to mask the data. It detects the strongest frequencies in a track and embeds data around the harmonics of those frequencies. This way, the encoded data simply sounds like it’s part of the music.
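To make the masking idea concrete, here is a minimal sketch of the first step: find the strongest frequency in a short frame of audio, then list its harmonics inside some target band. This is not the researchers' code. The function names, the candidate range of 100 Hz to 2 kHz, the 10 Hz search step, and the 8 kHz to 10 kHz band in the usage lines are all assumptions picked for illustration.

```python
import cmath
import math


def strongest_frequency(samples, sample_rate, f_min=100.0, f_max=2000.0, step=10.0):
    """Return the candidate frequency (Hz) with the largest DFT magnitude.

    Hypothetical helper: scans evenly spaced candidates instead of running a full FFT.
    """
    n = len(samples)
    best_freq, best_mag = f_min, -1.0
    freq = f_min
    while freq <= f_max:
        # Correlate the frame with a complex exponential at this frequency.
        acc = sum(
            s * cmath.exp(-2j * math.pi * freq * i / sample_rate)
            for i, s in enumerate(samples)
        )
        mag = abs(acc) / n
        if mag > best_mag:
            best_freq, best_mag = freq, mag
        freq += step
    return best_freq


def harmonics_in_band(fundamental, band_lo, band_hi):
    """List integer multiples of `fundamental` that fall inside [band_lo, band_hi]."""
    first = math.ceil(band_lo / fundamental)
    last = math.floor(band_hi / fundamental)
    return [k * fundamental for k in range(first, last + 1)]


# Usage: a synthetic 0.1 s frame with a loud 440 Hz tone and a quieter 1 kHz tone.
SAMPLE_RATE = 44100  # assumed
frame = [
    math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
    for i in range(4410)
]
dominant = strongest_frequency(frame, SAMPLE_RATE)
candidates = harmonics_in_band(dominant, 8000.0, 10000.0)
```

A real implementation would work frame by frame on actual music and would need a psychoacoustic model to decide how loud the embedded data may be; this sketch only shows where the candidate frequencies come from.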

Of course, it doesn’t just encode data on one frequency. It uses orthogonal frequency-division multiplexing (OFDM). OFDM essentially spreads the transmission out over multiple carrier frequencies so that no single frequency has to carry much power. It’s used in technologies like 4G and 802.11a WLAN. OFDM allows a system to push more power in a band while minimizing the amplitude of specific tones.
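As a rough illustration of the OFDM principle, the sketch below puts one bit on each of eight subcarriers between 9.8 kHz and 10 kHz and recovers the bits by correlation. The sample rate, the 25 Hz spacing, the carrier count, and the choice of BPSK are assumptions made so the numbers come out cleanly; they are not the parameters from the paper, and the sketch leaves out guard intervals, synchronization, and error correction.

```python
import math

SAMPLE_RATE = 44100   # Hz, assumed
BAND_START = 9800.0   # Hz, lower edge of the band mentioned in the article
SPACING = 25.0        # Hz, assumed subcarrier spacing
NUM_CARRIERS = 8      # assumed; carriers sit at 9800, 9825, ..., 9975 Hz
# A symbol lasts 1 / SPACING seconds, so every carrier completes a whole
# number of cycles per symbol. That is what makes the carriers orthogonal.
SYMBOL_LEN = int(SAMPLE_RATE / SPACING)  # 1764 samples = 40 ms


def carrier_freqs():
    return [BAND_START + k * SPACING for k in range(NUM_CARRIERS)]


def modulate(bits):
    """Build one OFDM symbol: each bit flips the sign (BPSK) of its own carrier."""
    assert len(bits) == NUM_CARRIERS
    freqs = carrier_freqs()
    symbol = []
    for n in range(SYMBOL_LEN):
        t = n / SAMPLE_RATE
        sample = sum(
            (1.0 if bit else -1.0) * math.cos(2 * math.pi * f * t)
            for bit, f in zip(bits, freqs)
        )
        symbol.append(sample / NUM_CARRIERS)  # keep the sum within [-1, 1]
    return symbol


def demodulate(symbol):
    """Recover the bits by correlating the symbol against each carrier."""
    bits = []
    for f in carrier_freqs():
        corr = sum(
            s * math.cos(2 * math.pi * f * n / SAMPLE_RATE)
            for n, s in enumerate(symbol)
        )
        bits.append(1 if corr > 0 else 0)
    return bits


sent = [1, 0, 1, 1, 0, 0, 1, 0]
received = demodulate(modulate(sent))
```

With these made-up parameters, eight bits go out every 40 ms, which works out to 200 bits/s before any overhead. Each carrier only gets one eighth of the peak amplitude, which is the "spread the power around" effect described above.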

Unsurprisingly, the data rates are far from fiber speeds. Using low-frequency carriers has its disadvantages. Researchers were able to reach 300 to 400 bits/s (yes, bits not bytes). The transmission distance and accuracy are respectable, though, at 24 meters with less than a 10% bit error ratio (BER). The BER and data rate vary by song, with Queen and the Gorillaz leading the charge.

In the real world they expect about 200 bits/s, which is enough to send roughly 25 characters per second at 8 bits per character. This is fast enough to transmit text info or simple data streams, but you won’t want to browse dank memes with this data link.
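For a sense of scale, here is the back-of-the-envelope arithmetic as a small sketch. It assumes plain 8-bit ASCII characters and no protocol overhead, and the password string is a made-up example.

```python
def chars_per_second(bit_rate, bits_per_char=8):
    """Characters per second at a given raw bit rate (no overhead assumed)."""
    return bit_rate / bits_per_char


def seconds_to_send(text, bit_rate, bits_per_char=8):
    """Time to send `text` as 8-bit ASCII at a given raw bit rate."""
    return len(text.encode("ascii")) * bits_per_char / bit_rate


rate = chars_per_second(200)
password_time = seconds_to_send("hunter2-coffee", 200)  # hypothetical 14-character password
```

At 200 bits/s that is 25 characters per second, so a 14-character WiFi password would take a bit over half a second, ignoring retransmissions caused by bit errors.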

Thanks [Qes] for the tip!