Spotify Technology: How Spotify Works

CAUTION – THIS PAGE IS PROBABLY NOW OUT-OF-DATE

Spotify uses some particularly clever streaming technology to deliver all that instant music. It’s been described in an academic paper by Spotify techno-wizards Gunnar Kreitz and Fredrik Niemelä, who included some very interesting statistics and analysis of measurements taken during one week in early 2010. It’s a pretty dense technical read, but there are some fascinating stats to be gleaned amongst the computer science. Read on for a condensed summary!


General Stats

Spotify is the only on-demand music streaming service that’s not web-based. Instead, it uses a peer-to-peer network (p2p) that can scale up to meet the demands of millions of users.

Only 8.8% of music playback comes from Spotify’s servers. The rest comes from the peer-to-peer network (35.8%) or your local cache (55.4%). The exception here is Spotify on smartphones, which gets all the music directly from the Spotify servers.
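To see how much work the p2p network takes off the servers, it helps to set the cache aside and look only at music that arrives over the network. The short sketch below does that arithmetic with the three percentages quoted above; the function name is made up for illustration, and it assumes the percentages can be treated as shares of the same total.

```python
# Shares of playback by source, as quoted in the text above (percent).
SOURCE_SHARE = {"server": 8.8, "p2p": 35.8, "cache": 55.4}


def network_split(shares):
    """Return each network source's fraction of non-cache playback."""
    network_total = shares["server"] + shares["p2p"]
    return {name: shares[name] / network_total for name in ("server", "p2p")}


split = network_split(SOURCE_SHARE)
print(f"server: {split['server']:.1%}, p2p: {split['p2p']:.1%}")
# prints "server: 19.7%, p2p: 80.3%"
```

In other words, on these figures roughly four fifths of everything fetched over the network comes from other users rather than from Spotify.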


Playback Stats

Most playbacks (61%) are listened to in a predictable order (the user just listens to an album from the first track onwards). This helps Spotify pre-fetch the upcoming music so that the next track can start playing instantly.

About 39% of playback in Spotify is “random access” i.e. the user jumps about from track to track.

The median playback latency is a mere 265 ms (including cached tracks). Without pre-fetching, it has been measured at 390 ms.

Fewer than 1% of all playbacks stutter. When one does, it’s most likely due to local CPU issues.


The Short Tail?

During a week-long analysis of all music played via Spotify:

88% of track accesses were for the most popular 12% of all tracks on Spotify.

79% of server requests were for the most popular 21% of all tracks on Spotify.

60% of all music content available was accessed at least once.


The Peer to Peer Network

Spotify’s p2p network works like a BitTorrent network to locate peers (other users who have the song you want to listen to). It uses a proprietary protocol designed especially for streaming music.

There are no “preferred” peers or supernodes, but a future improvement might be to use peer-to-peer overlays to exploit the overlap in interests between users.

Each client stays connected to at most 60 peers at a time (a hard limit), with a soft limit of 50 peers.

The client uploads to at most 4 peers at a time.
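The connection and upload limits above could be enforced with some simple bookkeeping, roughly as sketched here. This is not Spotify's code: the class and method names are invented, the limits are taken to apply per client, and the rule that outgoing connections stop at the soft limit while incoming ones are accepted up to the hard limit is an assumption made for the example.

```python
SOFT_LIMIT = 50   # soft limit on peers, from the text
HARD_LIMIT = 60   # maximum number of peers, from the text
MAX_UPLOADS = 4   # simultaneous uploads, from the text


class PeerTable:
    """Hypothetical per-client record of peer connections and uploads."""

    def __init__(self):
        self.peers = set()
        self.uploading_to = set()

    def add_peer(self, peer_id, incoming):
        # Assumption: we stop dialling out at the soft limit but still
        # accept incoming connections until the hard limit is reached.
        limit = HARD_LIMIT if incoming else SOFT_LIMIT
        if peer_id in self.peers or len(self.peers) >= limit:
            return False
        self.peers.add(peer_id)
        return True

    def start_upload(self, peer_id):
        if peer_id not in self.peers:
            return False
        if peer_id in self.uploading_to:
            return True  # already serving this peer
        if len(self.uploading_to) >= MAX_UPLOADS:
            return False  # all upload slots are busy
        self.uploading_to.add(peer_id)
        return True

    def finish_upload(self, peer_id):
        self.uploading_to.discard(peer_id)
```

A fifth upload request is simply refused until one of the four active uploads finishes.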

Server-side trackers and network queries are used to locate other users who have the music you’re listening to.

Spotify uses TCP as the transport protocol instead of UDP, since it can take advantage of TCP’s congestion controls and ability to re-send lost packets.


Storage

Most users have a large cache – 56% have a maximum size of 5GB or more. This helps keep network traffic down since most users listen to tracks more than once.

At Spotify’s end, there’s a master storage area (290TB) and two production storage areas (90TB in London, 90TB in Stockholm).


Audio Files

Spotify audio is encoded using Ogg Vorbis at q5 quality (roughly 160 kbps). Premium users can also receive audio at q9 quality (roughly 320 kbps). The audio files are not pure Ogg Vorbis, however: Spotify adds a custom header to each file to help with seeking to a particular part of the track.
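Those bitrates translate into file sizes with a little arithmetic. The sketch below is only an estimate: Vorbis quality levels are variable-bitrate, so the "roughly" in the figures above matters, and the helper name is made up. It takes 1 kbps as 1000 bits per second.

```python
def approx_bytes(kbps, seconds):
    """Approximate size in bytes of `seconds` of audio at `kbps`."""
    return kbps * 1000 * seconds // 8


print(approx_bytes(160, 15))   # 300000  (15 seconds at q5)
print(approx_bytes(320, 15))   # 600000  (15 seconds at q9)
print(approx_bytes(160, 240))  # 4800000 (a four-minute track at q5)
```

So 15 seconds of q5 audio is only around 300 KB, and a four-minute track is a little under 5 MB.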

File encryption means that a network connection is required to play a track, even when it’s stored in the local cache (unless it’s been stored for offline playback).


How a Track is Streamed





User clicks a track to listen to. If it’s in the cache, Spotify just starts playing it from there.

Otherwise, the Spotify client requests the first 15 seconds of the track from the Spotify servers so that playback can start as soon as possible. At the same time, the client starts looking for the track on the peer-to-peer network.

The rest of the track is streamed, from a combination of multiple sources if available (cache, multiple peers, Spotify servers). The more popular a track is, the more likely it will be streamed using the p2p network instead of the Spotify servers.

When the track has 30 seconds to go, the Spotify client begins searching the p2p network for the next track. When the track has 10 seconds to go, if it hasn’t found the next track on the network yet, the client starts pre-fetching it from the Spotify servers.
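The decisions described above can be written down as two small functions. This is an illustrative sketch rather than the client's real logic: the function names and the returned action labels are invented, and the "fetch_from_peers" branch (what happens once the next track has been found on the p2p network) is an assumption, since the description only covers the case where it has not been found.

```python
FIRST_CHUNK_SECONDS = 15    # requested from the servers for a fast start
P2P_SEARCH_AT_SECONDS = 30  # start looking for the next track
SERVER_PREFETCH_AT = 10     # fall back to the servers for the next track


def start_actions(in_cache):
    """Actions taken when the user clicks a track."""
    if in_cache:
        return ["play_from_cache"]
    # Both happen at the same time: a quick start from the servers and a
    # search for peers that can supply the rest of the track.
    return ["request_first_chunk_from_server", "search_p2p"]


def prefetch_action(seconds_remaining, found_on_p2p):
    """Action for the NEXT track, given how much of the current one is left."""
    if seconds_remaining > P2P_SEARCH_AT_SECONDS:
        return "wait"
    if found_on_p2p:
        return "fetch_from_peers"  # assumption, not stated in the text
    if seconds_remaining <= SERVER_PREFETCH_AT:
        return "prefetch_from_server"
    return "search_p2p"
```

For example, `prefetch_action(20, False)` gives `"search_p2p"`, while `prefetch_action(8, False)` gives `"prefetch_from_server"`.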


Failsafes

Spotify has a built-in “emergency mode” that stops a client from uploading to the p2p network when its playback buffers get too low.

If a buffer underrun occurs in a track, the client pauses playback to adjust for latency.
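Both failsafes boil down to checks on how much audio is buffered. The sketch below shows the idea only: the threshold value is hypothetical (no figure is given here), the function name is invented, and the real client presumably tracks more state than a single number.

```python
LOW_BUFFER_SECONDS = 3.0  # hypothetical threshold; the text gives no figure


def client_state(buffered_seconds, low_buffer=LOW_BUFFER_SECONDS):
    """Return which failsafes apply for a given amount of buffered audio."""
    return {
        # Emergency mode: stop uploading to peers while the buffer is low.
        "uploads_allowed": buffered_seconds > low_buffer,
        # Buffer underrun: nothing left to play, so playback is paused.
        "playback_paused": buffered_seconds <= 0,
    }
```

With these assumed values, `client_state(2.0)` blocks uploads but keeps playing, and `client_state(0.0)` blocks uploads and pauses playback.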

All data sourced from the paper Spotify – Large Scale, Low Latency, P2P Music-on-Demand Streaming, by Gunnar Kreitz and Fredrik Niemelä. Available as a PDF here.