Research methods

Dataset

The dataset used in this research consists of twelve song characteristics for 60 countries that have a Top 50 list made by the official Spotify account. The dataset was generated from the Top 50 playlists as they stood on 12 July 2019 and contains the following characteristics:

Danceability

Energy

Key

Loudness

Mode

Speechiness

Acousticness

Instrumentalness

Liveness

Valence

Tempo

Duration

Analysis

The following steps were performed in the analysis to reach the final conclusion.

Omit any songs where at least one of the characteristics is empty

Perform correlation analysis on the characteristics

The correlations (+0.27 danceability/valence, -0.07 danceability/tempo, -0.07 valence/tempo) were low enough for danceability, valence and tempo to be deemed sufficiently independent, so the analysis continued with only these three characteristics; the other characteristics were excluded

Normalize the characteristics between 0 (lowest) and 1 (highest) to prepare for K-means clustering

Use K-means clustering to find six clusters

Study the average characteristics of these "new continents" to set up the storyline
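
The steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code used for the story: the source does not say which tool was used, how the initial cluster centers were chosen, or whether the clustered rows were individual songs or per-country averages. The function names, the dict-based row layout, the field names in KEPT, the fixed random seed and the use of plain Lloyd's algorithm are all hypothetical choices made for this example.

```python
import math
import random

# Assumed field names for the three characteristics kept after the correlation check.
KEPT = ("danceability", "valence", "tempo")

def drop_incomplete(rows, keys):
    """Keep only rows in which every listed characteristic is present."""
    return [r for r in rows if all(r.get(k) is not None for k in keys)]

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def min_max(values):
    """Rescale values so the lowest maps to 0 and the highest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm with random initial centroids (an assumption)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

def cluster_rows(rows, all_keys, k=6):
    """Drop incomplete rows, normalize the kept characteristics, then cluster."""
    complete = drop_incomplete(rows, all_keys)
    columns = [min_max([r[f] for r in complete]) for f in KEPT]
    points = list(zip(*columns))
    centroids, labels = kmeans(points, k)
    return complete, centroids, labels
```

In this sketch, pearson can be applied to each pair of characteristic columns to reproduce the correlation check, and the centroids returned by cluster_rows are the per-cluster averages on the normalized 0 to 1 scale, which is what the last step would inspect.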

Visualizations

To create the visualizations, the following packages were used:

Twitter Bootstrap (MIT license)

Open Sans (Apache License, v2.0)

Native-promise-only.js (Kyle Simpson, MIT License)

Lodash (https://raw.githubusercontent.com/lodash/lodash/4.17.15-npm/LICENSE)

D3.js v5.9.7 (Copyright 2019 Mike Bostock)

TopoJson v3.0 (Copyright 2017 Mike Bostock)

GeoProjection v2.4.0 (Copyright 2018 Mike Bostock)

jQuery v3.3.1 (Copyright JS Foundation and other contributors)

Waypoints.js v4.0.0 (MIT license)

Slider from Ana Tudor (CodePen)

Word of thanks

The team would like to thank Spotify for their open APIs and their embedding options. Furthermore, thanks to all developers of the web development packages on which this story was built.

— Lianne Duinkerken, Frans Geurts, Lisa Kroes, Pim Peeters, Maarten Snijders, Niels Tammes, Joost de Theije and Bas Wagenmaker