Last week, a fun article titled fitteR happieR crowned True Love Waits the most depressing Radiohead song. This was determined by a “gloom index” calculated from Spotify’s valence metric, which measures the musical positiveness of a track, and sentiment analysis of the lyrics.

The results were pretty spot on in this Radiohead fan’s opinion, particularly how the tracks of each album stacked up to one another in this visualization. But the approach failed in a few cases, most notably the eponymous Fitter Happier. According to the gloom index, it’s one of the happiest songs, but trust me when I say that quite the opposite is true.

This approach falls short with a song like Fitter Happier for two reasons. First, this particular track overlays a collage of audio samples with a spoken-word synthesizer, breaking away from the conventional song structures that, as far as I can guess, Spotify’s machine learning models are trained on to calculate valence. Second, the sentiment analysis takes the lyrics at face value, assuming “happier” means “happier”, blind to the nuances of irony.

Here I’d like to share an alternate approach that tries to deal with these outliers, one that might fare better at contextualizing experimental music and understanding sarcasm. To do this, I’m bringing the wisdom of the crowd into the loop, ranking the most depressing Radiohead songs by applying sentiment analysis to data from the SongMeanings community.

Scraping SongMeanings

As the name suggests, SongMeanings is one of the oldest, most popular online communities for people to share their impressions or interpretations of songs. Unfortunately, it doesn’t offer a public API as Spotify and Genius do, so I had to scrape the comments for each Radiohead song.

Nightmare, which you can see in action below, turned out to be a fantastic tool for this job, especially for someone comfortable with JavaScript.

And here is the code for scraping the comments.

var Nightmare = require('nightmare');
var vo = require('vo');
var fs = require('fs');

function* run() {
  var nightmare = Nightmare({show: true});
  var url = 'http://songmeanings.com/artist/view/songs/200/'; // Artist page

  // Grab list of song links from artist page
  var songs = yield nightmare
    .goto(url)
    .evaluate(function() {
      var songs = {};
      var songsList = document.querySelectorAll('#songslist tr');
      songsList.forEach(function(tr) {
        var id = tr.id.slice(6);
        var title = tr.querySelector('td:first-child a').title.slice(0, -7);
        var href = tr.querySelector('td:first-child a').href;
        var commentCount = tr.querySelector('.comments a').innerText;
        songs[title] = {'id': id, 'title': title, 'href': href, 'commentCount': commentCount};
      });
      return songs;
    });

  // Visit each song page and save comments
  for (var key in songs) {
    var song = songs[key];
    var page_max = Math.ceil(song.commentCount / 10); // 10 comments per page
    var comments = [];

    yield nightmare
      .goto(song.href)
      .wait('body');

    for (var i = 1; i <= page_max; i++) {
      // Extract the text and rating of every comment on the current page
      yield nightmare
        .evaluate(function() {
          var page_comments = [...document.querySelectorAll('#comments-list .com-normal')].map(function(elem) {
            var text = elem.querySelector('.text').innerHTML
              .split('</strong>')[1]
              .split('\t\t\t\t\n\t\t\t\t<div class=')[0]
              .split('<p>')[0]
              .replace(/<br>\n|"/g, ' ');
            var rating = elem.querySelector('.numb').innerText;
            return {'text': text, 'rating': rating};
          });
          return page_comments;
        })
        .then(function (result) {
          comments.push(...result);
          console.log('Success: ', result);
        })
        .catch(function (error) {
          console.error('Error: ', error);
        });

      // Move on to the next page of comments if there is one
      yield nightmare
        .exists('a#page-' + (i+1))
        .then(function(nextExists) {
          if (nextExists) {
            return nightmare
              .click('#pagination a:last-child')
              .wait(500);
          }
        });
    }
    song.comments = comments;
  }

  yield nightmare.end();
  return songs;
}

vo(run)(function(err, result) {
  if (err) throw err;
  fs.writeFileSync('whereiend.json', JSON.stringify(result));
});

The text and score of each comment were collected, the latter of which is based on the upvotes and downvotes assigned to a comment by other members of the community.

It turned out that the number of comments for a given song was quite uneven, with some songs having very few comments or none at all. A recent decline in activity on SongMeanings probably contributed to this, which is bad news for last year’s superb “A Moon Shaped Pool”. For the analysis, I filtered out songs with fewer than 40 comments.
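The 40-comment cutoff is not part of the listings in this post, so here is a minimal sketch of how it could be applied to the scraper's output. The helper name filterByCommentCount and the toy data are assumptions for illustration; only the commentCount field comes from the scraper.

```javascript
// Hypothetical helper: keep only songs with at least `minComments` comments.
// Assumes the scraper's output shape: { title: { commentCount: '123', ... } }.
function filterByCommentCount(songs, minComments) {
  var kept = {};
  Object.keys(songs).forEach(function (title) {
    // commentCount was scraped as text, so convert it before comparing
    var count = parseInt(songs[title].commentCount, 10);
    if (count >= minComments) {
      kept[title] = songs[title];
    }
  });
  return kept;
}

// Toy data for illustration only, not real SongMeanings counts
var sample = {
  'Song A': { commentCount: '120' },
  'Song B': { commentCount: '12' },
  'Song C': { commentCount: '40' }
};

// Song B falls below the cutoff; Song A and Song C are kept
var filtered = filterByCommentCount(sample, 40);
console.log(Object.keys(filtered));
```

Songs whose comment count cannot be parsed as a number are dropped as well, since comparing NaN against the cutoff is always false.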

Sentiment Analysis

The sentiment analysis followed the previous one, calculating the percentage of words in each Radiohead song’s comments that expressed sadness using the word list provided by NRC EmoLex. The node package text-miner was used to clean up, stem, and remove stop words first.

A second score was calculated by weighting comments based on their scores. Comments with negative scores were not factored in here, while comments with a score of zero were given a weight of 1 (owing to the author’s implicit self-vote), those with a score of one were given a weight of 2, and so on. This broadens the crowd from just commenters to lurkers who have cast votes, and gets at the idea of a community reaching consensus about what each song means.
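To make the weighting concrete, here is a small worked sketch on made-up numbers. The function names and the comment shape ({rating, sadWordCount, wordCount}) are assumptions for illustration, and unlike the analysis script this sketch returns 0 rather than NaN when no comment carries any weight.

```javascript
// Weight for a comment given its community score (assumed to be an integer).
// Negative scores are excluded, which is expressed here as a weight of 0.
function commentWeight(rating) {
  return rating >= 0 ? rating + 1 : 0;
}

// Weighted percentage of sad words across a song's comments.
// Each comment: { rating, sadWordCount, wordCount } (hypothetical shape).
function weightedSadPercentage(comments) {
  var sad = 0;
  var total = 0;
  comments.forEach(function (c) {
    var w = commentWeight(c.rating);
    sad += w * c.sadWordCount;
    total += w * c.wordCount;
  });
  return total === 0 ? 0 : 100 * sad / total;
}

// Toy comments for illustration only
var toyComments = [
  { rating: 0, sadWordCount: 1, wordCount: 20 },   // weight 1
  { rating: 3, sadWordCount: 4, wordCount: 40 },   // weight 4
  { rating: -2, sadWordCount: 10, wordCount: 10 }  // weight 0, ignored
];

// (1*1 + 4*4) / (1*20 + 4*40) = 17 / 180, so this prints 9.44
var weightedScore = weightedSadPercentage(toyComments);
console.log(weightedScore.toFixed(2));
```

Note how the downvoted third comment, which is all sad words, has no effect on the result, while the well-liked second comment counts four times.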

Here is the code that was used for the analysis.

var tm = require('text-miner');
var fs = require('fs');

var songs = JSON.parse(fs.readFileSync('whereiend.json'));
var sadWords = loadSadWords('NRC-Emotion-Lexicon-Wordlevel-v0.92.txt');

// Load the terms flagged for sadness from the tab-separated lexicon file
function loadSadWords(file) {
  var sadWords = [];
  var lines = fs.readFileSync(file, 'ascii').split('\r\n');
  lines.forEach(function(line) {
    var [term, category, flag] = line.split('\t');
    if ((category == 'sadness') && (flag == 1)) {
      sadWords.push(term);
    }
  });
  return sadWords;
}

// Calculate sad score for a song's comments
function analyzeComments(song) {
  var sadWordTotal = 0;
  var wordTotal = 0;
  var sadWordWeighted = 0;
  var wordWeighted = 0;
  song.comments.forEach(function(comment) {
    // Ratings are scraped as text such as '+3', '0' or '-2'
    var rating = comment.rating[0] == '+' ? parseInt(comment.rating.slice(1), 10) : parseInt(comment.rating, 10);
    var score = scoreComment(comment);
    sadWordTotal = sadWordTotal + score.sadWordCount;
    wordTotal = wordTotal + score.wordCount;
    // Negatively rated comments are left out of the weighted score
    if (rating >= 0) {
      sadWordWeighted = sadWordWeighted + ((rating+1) * score.sadWordCount);
      wordWeighted = wordWeighted + ((rating+1) * score.wordCount);
    }
  });
  var percentage = 100*sadWordTotal/wordTotal;
  var weighted = 100*sadWordWeighted/wordWeighted;
  return {'percentage': percentage, 'weighted': weighted, 'wordTotal': wordTotal};
}

// Count number of words and sad words in comment
function scoreComment(comment) {
  var text = comment.text.replace(/\(|\)|:|\'/g, '');
  var corpus = new tm.Corpus([]);
  corpus.addDoc(text);
  corpus
    .clean()
    .removeNewlines()
    .removeInterpunctuation()
    .removeInvalidCharacters()
    .removeWords(tm.STOPWORDS.EN)
    .toLower()
    .stem('Lancaster');
  var terms = new tm.Terms(corpus);
  var vocabulary = terms.vocabulary;
  var counts = terms.dtm[0];
  var wordCount = counts.reduce((a, b) => a + b, 0);
  var sadWordCount = 0;
  // Each vocabulary term found in the sad word list adds one to the count
  vocabulary.forEach(function(word) {
    if (sadWords.indexOf(word) > -1) {
      sadWordCount = sadWordCount + 1;
    }
  });
  return {'sadWordCount': sadWordCount, 'wordCount': wordCount};
}

// Write one tab-separated row per song
function run() {
  var output = 'song\tpercentage\tweighted\twordTotal\tcommentCount\n';
  for (var key in songs) {
    var score = analyzeComments(songs[key]);
    var commentCount = songs[key].commentCount;
    output = output + key + '\t' + score.percentage + '\t' + score.weighted + '\t' + score.wordTotal + '\t' + commentCount + '\n';
  }
  fs.writeFileSync('output.txt', output);
}

run();

The Saddest Radiohead Songs According to SongMeanings

So how did the songs stack up? Using this approach, the most depressing Radiohead song of all time is Black Star, an early song about interpersonal troubles rather than the more grandiose themes of later songs. The winner in the previous analysis, True Love Waits, also ranks highly here at #4.

Also deserving of spots on the list are No Surprises, Motion Picture Soundtrack, and Big Ideas (Don’t Get Any), though There There and Weird Fishes/Arpeggi are kind of a surprise.

Rank  Song                         Percentage
1     Black Star                   7.63
2     No Surprises                 6.30
3     High and Dry                 6.04
4     True Love Waits              6.02
5     Fog                          6.02
6     Motion Picture Soundtrack    5.97
7     Street Spirit                5.68
8     There There                  5.57
9     Weird Fishes/Arpeggi         5.47
10    Big Ideas (Don’t Get Any)    5.43

At the bottom of the list are placid songs like Sail to the Moon and The Tourist, plus instrumental interludes like Pulk/Pull Revolving Doors and Treefingers. Not that people didn’t have plenty of comments to make about them too.

Rank  Song                         Percentage
75    Life in a Glasshouse         3.00
76    Anyone Can Play Guitar       2.95
77    Sail to the Moon             2.90
78    Reckoner                     2.87
79    The Tourist                  2.86
80    Pulk/Pull Revolving Doors    2.73
81    Treefingers                  2.72
82    Go to Sleep                  2.71
83    (Nice Dream)                 2.65
84    Electioneering               2.42

What happens to the rankings when you factor in comment scores? The list generally looks the same, with Black Star still topping the list. But there are a few movers and shakers: Bullet Proof…I Wish I Was and Where I End and You Begin shoot up 10 and 14 spots respectively to land at #2 and #6.

Rank  Song                         Weighted
1     Black Star                   6.72
2     Bullet Proof…I Wish I Was    6.41
3     There There                  6.25
4     High and Dry                 6.24
5     Backdrifts                   6.08
6     Where I End and You Begin    6.01
7     No Surprises                 6.01
8     True Love Waits              5.78
9     Motion Picture Soundtrack    5.78
10    Fog                          5.77

The bottom also looks similar, though some more defiant or triumphant songs like Karma Police and Airbag appear, dropping down 33 and 18 spots respectively.

Rank  Song                         Weighted
75    Thinking About You           2.94
76    Go to Sleep                  2.91
77    Airbag                       2.91
78    Idioteque                    2.89
79    Karma Police                 2.86
80    Treefingers                  2.73
81    Reckoner                     2.67
82    (Nice Dream)                 2.22
83    Pulk/Pull Revolving Doors    2.19
84    Electioneering               2.06
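The rank shifts quoted above come from comparing each song's position in the two lists. As a rough sketch of how that comparison could be computed from rows like those written to output.txt, the code below ranks a toy data set by both scores; the function names and the numbers are assumptions for illustration, not the figures from the tables.

```javascript
// Rank rows by a numeric field, highest first. Returns { song: rank }.
function rankBy(rows, field) {
  var sorted = rows.slice().sort(function (a, b) {
    return b[field] - a[field];
  });
  var ranks = {};
  sorted.forEach(function (row, i) {
    ranks[row.song] = i + 1;
  });
  return ranks;
}

// Positive shift: the song moved up the list under the weighted score.
function rankShifts(rows) {
  var plain = rankBy(rows, 'percentage');
  var weighted = rankBy(rows, 'weighted');
  var shifts = {};
  rows.forEach(function (row) {
    shifts[row.song] = plain[row.song] - weighted[row.song];
  });
  return shifts;
}

// Toy numbers for illustration only
var rows = [
  { song: 'A', percentage: 6.0, weighted: 5.0 },
  { song: 'B', percentage: 5.0, weighted: 6.5 },
  { song: 'C', percentage: 4.0, weighted: 4.5 }
];

// A drops one spot, B climbs one spot, C stays put
var shifts = rankShifts(rows);
console.log(shifts);
```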

When looking at the most frequent sad words found in the comments, many are taken directly from the lyrics. For instance, “black” and “falling” from Black Star, “death” and “unhappy” from No Surprises, “haunted” and “killing” from True Love Waits. So is this nothing more than a proxy for a lyrical analysis? Not entirely.

If you recall, by gloom index Fitter Happier was one of the least depressing Radiohead songs. But here, by percentage Fitter Happier lands in the middle of the pack at #55 of 84, and by the weighted score turns out to be one of the most depressing Radiohead songs of all time at #15.