Heaven Knows I’m Miserable Now: Limitations of Sentiment Analysis with a very Sentimental Band

Using Spotify API and TidyText in R to analyze the Smiths discography

I recently came across this AI lyric generator. As a musician and songwriter myself, I had an existential crisis. Would my journal pages of song lyrics and musings be irrelevant in this brave new world of automated text? Then, I got to thinking about one of the bands that crafted music better than anyone, The Smiths, and what they would have to say when their lyrics are put up to the data challenge.

Introduction

The Smiths occupy a place in music history as one of the pre-eminent bands to emerge from the new wave 80s independent scene in England. During their five-year run together, the boys from Manchester signed with Rough Trade Records and released four studio albums. They found both mainstream success and a cult-like following that continues today.

However, it was not the live performances or touring that set the Smiths apart. Instead, it was their songwriting. With Morrissey on lyrics and Marr on lead guitar, the band created music that challenged the synth pop scene, and wrote lyrics influenced by great thinkers like Oscar Wilde. In doing so, they challenged the status quo of “happy go lucky” pop music, using sarcasm and humor to express complex emotions.

In Morrissey’s words, “It was probably nothing but it felt like the world”. Were our favorite Smiths songs nothing more than a few sad chords and lyrics? I set out to find an answer with the very tools that may one day produce the first AI band to be signed by Rough Trade.

Pulling the Data

The Smiths’ eponymous debut album was released in 1984, followed by three more studio albums: Meat Is Murder, The Queen is Dead, and Strangeways, Here We Come. Their brief time together would end in a tumultuous breakup. I pulled data from these four studio albums, although Rank (their live album) is also available via the SpotifyR package. The compilation albums and singles are regrettably left out of this analysis. All of the code can be found here.

library(spotifyr)  # provides get_album_data() and get_spotify_access_token()

Meat <- get_album_data("The Smiths", "Meat Is Murder",
                       authorization = get_spotify_access_token())
Queen <- get_album_data("The Smiths", "The Queen is Dead",
                        authorization = get_spotify_access_token())
Strangeways <- get_album_data("The Smiths", "Strangeways, Here We Come",
                              authorization = get_spotify_access_token())
TheSmiths <- get_album_data("The Smiths", "The Smiths",
                            authorization = get_spotify_access_token())

Keys of Songs

The next part may inspire some to read more about music theory, but for now, we just need the basics. A key signature is the tonal center, or focal point, of a song. A song in the key of C, for instance, has the pitch C as its tonal center. Imagine by John Lennon is written in the key of C. Brown Eyed Girl by Van Morrison is written in the key of G.

Some suggest that different keys elicit different emotions. The key of A is often thought to be one of the happiest keys (Get Lucky by Daft Punk is in A major), and songs in E minor (like Seven Nation Army by the White Stripes) are considered to be more serious. Major and minor, or the mode of a song, also matter. Changing a chord from a D major to a D minor affects the tone or emotion of that chord, and changes the overall song as well.
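As an aside, the raw Spotify audio features encode key as a pitch-class integer (0 = C, 1 = C#, …, 11 = B) and mode as 1 for major or 0 for minor; the key_name and mode_name columns used below are readable versions of those codes. A minimal base-R sketch of that mapping (the helper name is mine, not spotifyr's):

```r
# Spotify encodes key as a pitch-class integer (0 = C ... 11 = B)
# and mode as 1 (major) / 0 (minor)
pitch_classes <- c("C", "C#", "D", "D#", "E", "F",
                   "F#", "G", "G#", "A", "A#", "B")

key_mode_label <- function(key, mode) {
  paste(pitch_classes[key + 1], ifelse(mode == 1, "major", "minor"))
}

key_mode_label(9, 1)  # "A major"
key_mode_label(4, 0)  # "E minor"
```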

Of course, there are exceptions to every rule. And what makes art truly great is when artists know the rules, and then break them.

Using the tidyverse tools in R, I organized and visualized data on the key (A or C#, for instance) and the mode (major or minor) for each album and song. Meat is Murder has twice as many songs in the minor mode as in the major mode. Strangeways, The Queen is Dead, and The Smiths, on the other hand, have almost twice as many songs in major as they do in minor.

pie_data <- data %>%
  select(key_name, mode_name, key_mode, album_name) %>%
  group_by(album_name, mode_name) %>%
  dplyr::summarise(n = n())

bp <- ggplot(merged_pie, aes(x = "", y = av, fill = mode_name)) +
  geom_bar(stat = "identity") +
  facet_wrap(~album_name)

pie <- bp +
  coord_polar("y", start = 0) +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        axis.title = element_blank(),
        axis.text = element_blank()) +
  geom_text(aes(label = proportion), size = 4.5, hjust = .4, vjust = -3, color = "white") +
  scale_fill_discrete(name = "Mode")

Next, I created a chart that breaks down the mode and key for each song. There are 40 songs in total from the four albums. The Smiths' favorite key was A major (songs like Death of a Disco Dancer, Still Ill, and Paint a Vulgar Picture), followed by C# minor.

data %>%
  dplyr::group_by(key_name, mode_name) %>%
  dplyr::summarise(n = n())

I was also curious about the additional audio-feature variables provided by the SpotifyR package, so I analyzed their distributions across major and minor song modes.

library(gridExtra)

# Note: binwidth is a histogram parameter; geom_density() ignores it
p1 <- ggplot(data, aes(x = tempo, fill = mode_name)) +
  geom_density(alpha = .3) +
  xlim(0, 300) +
  ggtitle("Tempo")

p2 <- ggplot(data, aes(x = energy, fill = mode_name)) +
  geom_density(alpha = .3) +
  xlim(0, 1.5) +
  ggtitle("Energy")

p3 <- ggplot(data, aes(x = loudness, fill = mode_name)) +
  geom_density(alpha = .3) +
  xlim(-15, 0) +
  ggtitle("Loudness")

p4 <- ggplot(data, aes(x = danceability, fill = mode_name)) +
  geom_density(alpha = .3) +
  xlim(0, 1) +
  ggtitle("Danceability")

grid.arrange(p1, p2, p3, p4)

The SpotifyR package includes a time signature variable as well. Only 5 of the 40 songs from the four albums were written in a 3/4 time signature.

library(knitr)
library(kableExtra)

data %>%
  dplyr::select(time_signature, album_name, track_name, key_mode) %>%
  filter(time_signature == 3) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

With keys, modes, and time signatures gathered, I turned to sentiment analysis using the AFINN lexicon, which rates words on a polarity scale from -5 to +5.
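The join below assumes a tidy_lyrics data frame with one word per row, built with tidytext's unnest_tokens(). Here is a sketch on a toy one-line lyric; the lyrics_df columns are assumptions standing in for the real scraped lyrics:

```r
library(dplyr)
library(tidytext)

# Toy stand-in for the scraped lyrics: one row per lyric line
lyrics_df <- tibble(
  album        = "Strangeways, Here We Come",
  track_number = 1,
  lyric        = "Oh very nice very nice very nice but maybe in the next world"
)

# unnest_tokens() lowercases, strips punctuation, and yields one word per row,
# carrying the album and track_number columns along for later grouping
tidy_lyrics <- lyrics_df %>%
  unnest_tokens(word, lyric)
```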

afinn <- tidy_lyrics %>%
  inner_join(get_sentiments("afinn")) %>%
  group_by(album, track_number) %>%
  dplyr::summarise(sentiment = sum(value)) %>%
  mutate(method = "AFINN")

Aggregating the mean sentiment score over all songs on each album, Meat is Murder had the lowest overall sentiment score at -8.56. Meat is Murder is also the album with the most songs in a minor key, and it is more outspoken and political than the others. Take The Headmaster Ritual, with a sentiment score in the lowest quartile, where Morrissey tells of his experience as a student in school, referring to his teachers as "spineless swines" with "cemented minds".
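The album-level figure is just the mean of the per-song AFINN totals computed above. A sketch of that aggregation with made-up numbers (not the real scores):

```r
library(dplyr)

# Toy per-song AFINN totals, standing in for the `afinn` data frame
song_scores <- tibble(
  album     = c("Meat Is Murder", "Meat Is Murder", "The Queen is Dead"),
  sentiment = c(-20, 4, 10)
)

# One mean sentiment per album
album_means <- song_scores %>%
  group_by(album) %>%
  dplyr::summarise(mean_sentiment = mean(sentiment))
```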

However, look at the distribution of sentiment scores for songs within each album:

plot <- afinn %>%
  ggplot(aes(track_number, sentiment, fill = sentiment)) +
  geom_col() +
  facet_wrap(~album, ncol = 1, scales = "free_y") +
  theme(axis.text.x = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x.bottom = element_blank())

Using facet wrapping in ggplot2, we can see how sentiment changes as we progress through each album. Let's look more closely at the lyrics of the songs with a high positive sentiment. Death of a Disco Dancer, for instance, off the Strangeways album, has an AFINN score of +58.

Written in the key of A major, Morrissey sings:

Love, peace and harmony?

Oh, very nice very nice very nice,

but maybe in the next world…

And if you think peace

is a common goal,

that goes to show

how little you know

Here is the genius of the Smiths. The song scored high (+58) on AFINN sentiment and is written in a major key, but looking closer at the lyrics, we pick up on the satire. The sentiment algorithm correctly scored love, peace, harmony, and how very, very nice these things are, but failed to account for the classic Morrissey habit of ending a phrase with a derisive turn that reverses most of what he said a few lines before.

Another song that scored high on the AFINN sentiment scale is Rusholme Ruffians. One of the few songs in a major key off of the Meat is Murder album, Morrissey sings of a young boy and girl at a fair. Read more into the lyrics, however, and the positive AFINN sentiment score may not be so valid after all. In the backdrop of the fairgrounds:

Someone’s beaten up

Someone falls in love

And the senses being dulled are mine.

Unhappy Birthday, a song off of Strangeways, Here We Come, is written in the key of C Major. With a -27 sentiment score and mordant lyrics, the song does not fall in line with the bright emotions that the C major scale is usually known for:

I’ve come to wish you an unhappy birthday

’Cause you’re evil

And you lie

And if you should die

I may feel slightly sad

But I won’t cry
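One partial fix for this kind of reversal is the tidytext bigram approach: tokenize into word pairs and flag scored words preceded by a negator, so that "won't cry" no longer reads as simply negative. A sketch on the excerpt above, using a tiny hand-made stand-in lexicon so it runs without downloading AFINN:

```r
library(dplyr)
library(tidyr)
library(tidytext)

negators <- c("not", "no", "never", "won't")

# Tiny stand-in lexicon; the real analysis would join get_sentiments("afinn")
afinn_mini <- tibble(word = c("cry", "sad", "nice"), value = c(-1, -2, 3))

# Split the line into bigrams and keep scored words preceded by a negator
flagged <- tibble(line = "and if you should die I may feel slightly sad but I won't cry") %>%
  unnest_tokens(bigram, line, token = "ngrams", n = 2) %>%
  separate(bigram, c("word1", "word2"), sep = " ") %>%
  filter(word1 %in% negators) %>%
  inner_join(afinn_mini, by = c("word2" = "word"))
```

Here only "won't cry" is flagged; its score could then be flipped or dropped before summing, though satire like Rusholme Ruffians would still slip through.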

Finally, I visualized everything together. Using boxplots, I mapped the sentiment distributions for each mode onto each album, then mapped the key names onto each graph. The minor songs off the album The Queen is Dead fall in a smaller range of sentiments, and Strangeways may be the most diverse in its distribution of key, mode, and sentiment score.

plot <- ggplot(data, aes(x = mode_name, y = sentiment)) +
  geom_boxplot(width = .3) +
  facet_wrap(~album) +
  # constants like size and alpha belong outside aes(); mapping them
  # inside aes() created spurious legends that then had to be suppressed
  geom_point(aes(colour = key_name), size = 3, alpha = .8) +
  theme(axis.text.x.bottom = element_text(angle = 70, vjust = .5)) +
  guides(colour = guide_legend("Key"))

Conclusion

It is no surprise that the Smiths have a reputation for being melancholy. One study compared the songs of the Smiths to those of other popular British rock bands and confirmed that There is a Light that Never Goes Out is indeed one of the saddest songs of all time. When NLP models for lyric and music creation are inevitably let loose, I hope they do more than classify emotion as a simple positive/negative binary, and are instead trained with Morrissey's "tongue in cheek" lyricism in mind.

“The band happened because I walked home in the rain once too often”, Morrissey said. I hope that AI music bots can one day incorporate wry humor and irony, too. If not, we will be at risk of raising a generation of kids on algorithmically designed love songs in C major, rather than producing music with the quirky chord structure and dark humor of the Morrissey/Marr power team. After all, we’d hate for future rock fans to miss out on lines like this one:

“Nothing’s changed, I still love you… only slightly less than I used to, my love.”