I Miss The Old Kanye?
Evan Eskew

What follows is my best shot at this analysis. Major credit to Julia Silge here. My work followed pretty easily from the code she had already written. In the parlance of hip hop, I sampled her code and remixed it a little. If you don’t care about how I actually did the analysis in R, you can skip The Geek Stuff section and head right along to The Rap Stuff.

In the same way that Julia looked at Jane Austen’s entire body of work, I thought it would be interesting to analyze one musical artist’s whole commercial output. Within hip hop, Kanye seems like a natural choice. He’s got seven albums out, which I’d argue encompass greater musical diversity than the catalog of probably any other recent popular musician. Some of his later albums, particularly 808s & Heartbreak and Yeezus, have been polarizing, even for fans of his early stuff. In short, his catalog seems ripe for a sentiment analysis. Would such an analysis reveal major differences in feeling across his career arc? Is there really an “old Kanye” to miss, lyrically speaking?

This whole thing started after I read an excellent blog post by Julia Silge. You should definitely check it out, but the short version is that she used language processing tools within R to analyze the sentiment in Jane Austen’s novels. This led me to ask: can I do this with rap albums? I didn’t learn R coding just to write a dissertation, right?!

This entire analysis can be done with surprisingly few R packages. To summarize, I’m using dplyr for general data manipulation, rvest for web scraping, stringr for manipulation of lyrics once I get them into R, syuzhet to conduct the sentiment analysis, and ggplot2 and png to create the plots. If you want to know more than that, check out all the code on the GitHub repository.

Again, I followed Julia’s example here. I thought her visuals using ggplot2 looked really nice, so I just modified her code slightly. Whereas Julia was analyzing novel-length texts, my texts had clear dividing points since an album is made up of discrete tracks. Thus, I added some visual elements to the plots to help distinguish between tracks within an album.
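To make the idea of track dividers concrete, here is a minimal ggplot2 sketch. It is not the code from the repository: the data frame, its column names (index, sentiment, track), and the scores themselves are all made up for illustration, and coloring bars by sign is just one way to set positive and negative values apart.

```r
library(ggplot2)

# Made-up scores for three hypothetical tracks, four text chunks each
df <- data.frame(
  index = 1:12,
  sentiment = c(2, -1, -3, 1, 4, -2, -5, -1, 0, 3, 1, -2),
  track = rep(c("Track 1", "Track 2", "Track 3"), each = 4),
  stringsAsFactors = FALSE
)

# Label each chunk by the sign of its score so it can drive the fill color
df$sign <- ifelse(df$sentiment >= 0, "positive", "negative")

# A track boundary sits halfway between two neighboring chunks
# whose track labels differ
breaks <- which(df$track[-1] != df$track[-nrow(df)]) + 0.5

p <- ggplot(df, aes(x = index, y = sentiment, fill = sign)) +
  geom_col() +
  geom_vline(xintercept = breaks, linetype = "dotted") +
  scale_fill_manual(values = c(positive = "darkgreen", negative = "red")) +
  labs(x = "Position in album", y = "Sentiment score") +
  theme_minimal()

print(p)
```

Placing the dividers at half-integer positions keeps each dotted line between bars rather than on top of one.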

For the sentiment analysis, I just followed Julia’s lead. Major disclaimer: I know very little about sentiment analysis. My basic understanding is that it’s used to quantify the subjective feeling in a text. Julia has a nice comparison of metrics that might be used to compute sentiment scores. She settled on the bing method from the package syuzhet since it did not seem to be directionally biased or overly variable. Good enough for her, good enough for me.
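For a feel of what the scoring step looks like, here is a small sketch using syuzhet’s get_sentiment() with method = "bing". The three text lines are invented for illustration and are not lyrics from any album; as I understand it, the bing lexicon tags words as positive or negative, so a line’s score is roughly its positive words minus its negative words.

```r
library(syuzhet)

# Invented example lines (not real lyrics)
lines <- c(
  "I love this beautiful day",
  "This is a terrible, ugly mess",
  "We went to the store"
)

# One numeric score per element of the character vector
scores <- get_sentiment(lines, method = "bing")

# Pair each line with its score for easy inspection
data.frame(line = lines, sentiment = scores)
```

Lines with no lexicon words should score zero, which is one reason slang-heavy text may come out flatter than it feels.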

To get the lyrics I wanted, the best source seemed pretty obvious to me: the website Genius (formerly Rap Genius). If you haven’t browsed Genius, you should. It’s essentially a wiki-type site dedicated to song lyrics. Contributors both transcribe the lyrics and annotate them for meaning. For my purposes, I just needed to get the lyrics out of the webpages and into R, preferably for whole albums at once. To do this, I used the R package rvest. It was my first time working with it, but it helped me scrape lyrics off the web quickly and easily. I wrote a few R functions that allowed me to simply input the Genius page of the album I was interested in and process the entire album’s worth of lyrics.
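The general shape of that scraping step can be sketched with rvest and stringr. This is only an illustration, not the functions from the repository: the HTML below is a made-up stand-in parsed from a string, and the "lyrics" class is an assumption, so the selector would need to be adapted to whatever markup the real pages use.

```r
library(rvest)
library(stringr)

# Made-up markup standing in for a lyrics page (real pages will differ)
html <- paste0(
  '<html><body><div class="lyrics">',
  '<p>[Verse 1]</p><p>First line </p><p>Second line</p>',
  '</div></body></html>'
)

page <- read_html(html)

# Pull the text of every paragraph inside the (hypothetical) lyrics container
lyric_lines <- page %>%
  html_elements(".lyrics p") %>%
  html_text()

# Tidy up whitespace, then drop section labels such as [Verse 1]
lyric_lines <- str_trim(lyric_lines)
lyric_lines <- lyric_lines[!str_detect(lyric_lines, "^\\[.*\\]$")]

lyric_lines
```

For a live page, read_html() takes a URL in place of the HTML string, and the same pipeline can be wrapped in a function and applied to each track on an album.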

The first step was to get the lyrics for the albums that I wanted to analyze. Julia already had this solved for her work because she had the text of Austen’s novels packaged up nice and neat. I was starting from scratch.

The Rap Stuff

Ok, let’s see some results! But first, one revelation: I lied. This project is not just about Kanye. I wanted to “ground truth” the sentiment analysis method with some other popular hip hop albums (and just because I was interested). The first album that came to mind was Kendrick Lamar’s To Pimp A Butterfly (TPAB), so that’s what I did first:

The resulting sentiment plot is very similar to what Julia produced in her analyses, except I’ve used vertical dotted lines to delineate tracks within an album. I also plotted positive sentiments in green and negative sentiments in red just to make the difference more obvious. I think this is also a good time to mention that I really have no idea how well this bing sentiment method may or may not be doing in analyzing a hip hop album. The Genius user community certainly does a very thorough job of transcribing lyrics accurately, but I would imagine that much of the content of these albums is quite unusual for such an analysis. To give you an idea, here are a few bars from my personal favorite on TPAB, “Momma”:

## [1] "I know everything"
## [2] "I know everything, know myself"
## [3] "I know morality, spirituality, good and bad health"
## [4] "I know fatality might haunt you"
## [5] "I know everything, I know Compton"
## [6] "I know street shit, I know shit that's conscious"
## [7] "I know everything, I know lawyers, advertisement and sponsors"
## [8] "I know wisdom, I know bad religion, I know good karma"

If you’ve heard any of the albums I’ll be analyzing, you know it gets a lot more colorful than that. So while these sentiment analysis methods are capable of giving us “answers”, I’d take everything with a grain of salt.

For this album, I think the analysis confirms what many listeners would intuit: the general mood is pretty negative. The tracks with the most negative scores include “King Kunta,” “u,” “Hood Politics,” and “The Blacker The Berry.” These tracks highlight one of the interesting things about sentiment analysis applied to music lyrics: sometimes tracks that are musically dark also have negative lyrics, but not always. For example, “u” is a pretty devastating reflection on (a lack of) self-worth. I would expect it to have a strongly negative sentiment, which it does. And the music matches this content: it sounds like a negative song. In contrast, “Hood Politics” is decidedly more upbeat musically yet has a negative lyrical sentiment according to the metrics. So keep that potential tension between music and lyrics in mind.

Since I did a Kendrick album, I felt obligated to give Drake some attention too. President Obama may have given Kendrick the nod, but what does the sentiment tell us? A lot of people would vote Take Care as Drake’s best album to this point, so let’s plot that one:

Given that Drake has self-identified as being all in his feelings, it was a bit of a surprise to me that Take Care was actually less negative than TPAB. TPAB has five songs that dip down below -5 on the sentiment scale. Take Care has only one: “Take Care.” While Take Care’s tracks don’t get as negative as Kendrick’s on TPAB, they’re still negative. In fact, the album starts with a run of seven pretty solidly negative songs: “Over My Dead Body” through “Buried Alive (Interlude).” In contrast, there are positive peaks in tracks like “Make Me Proud” and “The Real Her.”
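A count like “songs that dip below -5” is easy to reproduce with dplyr once the scores sit in a data frame. The sketch below uses invented track names and scores, not the real album data, and assumes one row per scored chunk of lyrics.

```r
library(dplyr)

# Invented scores: three hypothetical tracks, three chunks each
scores <- data.frame(
  track = rep(c("Track A", "Track B", "Track C"), each = 3),
  sentiment = c(-2, -6, 1, 3, -1, 2, -7, -5, 0),
  stringsAsFactors = FALSE
)

# Find each track's lowest score, then keep tracks that dip below -5
track_lows <- scores %>%
  group_by(track) %>%
  summarise(lowest = min(sentiment)) %>%
  filter(lowest < -5)

track_lows
nrow(track_lows)  # number of tracks that dip below the threshold
```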

At this point I was just having fun with it, so I decided I also had to do two inarguable classics of the genre: Nas’ Illmatic and Jay-Z’s The Blueprint.

Well, yeah, that makes sense. Illmatic is pretty unremittingly bleak. The album is an artistic masterpiece and arguably the best ever in hip hop. But it’s definitely not happy. Ironically, the song with the most negative section is “One Love.” Then again, the song is written to someone who’s locked up.

The Blueprint is a much more balanced album overall. Its contours are similar in a way to Take Care: never too negative and some noticeable peaks in positivity. The sentiment analysis largely gels with my expectations. I think songs with very negative sentiment scores like “Takeover,” “Song Cry” (Jay must really be feeling this one after LEMONADE), and the Eminem-produced “Renegade” would certainly sound that way to your average listener.