“Men lie, women lie, numbers don’t” – Jay Z

Among the many things rappers like to boast about, some are relatively easy to quantify, like money, whereas rhyming skills are something that has been very difficult to measure – until now. In this post, I’ll present Raplyzer, a computer program which automatically detects rhymes from rap lyrics and which is used to rank popular rappers based on their average Rhyme factor. I’ll also present another program called BattleBot, which is a search engine for rhyming rap lines based on the algorithm used in Raplyzer.

Rap Rhyming 101

In rap lyrics, assonance, where words don’t necessarily have the same ending but share a vowel sound, is the most typical form of rhyming nowadays [1]. In multi-syllable rhymes (multis), it is not only the last syllable but multiple syllables that share a vowel sound. For example:

“This is a job – I get paid to sling some raps,

What you made last year was less than my income tax” [2]

As one author puts it: “Multis are hallmarks of all the dopest flows, and all the best rappers use them” [2].

Automatic Rhyme Detection

I’ve developed an algorithm called Raplyzer for detecting assonance rhymes. If you’re not interested in the technical details, you might want to skip directly to the results in the next section.

In order to detect rhymes, the key thing is to find matching vowel sound sequences. Unfortunately, vowel sounds can’t be trivially extracted from English text since words are not pronounced as they are written (as opposed to the Finnish language for which I originally developed Raplyzer). Luckily, there’s a great open source speech synthesizer, eSpeak, which can be used to obtain a phonetic transcription of the lyrics.
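As a rough sketch of this preprocessing step, the snippet below asks eSpeak for an IPA transcription and reduces each word to its vowel symbols. It is not taken from the Raplyzer source: the function names, the `en-us` voice, and the `IPA_VOWELS` set are assumptions made for illustration, and `transcribe` only works if the `espeak` command-line tool is installed.

```python
import subprocess

# IPA vowel symbols kept by this sketch (an assumed, non-exhaustive set).
IPA_VOWELS = set("aeiouyɑɐɒæəɚɛɜɝɪɔʊʌ")


def transcribe(text, voice="en-us"):
    """Return eSpeak's IPA transcription of `text` (requires the espeak CLI)."""
    result = subprocess.run(
        ["espeak", "-q", "--ipa", "-v", voice, text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def vowels_only(ipa_word):
    """Drop consonants, stress marks, and length marks; keep vowel symbols."""
    return "".join(ch for ch in ipa_word if ch in IPA_VOWELS)


def vowel_words(ipa_line):
    """Split a transcribed line into words and reduce each word to its vowels."""
    return [vowels_only(word) for word in ipa_line.split()]
```

Because every vowel symbol is kept separately, a diphthong such as "eɪ" contributes two vowels to the sequence.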

From the phonetic transcription, we remove everything but vowels and do the following:

1. Go through a song word by word.

2. For each word, find the longest matching vowel sequence that ends with one of the 15 previous words: start with the last vowels of the two words; if they are the same, proceed to the second-to-last vowels, then the third-to-last, and so on, ignoring word boundaries, until the first non-matching vowels are encountered.

3. Compute the average rhyme length (= Rhyme factor) by averaging the lengths of the longest matching vowel sequences of all words.
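The steps above can be sketched roughly as follows. This is a minimal illustration and not the actual Raplyzer code: it assumes each word has already been reduced to a string of vowel symbols, it caps a match so that the two rhyming spans never overlap (my assumption), and it leaves out the refinements discussed in the next paragraphs.

```python
from itertools import accumulate


def longest_match(vowels, end_a, end_b):
    """Length of the common vowel run ending at end_a and at end_b (end_a < end_b).

    The run is followed backwards across word boundaries and capped so that
    the two spans do not overlap.
    """
    limit = min(end_a, end_b - end_a)
    n = 0
    while n < limit and vowels[end_a - 1 - n] == vowels[end_b - 1 - n]:
        n += 1
    return n


def rhyme_factor(word_vowels, lookback=15):
    """Average, over all words, of the longest match with the previous words."""
    word_vowels = [w for w in word_vowels if w]  # skip words without vowels
    if not word_vowels:
        return 0.0
    vowels = "".join(word_vowels)
    # ends[i] is the index just past the last vowel of word i
    ends = list(accumulate(len(w) for w in word_vowels))
    total = 0
    for i in range(len(ends)):
        best = 0
        for j in range(max(0, i - lookback), i):
            best = max(best, longest_match(vowels, ends[j], ends[i]))
        total += best
    return total / len(ends)
```

For the toy input `["a", "ei", "o", "a", "ei"]`, the last word matches the second word over three vowels ("aei"), the fourth word matches the first over one vowel, and the remaining words match nothing.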

When finding the longest matching vowel sequence, we do not accept matches where any of the rhyming words are exactly the same, since typically some phrases, e.g., in the chorus, are repeated several times and these shouldn’t count as rhymes. Also, before running the phonetic transcription, we remove all duplicate lines from the text to normalize the lyrics, as in some cases the lyrics contain the chorus repeated many times, whereas in some cases they might just have “Chorus 4X”. And when matching vowels, we accept certain pairs of vowel phonemes that sound very similar (as specified here).
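A hedged sketch of these three refinements is given below. The helper names are hypothetical, and the pairs in `SIMILAR_VOWELS` are placeholders chosen for illustration; they are not the list of similar phonemes that Raplyzer actually uses.

```python
# Illustrative placeholder pairs only; the real list of similar vowel
# phonemes used by Raplyzer is not reproduced here.
SIMILAR_VOWELS = {frozenset("ɑɒ"), frozenset("əʌ")}


def drop_duplicate_lines(lyrics):
    """Keep the first occurrence of every line (case-insensitive), in order."""
    seen = set()
    kept = []
    for line in lyrics.splitlines():
        key = line.strip().lower()
        if not key or key in seen:
            continue  # skip blank lines and repeated lines such as a chorus
        seen.add(key)
        kept.append(line.strip())
    return "\n".join(kept)


def vowels_match(a, b):
    """True if two vowel symbols are identical or listed as similar."""
    return a == b or frozenset((a, b)) in SIMILAR_VOWELS


def may_rhyme(word_a, word_b):
    """Reject pairs where the two words are exactly the same word."""
    return word_a.lower() != word_b.lower()
```

In a fuller implementation, `vowels_match` would replace the plain equality test in the matching loop, and `may_rhyme` would be checked for each pair of words that a match spans.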

Most of the rhymes are typically located at the end of a line, but since it’s not always straightforward to infer line endings from the lyrics files, rhyme lengths are averaged over all words. This way we can capture not only end rhymes but also internal rhymes. On the other hand, the algorithm might detect some matching vowel sequences that are not intended as rhymes. In order to suppress the effect of such false positives, we only consider the rhymes that consist of at least two vowels.
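One way to read this threshold is sketched below. The text does not say whether shorter matches are counted as zero or left out of the average altogether; this sketch assumes the former, and the function name is hypothetical.

```python
def thresholded_rhyme_factor(match_lengths, min_len=2):
    """Average match length, counting matches shorter than min_len as zero.

    Assumption: short matches still count towards the number of words, so
    they pull the average down instead of being dropped from it.
    """
    if not match_lengths:
        return 0.0
    kept = sum(n for n in match_lengths if n >= min_len)
    return kept / len(match_lengths)
```

With per-word match lengths of 0, 1, 2, and 3, only the last two are counted, giving (2 + 3) / 4 = 1.25.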

For more details about how Raplyzer works, you might want to go directly to the source code which is freely available on GitHub or just ask me.

The Longest Multis Are Written by…

I scraped the lyrics of 94 rap artists from a lyrics website. Intro, Outro, Skit, and Interlude tracks were filtered out, leaving me with a total of 10,082 songs. For each artist, I computed the Rhyme factor averaged over all the songs of the artist. The FULL RESULTS can be found HERE. In the table below, I’ve listed the top 5 and a selection of other artists that are the most familiar to me.

Rank  Artist                 Rhyme factor
1.    Inspectah Deck         1.187
2.    Rakim                  1.180
3.    Redrama                1.168
4.    Shai Linne             1.152
5.    Earl Sweatshirt        1.152
9.    Paleface               1.132
10.   Tech N9ne              1.127
26.   Jedi Mind Tricks       1.067
27.   Wiz Khalifa            1.062
28.   T.I.                   1.062
30.   The Notorious B.I.G.   1.059
31.   Lil Wayne              1.056
32.   Nicki Minaj            1.056
33.   2Pac                   1.054
34.   Xzibit                 1.053
35.   Aesop Rock             1.052
39.   Eminem                 1.047
40.   Nas                    1.043
43.   The Game               1.041
47.   Lecrae                 1.028
50.   Jay-Z                  1.026
56.   Diddy                  1.017
58.   50 Cent                1.013
63.   Wu-Tang Clan           1.002
64.   Kanye West             1.002
65.   Drake                  0.995
76.   DMX                    0.967
77.   Snoop Dogg             0.967
78.   Dr. Dre                0.966
79.   Pitbull                0.964
85.   Shakespeare            0.952
86.   Outkast                0.951
90.   Ice Cube               0.927
92.   will.i.am              0.923
94.   The Lonely Island      0.870
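The per-artist aggregation behind this table could look roughly like the sketch below. It assumes the per-song Rhyme factors are already available in a dictionary keyed by song title, and that non-song tracks can be recognized from their titles; both the data layout and the title pattern are my assumptions.

```python
import re
from statistics import mean

# Titles containing any of these words are treated as non-song tracks.
SKIP_TITLE = re.compile(r"\b(intro|outro|skit|interlude)\b", re.IGNORECASE)


def artist_rhyme_factor(song_scores):
    """Average the per-song Rhyme factors of one artist.

    song_scores maps song title -> Rhyme factor of that song.
    Returns None if no song is left after filtering.
    """
    kept = [
        score for title, score in song_scores.items()
        if not SKIP_TITLE.search(title)
    ]
    return mean(kept) if kept else None
```

Each song contributes equally to the artist's score here, regardless of how many words it contains.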

Some of the results are not too surprising; for instance Rakim, who is #2, is known for “his pioneering use of internal rhymes and multisyllabic rhymes” according to Wikipedia. Similarly, Inspectah Deck (#1) from Wu-Tang Clan uses lots of multis in his lyrics. As a benchmark, I took all the poems by William Shakespeare and computed Shakespeare’s Rhyme factor (0.952). He falls way behind the majority of rappers, which is understandable since multis are not commonly used in poetry.

Here are some of the longest multis detected by Raplyzer (rhyming part is shown in boldface, diphthongs are counted as two distinct vowels):

Tech N9ne – It’s Alive: “Six six triple eight forty-six ninety-nine three / Sick with nickel plates whorry chicks mighty mine be” (15 rhyming vowels)

Shai Linne – Solus Christus: “My vision is clear, my eyesight more vivid / I’m commissioned here by Christ, I’m salt in it” (13 rhyming vowels)

MF Doom – Born Like This: “Dimes quiet as minds by design, mighty fine / Slight rewind, tightly bind, blind lead blind” (12 rhyming vowels)

Rakim – I Know: “You gone love this, it’s marvelous, baby / It gotta thug’s twist-it start to get crazy” (9 rhyming vowels)

One should note that Raplyzer assumes a typical American English pronunciation of the words (as defined by the eSpeak software), so some artists who use a lot of multis but often construct them by bending words are not as high on the list as one might expect. One example of such an artist is Eminem who, according to Stat Quo, “makes words rhyme that typically don’t rhyme together – he’s good at that. It’s about how he pronounces it” [3].

Another way of analyzing rap lyrics computationally is to estimate the size of artists’ vocabularies, as famously done by Matt Daniels [4]. In order to get a more holistic picture of the rhyming skills of different rappers, I computed both the Rhyme factor and the vocabulary size and plotted the results. Instead of the first 35,000 words, I used the first 20,000 words for the vocabulary size estimation in order to be able to include more artists, which seemed to have little effect on the order of the artists.
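The vocabulary estimate amounts to counting the distinct words among an artist's first N words. A minimal sketch follows, assuming the lyrics are already tokenized into a list of words and that case is ignored; the tokenization and the handling of artists with too few words are my assumptions.

```python
def vocabulary_size(words, first_n=20_000):
    """Number of distinct words among the first `first_n` words.

    Returns None for artists with fewer than `first_n` words, since their
    counts would not be comparable with those of the other artists.
    """
    if len(words) < first_n:
        return None
    return len({word.lower() for word in words[:first_n]})
```

Lowering `first_n` from 35,000 to 20,000 lets artists with smaller catalogs pass the length check, which is why more of them can be included.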

One interesting point made by Daniels is that Jay Z contrasts his lyrical skills with Common and Talib Kweli in his track “Moment of Clarity” saying that he has had to dumb down his lyrics to double his dollars. Daniels points out that both Common and Kweli rank higher on the vocabulary scale, and interestingly, this also holds for the Rhyme factor scale.

“They said I rap like a robot, so call me rap-bot”

To further demonstrate that the algorithm described above is able to find meaningful rhymes and to let people play with it, I decided to create a website called BattleBot with the help of my friend Stephen Fenech, who took care of the web programming. On this site you can “spit” any line that comes to your mind and BattleBot will respond with a list of the best rhyming lines found among the half a million lines from the 94 artists I’ve analyzed for this post. Check it out at:

http://emalmi.kapsi.fi/battlebot/
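How BattleBot ranks its candidates is not described in this post, so the following is only a guess at the general idea: score every stored line by how many vowels its ending shares with the ending of the query line, and return the highest-scoring lines. The names and the corpus layout are hypothetical.

```python
def common_suffix_len(a, b):
    """Number of trailing vowel symbols shared by two vowel strings."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def best_rhymes(query_vowels, corpus, top_k=3):
    """Return up to top_k (score, line) pairs, best rhymes first.

    corpus is a list of (line_text, vowel_string) pairs; lines that share
    no trailing vowel with the query are dropped.
    """
    scored = [
        (common_suffix_len(query_vowels, vowels), text)
        for text, vowels in corpus
    ]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: -pair[0])  # stable: ties keep corpus order
    return scored[:top_k]
```

A linear scan like this would be slow over half a million lines; a real search engine would more likely index the lines by their trailing vowels.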

Final Words

What sets good rap lyrics apart from the rest is not only how well they are written technically. A good rapper can simultaneously write complex multisyllable rhymes and tell a coherent story. Furthermore, he or she may spice up the lyrics with some clever wordplay and metaphors that make the listener think or even laugh.

BattleBot exemplifies the fact that it’s possible to come up with lines that form a technically good rhyme but are otherwise totally unrelated. However, rappers beware! I’m currently working on an improved version of BattleBot which tries to find lines that both rhyme well and are semantically similar 😉

Acknowledgements:

While doing this analysis I’ve discovered several great artists that were previously unknown to me and also learned a lot about rap. I’d like to thank Joonas “Skandaali” Palmgren and Tommi Terä for teaching me many new things about rhyming. I would also like to thank Jouni Harjumäki for linguistic consultation, Niki Paajala, who suggested several key artists I had overlooked in my initial analyses, and the many other people with whom I have had the chance to talk about this project and who gave me lots of useful ideas.

References:

[1] http://genius.com/posts/24-Rap-genius-university-rhyme-types

[2] http://www.flocabulary.com/multies/

[3] Edwards, P. How to Rap: The Art and Science of the Hip-Hop MC, 2009.

[4] http://rappers.mdaniels.com.s3-website-us-east-1.amazonaws.com/