Finally, data science has begun tackling rap. It makes sense, because rap is a pretty good subject for algorithms to latch onto: lyrics are a dense data set, analysts have a lot of words to work with, and songs are heavy on allusions and references that make for fascinating connections.

In 2014, Matt Daniels ranked rappers by the breadth of their vocabularies (Aesop Rock and GZA took first and second place, respectively). Now, Eric Malmi, a Finnish doctoral candidate, has looked at something more integral to rap: rhymes.

Assonance is a great way to judge rhyming in rap

Malmi analyzed popular rap lyrics using something called assonant rhymes. This might be the best way to judge rap algorithmically.

Without getting too deep into the phonetic weeds, an assonant rhyme is one where the vowel sounds match, but the consonants may or may not. These rhymes can happen inside one line of rap or between multiple lines. Because it's such a flexible measure of rhyme, it's a great way to quantify the overall "flow" of a lyric.

It's also a better measure than looking at just vocabulary or rhymes at the end of a line. For example, this rap has a simple end rhyme, but it's probably similar to your own horrible freestyles in the shower:

My name is Joe, I walk down the street,
And now I'm going to look at my feet.

Yes, "street" and "feet" rhyme, but the flow leaves something to be desired. Even though it has a solid end rhyme, where both the consonants and vowels rhyme, the rest of the lyric is weak.

Now, look at a couplet from Rakim that Malmi singles out as particularly impressive:

You gone love this, it’s marvelous, baby
It gotta thug’s twist-it start to get crazy

Assonant rhymes are a great metric for analyzing flow. "Baby" and "crazy" don't rhyme the same way "street" and "feet" do in your amateur freestyle, but their vowels do, as do the vowels in "this" and "twist," "marvelous" and "start to get," and "love" and "thug." By capturing internal rhymes, end rhymes, and multi-rhymes (multisyllabic rhymes that span words or lines), assonance serves as a good proxy for the flow that emerges when a rap is performed.
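To make the distinction concrete, here is a minimal Python sketch of the difference between a full rhyme and an assonant rhyme. It is not Malmi's code. The tiny lexicon is hand-transcribed in ARPAbet-style symbols, the function names are invented for illustration, and treating "everything from the first vowel onward" as the rhyming part is a simplification that only holds for short words like these.

```python
# Toy illustration only: hand-written, ARPAbet-style transcriptions (an assumption,
# not output from any real text-to-speech system).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

TOY_LEXICON = {
    "street": ["S", "T", "R", "IY", "T"],
    "feet":   ["F", "IY", "T"],
    "baby":   ["B", "EY", "B", "IY"],
    "crazy":  ["K", "R", "EY", "Z", "IY"],
    "this":   ["DH", "IH", "S"],
    "twist":  ["T", "W", "IH", "S", "T"],
    "love":   ["L", "AH", "V"],
    "thug":   ["TH", "AH", "G"],
}


def vowels(word):
    """Return only the vowel sounds of a word, in order."""
    return [p for p in TOY_LEXICON[word] if p in VOWELS]


def tail(word):
    """Return every sound from the first vowel onward (simplified rhyming part)."""
    phones = TOY_LEXICON[word]
    first = next(i for i, p in enumerate(phones) if p in VOWELS)
    return phones[first:]


def is_assonant(a, b):
    """Vowels match; consonants are ignored."""
    return vowels(a) == vowels(b)


def is_full_rhyme(a, b):
    """Vowels and the consonants after them both match."""
    return tail(a) == tail(b)


for a, b in [("street", "feet"), ("baby", "crazy"), ("this", "twist"), ("love", "thug")]:
    print(a, b, "full:", is_full_rhyme(a, b), "assonant:", is_assonant(a, b))
```

Under these assumptions, only "street" and "feet" count as a full rhyme, while all four pairs count as assonant, which is the flexibility the measure is prized for.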

So Malmi analyzed those rhymes instead of everything else about the lyrics. He fed thousands of lyrics through a text-to-speech reader and got a phonetic analysis of each line. Then, he found the number of assonant rhymes in the phonetic text (while trying to control for weirdness, like choruses and repeats). Phonetic text is surprisingly accurate when it comes to showing rhymes — you can see for yourself when you try Malmi's tool for testing his rhyme engine. His Battle-Bot lets you put in your own poetry and spits back rhymes, and it does a pretty good job.
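The steps above can be sketched roughly in Python. This is a toy under stated assumptions, not the actual rhyme factor, whose formula isn't given here. It assumes each line has already been turned into a list of vowel sounds (the text-to-speech step is skipped), it drops exactly repeated lines as a crude stand-in for controlling for choruses, and it scores only how many trailing vowels each line shares with the line before it. The function names and the hand-transcribed line endings are invented for illustration.

```python
# Toy score only: NOT Malmi's rhyme factor. Inputs are hand-transcribed vowel
# sequences (an assumption); a real pipeline would get them from a phonetic analysis.

def common_suffix_len(a, b):
    """Count how many trailing vowel sounds two lines share."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n


def drop_repeats(lines):
    """Keep the first occurrence of each line (crude chorus/repeat control)."""
    seen = set()
    unique = []
    for line in lines:
        key = tuple(line)
        if key not in seen:
            seen.add(key)
            unique.append(list(line))
    return unique


def toy_rhyme_score(lines):
    """Average number of matching end vowels between consecutive lines."""
    lines = drop_repeats(lines)
    if len(lines) < 2:
        return 0.0
    matches = [common_suffix_len(prev, cur) for prev, cur in zip(lines, lines[1:])]
    return sum(matches) / len(matches)


# Vowels of the line endings only, transcribed by hand (approximate).
joe = [["AH", "IY"],                      # "...the street"
       ["AY", "IY"]]                      # "...my feet"
rakim = [["AA", "AH", "AH", "EY", "IY"],  # "...marvelous, baby"
         ["AA", "AH", "EH", "EY", "IY"]]  # "...start to get crazy"

print(toy_rhyme_score(joe))
print(toy_rhyme_score(rakim))
```

Even this crude version rates the Rakim ending above the shower freestyle, because two vowels line up at the end instead of one. It looks only at line endings, so it misses the internal rhymes that the real analysis counts.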

Who is the best rapper?

Inspectah Deck, familiar from Wu-Tang Clan and his own solo efforts, ranked highest. His "rhyme factor" of 1.187 beat out the rest of the top five: Rakim, Redrama, Shai Linne, and Earl Sweatshirt. You can see the full data set here. The top fifteen are (with rhyme factor in parentheses):

1. Inspectah Deck (1.187)
2. Rakim (1.180)
3. Redrama (1.168)
4. Shai Linne (1.152)
5. Earl Sweatshirt (1.152)
6. AZ (1.144)
7. Chief Keef (1.144)
8. ASAP Rocky (1.132)
9. Paleface (1.132)
10. Tech N9ne (1.127)
11. Kool G Rap (1.123)
12. MF Doom (1.120)
13. Slaughterhouse (1.117)
14. Sage Francis (1.115)
15. Elzhi (1.105)

For good measure, Malmi graphed the rappers' rhyme factors against the mix of words in their vocabulary:

So is Vanilla Ice better than Shakespeare?

For fun, Malmi fed William Shakespeare's poetry into the algorithm as well, and his "rhyme factor" ended up being lower than rappers including Pitbull, Xzibit, and, yes, Vanilla Ice.

Does this mean Vanilla Ice is a better poet than Shakespeare? Not really. It's an apples-and-oranges comparison, because Shakespeare wasn't focused on maximizing internal or assonant rhymes, but on fitting an entirely different meter and rhyme scheme.

In a way, that comparison highlights the shortcomings of data-based rapper-ranking. Aesthetics isn't easily ranked, no matter how good the data set. Yes, Earl Sweatshirt has a higher "rhyme factor" than his Odd Future colleague Tyler the Creator, but that doesn't mean they were trying to hit the same artistic beats. Few people would rank Cypress Hill above Ice Cube, even though the rhyme factor does. It's proof that the science can only go so far when it comes to art.

However, just because the "rhyme factor" is an imperfect tool doesn't mean it's a useless one. Rating rhymes shows the evolution of musical styles from the playground-simple Sugarhill Gang to the ornate rhymes of Tech N9ne. It also gives lyrically complex rappers their due. Even the engine's shortcomings can teach us about rap: the text-to-speech phonetic analysis fails to measure the performative aspects of rap, like how rappers bend words into rhyming through sheer force of will (for example, with unusual pronunciation or accents).