For many people, listening to music elicits such an emotional response that the idea of dredging it for statistics and structure can seem odd or even misguided. But knowing these patterns can give one a deeper, more fundamental sense of how music works; for me, this makes listening to music a lot more interesting. And if you play an instrument or want to write songs, being aware of these patterns is of great practical importance.

In this article, we’ll look at statistics gathered from 1300 choruses, verses, and other sections of popular songs to answer a few basic questions. First we’ll look at the relative popularity of different chords, based on how often they appear in the chord progressions of popular music. Then we’ll begin to look at the relationships that different chords have with one another. For example, if a chord is found in a song, what can we say about the probability of each chord that might come after it?

The Database

To make quantitative statements about music you need data, and lots of it. Guitar tab websites have tons of information about the chord progressions that songs use, but the quality is not very high. Just as important, the information is not in a format suitable for gathering statistics. So, over the past two years we’ve been slowly and painstakingly building up a database of songs taken mainly from the Billboard 100, analyzing them one at a time. At the moment the database has over 1300 entries indexed. The genre of the songs and where they come from matter: this is an analysis of mainly “popular” music, not jazz or classical, so the results are not meant to be treated as universal. If you’re interested, you can check out the database here. The entries contain raw information about the chords and melody, and leave out information about the arrangement and instrumentation.

We can use the information in the song database to answer all sorts of questions. In this introductory post, I’ll look at a few interesting preliminary results, but we invite you to propose your own questions in the comments at the end of the article.

Let’s get started.

Are some chords more commonly used than others?

This seems like such a basic question, but the answer doesn’t actually tell us much, because songs are written in different keys. A song written in the key of C♯ will have lots of C♯ chords in it, while a song written in G will probably have lots of G chords. That G chords are more popular than C♯ chords is likely only a reflection of the fact that they’re easier to play on the guitar and piano. So instead of answering this meaningless question, I’ll answer a slightly more interesting one: what keys are most popular for the songs in the database?

C (and its relative minor, A) is the most common key by far. After that there is a general trend favoring key signatures with fewer sharps and flats, but this is not universal. E♭, with three flats, for instance, is slightly (though not statistically significantly) more common than F, with only one flat. B♭ has only two flats but is way at the end of the popularity scale, with only 4% of songs using it as the key.

What are the most common chords? Part 2

It’s much more interesting to look at songs written in a single common key. That way direct comparisons are possible and more illuminating. We transposed every song in the database to the key of C to make them directly comparable. Then we counted the number of chord progressions that contained a given chord. Below we’ve plotted the relative frequency with which different chords occurred, in descending order.

As expected, C major is a very common chord for songs written in C (it’s the I chord in Roman numeral or Nashville Number notation), but F major and G major (the IV and V respectively) are used just as often. Interestingly, F and G actually show up in more chord progressions than C! C major is the tonal center and one might expect it to be ubiquitous, but it turns out to be pretty common to omit this chord in some sections of a song for effect. “My Heart Will Go On” by Celine Dion is one of many examples in the database that exhibit this behavior. Clicking on the above link will take you to the song’s entry in the database and show you that of the two sections that were analyzed (the chorus, and the verse), only one contains a C.

The A minor chord is the next most popular, but after that there is a significant drop-off in use. If you’ve ever heard someone complain about the “four chord pop song”, this is what they are talking about.
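For readers who want to try this kind of count themselves, here is a minimal sketch of the two steps described above: transpose each section to C, then count how many progressions contain each chord. It is not the code behind the database. The function names, the chord spelling ("Am", "Bdim", "F#"), and the toy data are all assumptions made for illustration, and it only handles sections labeled with a major key.

```python
from collections import Counter

# Pitch classes for note names; sharps and flats share a number.
NOTE_TO_PC = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4,
              "F": 5, "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9,
              "A#": 10, "Bb": 10, "B": 11}
# Spellings used when writing chords back out in the key of C (an assumption).
PC_TO_NOTE = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def split_chord(chord):
    """Split a chord name such as 'F#m' into root 'F#' and quality 'm'."""
    n = 2 if len(chord) > 1 and chord[1] in "#b" else 1
    return chord[:n], chord[n:]


def transpose_to_c(chord, key):
    """Shift a chord from the given major key into C major."""
    root, quality = split_chord(chord)
    pc = (NOTE_TO_PC[root] - NOTE_TO_PC[key]) % 12
    return PC_TO_NOTE[pc] + quality


def chord_popularity(sections):
    """Fraction of sections whose progression contains each chord."""
    counts = Counter()
    for key, progression in sections:
        # Use a set so a chord counts once per progression.
        counts.update({transpose_to_c(ch, key) for ch in progression})
    return {chord: n / len(sections) for chord, n in counts.items()}


# Made-up toy data: (major key, chord progression) for each song section.
sections = [
    ("C", ["C", "G", "Am", "F"]),
    ("G", ["G", "D", "Em", "C"]),
    ("D", ["G", "A", "Bm", "G"]),
]
popularity = chord_popularity(sections)
```

In this toy data the third section never touches its own I chord, so once everything is transposed, F, G, and Am each appear in all three progressions while C appears in only two. That is the same kind of effect as the “My Heart Will Go On” example, just on invented numbers.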

Is there a reasonable explanation for the relative popularity of certain chords?

Why are A minor chords so popular but A major chords practically nonexistent? There won’t always be easy answers, but in this case the results can easily be explained with some basic music theory. A discussion of this is beyond the scope of this post, but we’ll definitely explore the music theory behind it in future articles.

Even if you don’t know the music theory behind this yet, there is a lot of practical information to take away. If your song is written in C and you want it to sound good, you probably shouldn’t use any A major chords unless you really know what you’re doing. Sticking with A minor, for example, is the safer choice.

The team over at Apple, Inc. evidently knows its music theory. Their latest version of GarageBand lets you play with “Smart Instruments” that “make you sound like an expert musician… even if you’ve never played a note before.”

I’m skeptical of their claims, but look at the chords they’ve chosen for these “Smart Instruments”:

Don’t those chords look familiar? Based on what our database is showing, I might suggest some small changes.

In particular, Bdim, while diatonic in C, is much less common than some other chords, like D and E. Perhaps in the next version of GarageBand, Apple will fix this (they really should).

However, overall Apple is making good choices for the chords that the average “garage band musician” might want to start with.
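If “diatonic in C” is an unfamiliar phrase, it just means the chord is built only from notes of the C major scale. The sketch below stacks thirds on each scale degree to list the seven diatonic triads of a major key. The function name and the note spellings are my own choices for illustration, and the spellings are only guaranteed to look right for C; other keys may come out with an enharmonic name (Db where a musician would write C♯, for instance).

```python
NOTES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic
# Triad quality, keyed by the intervals (in semitones) above the root.
QUALITY = {(4, 7): "", (3, 7): "m", (3, 6): "dim"}


def diatonic_triads(tonic="C"):
    """Build the seven triads that use only notes of the major scale."""
    start = NOTES.index(tonic)
    scale = [(start + step) % 12 for step in MAJOR_SCALE_STEPS]
    chords = []
    for i, root in enumerate(scale):
        # Stack thirds: skip one scale note for the third, one more for the fifth.
        third = scale[(i + 2) % 7]
        fifth = scale[(i + 4) % 7]
        intervals = ((third - root) % 12, (fifth - root) % 12)
        chords.append(NOTES[root] + QUALITY[intervals])
    return chords


triads = diatonic_triads("C")
# triads == ["C", "Dm", "Em", "F", "G", "Am", "Bdim"]
```

So Bdim belongs to the key on paper, even though the database suggests it is rarely used in practice.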

If a song happens to use a particular chord, what chord is most likely to come next?

The previous question took an overall look at the relative popularity of different chords, but we can also look at the relationship that different chords have to one another. For example, a great question to ask is, if a song happens to use a particular chord, what chord is most likely to come next? Is it random, or will certain chords sound better than others and thus be more likely to show up in the popular songs that make up our database?

There are a lot of relationships to analyze, but we’ll start off by looking at just one for now: for songs written in C, what chords are most likely to come after an E minor chord? The relative popularity of each possible next chord is shown below:

This result is striking. If you write a song in C with an E minor in it, you should probably think very hard before following the E minor with anything other than A minor or F major. For the songs in the database, 93% of the time one of these two chords came next!
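The “what comes next” question boils down to counting pairs of adjacent chords and dividing by the total for each starting chord. Here is a rough sketch of that idea, assuming the progressions have already been transposed to C. The function name and the toy progressions are invented for illustration, and skipping a chord that simply repeats is my own assumption, not necessarily how the database analysis treats it.

```python
from collections import Counter, defaultdict


def next_chord_probabilities(progressions):
    """Estimate P(next chord | current chord) from adjacent-chord counts."""
    transitions = defaultdict(Counter)
    for progression in progressions:
        # Pair each chord with the one that follows it.
        for current, following in zip(progression, progression[1:]):
            if following != current:  # assumption: ignore a chord repeated in place
                transitions[current][following] += 1
    return {
        current: {ch: n / sum(counter.values()) for ch, n in counter.items()}
        for current, counter in transitions.items()
    }


# Made-up toy progressions, already in the key of C.
progressions = [
    ["C", "Em", "Am", "F"],
    ["C", "Em", "F", "G"],
    ["Am", "Em", "F", "C"],
    ["C", "G", "Em", "Am"],
]
probs = next_chord_probabilities(progressions)
after_em = probs["Em"]
# after_em == {"Am": 0.5, "F": 0.5}
```

The toy numbers mean nothing by themselves; the point is that the same few lines, run over real progressions, would give you a table like the one plotted for E minor, and one for every other chord too.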

There are a lot of interesting questions to ask, and we want to know what is most interesting to you. Let us know in the comments below.

Other posts in the 1300 song series