Introduction

Have you ever wondered what the ten most common guitar chords in Western music are? Well, here they are, and unfortunately, you still have to learn that damned F.

#    CHORD   PERCENT OF ALL
1    G       14.0854%
2    C       11.5644%
3    D       11.3339%
4    A       7.8566%
5    F       5.9005%
6    Am      4.9978%
7    E       4.9817%
8    Em      4.518%
9    Dm      2.1277%
10   Bm      2.0687%

If you read on, you’ll find the top 100 chords below, along with a method to repeat my experiment with your own data set.

Background

As a programmer and amateur guitarist, I have often used online resources to improve my playing. However, there have always been a few questions the internet seems unable to answer satisfactorily. Key among these was one of the first questions I ever asked anyone who played guitar: “What chords should I learn first?”

To the skilled guitarist, the answer seems easy: “All of them.” But a new player who wants to develop a love for playing needs a strong starting point from which to learn new songs, stay interested, and grow in skill. Some places recite the standard litany: learn C, G, D, Am, & Em, and you’ll have the majority of pop music (if you capo). But that wasn’t enough for me to pick up a chord sheet from the internet and start playing. They all seemed to need chords I didn’t know yet. Frankly, it was a frustrating time for me that I just had to tough out.

Last week I went back to work on my sheet music software, “Repertoire”, and decided to add chord renderings. Obviously, I couldn’t predefine all 10,000+ chord forms available to a skilled player. But I certainly could add the 100 or so most common first-position chords. This led me back to the same merry question I started with: “What are they?”

Methodology

Luckily, this time around I had a few extra resources available to me. I could code and, more importantly, I had a 4 MB folder of ChordPro-formatted chord sheets on my computer, from http://getsome.org/guitar/olga/chordpro/. I was already working in PHP, so I wrote a quick script:

<?php
// Recursively collect the paths of all files under $dir with the given extension.
function listFilesRecursive($dir, $extension){
    $array = array();
    $ffs = scandir($dir);
    foreach($ffs as $ff){
        if($ff != '.' && $ff != '..'){
            $path = $dir.'/'.$ff;
            if(is_dir($path)) {
                $contents = listFilesRecursive($path, $extension);
                $array = array_merge($array, $contents);
            } else {
                $info = pathinfo($path);
                // Files without an extension have no 'extension' key.
                if(isset($info['extension']) && strtolower($info['extension']) == $extension) {
                    $array[] = $path;
                }
            }
        }
    }
    return $array;
}

$directory = "chords";
$files = listFilesRecursive($directory, "chopro");
$chords = array();
$total = 0;

foreach($files as $file) {
    $matches = null;
    $contents = file_get_contents($file);
    // ChordPro writes chords in square brackets, e.g. [Am].
    preg_match_all('/\[(.*?)\]/', $contents, $matches);
    if(isset($matches[1]) && $matches[1]) {
        foreach($matches[1] as $chord) {
            // Fold "Cmaj" into "C" and "Amin" into "Am".
            // Note: this also turns "Cmaj7" into "C7".
            $chord = str_replace(array('maj', 'min'), array('', 'm'), $chord);
            if(!isset($chords[$chord])) $chords[$chord] = 0;
            $chords[$chord]++;
            $total++;
        }
    }
}

// Sort by count, most common first, keeping chord names as keys.
arsort($chords);

echo "<table>\n";
echo "<thead><tr><th>#</th><th>Chord</th><th>Times Seen</th><th>Percent</th></tr></thead><tbody>\n";
$i = 0; // the # column is zero-based
foreach($chords as $chord=>$count) {
    $percent = round($count/$total*100, 4);
    echo "<tr><td>".$i."</td><td><b>".htmlspecialchars($chord)."</b></td><td>".$count."</td><td>".$percent."%</td></tr>\n";
    $i++;
}
echo "</tbody></table>\n";
?>

The script needs to sit in a folder alongside a subfolder called “chords” that contains files with the .chopro extension. It scans every file in that subfolder (and all of its subfolders) and extracts the chords, which ChordPro writes in square brackets. It then counts the occurrences of each chord and prints an HTML table ordered from most common to least.
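To make the extraction step concrete, here is a minimal sketch that applies the same regular expression and the same maj/min substitution to a single in-memory line instead of a folder of files. The sample lyric line and the helper name `extractChords` are made up for illustration; neither comes from the data set.

```php
<?php
// Hypothetical helper: returns the chord names found in a ChordPro-style string.
function extractChords(string $contents): array
{
    $matches = [];
    // Same pattern as the script: capture the text between [ and ], non-greedy.
    preg_match_all('/\[(.*?)\]/', $contents, $matches);

    $chords = [];
    foreach ($matches[1] as $chord) {
        // Same substitution as the script: drop "maj", shorten "min" to "m".
        $chords[] = str_replace(['maj', 'min'], ['', 'm'], $chord);
    }
    return $chords;
}

// Made-up sample line, not taken from the data set.
$line = "[G]Hello [Amin]there, [Cmaj]old [D7]friend";
print_r(extractChords($line));
```

For this sample line the helper should return G, Am, C, and D7, in that order.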

Results

Here are the top 100 chords (of almost 1300) returned from processing my data set:

#    Chord    Times Seen   Percent
0    G        21019        14.0854%
1    C        17257        11.5644%
2    D        16913        11.3339%
3    A        11724        7.8566%
4    F        8805         5.9005%
5    Am       7458         4.9978%
6    E        7434         4.9817%
7    Em       6742         4.518%
8    Dm       3175         2.1277%
9    Bm       3087         2.0687%
10   B        2950         1.9769%
11   Bb       2786         1.867%
12   G7       2027         1.3584%
13   A7       1880         1.2598%
14   D7       1832         1.2277%
15   F#m      1790         1.1995%
16   E7       1480         0.9918%
17   C7       1479         0.9911%
18   Am7      1275         0.8544%
19   C#m      1246         0.835%
20   F#       1222         0.8189%
21   Eb       1023         0.6855%
22   Gm       996          0.6674%
23   B7       973          0.652%
24   Em7      921          0.6172%
25   F7       824          0.5522%
26   Dm7      817          0.5475%
27   Ab       596          0.3994%
28   Cm       577          0.3867%
29   Bm7      537          0.3599%
30   C#       511          0.3424%
31   D/F#     454          0.3042%
32   Gm7      447          0.2995%
33   G#m      367          0.2459%
34   G#       363          0.2433%
35   C/G      361          0.2419%
36   Fm       355          0.2379%
37   F#m7     334          0.2238%
38   G/B      321          0.2151%
39   F#7      270          0.1809%
40   G6       264          0.1769%
41   Asus4    259          0.1736%
42   Bb7      249          0.1669%
43   Cm7      228          0.1528%
44   D#       223          0.1494%
45   C9       209          0.1401%
46   Hm       206          0.138%
47   C/B      184          0.1233%
48   Dsus4    184          0.1233%
49   H7       180          0.1206%
50   A#       179          0.12%
51   Db       177          0.1186%
52   C/E      172          0.1153%
53   D9       150          0.1005%
54   Bbm      149          0.0998%
55   Gb       148          0.0992%
56   Asus2    146          0.0978%
57   C#m7     146          0.0978%
58   Esus4    143          0.0958%
59   G/F#     142          0.0952%
60   Dsus     141          0.0945%
61   Cadd9    139          0.0931%
62   G/D      130          0.0871%
63   D/A      127          0.0851%
64   A/C#     120          0.0804%
65   N.C.     109          0.073%
66   G5       104          0.0697%
67   Dsus2    99           0.0663%
68   C#7      99           0.0663%
69   A5       97           0.065%
70   E/G#     92           0.0617%
71   Ebm      91           0.061%
72   G9       91           0.061%
73   F/G      87           0.0583%
74   D6       87           0.0583%
75   Eb7      85           0.057%
76   A/E      84           0.0563%
77   Gsus4    82           0.055%
78   F/A      81           0.0543%
79   A9       79           0.0529%
80   C(9)     77           0.0516%
81   E9       76           0.0509%
82   Abm      75           0.0503%
83   D#m      75           0.0503%
84   D/C      72           0.0482%
85   Fm7      72           0.0482%
86   Esus     70           0.0469%
87   G/A      68           0.0456%
88   D2       68           0.0456%
89   Csus4    67           0.0449%
90   A7sus4   65           0.0436%
91   E5       65           0.0436%
92   em       64           0.0429%
93   A6       64           0.0429%
94   D/E      64           0.0429%
95   Ab7      63           0.0422%
96   Gm6      62           0.0415%
97   Am/G     59           0.0395%
98   A/D      55           0.0369%
99   G+       55           0.0369%
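One way to read these numbers is to add up the shares of the chords you already know. The sketch below does that for the standard beginner set from the Background section (C, G, D, Am, Em), using the top-10 percentages copied from the results. The helper name `coverage` is made up for this example, and treating a chord that is missing from the array as 0% is an assumption.

```php
<?php
// Percentages copied from the results (top 10 chords only).
$percent = [
    'G'  => 14.0854, 'C'  => 11.5644, 'D' => 11.3339, 'A'  => 7.8566,
    'F'  => 5.9005,  'Am' => 4.9978,  'E' => 4.9817,  'Em' => 4.518,
    'Dm' => 2.1277,  'Bm' => 2.0687,
];

// Hypothetical helper: share of all chord occurrences covered by $known.
function coverage(array $percent, array $known): float
{
    $sum = 0.0;
    foreach ($known as $chord) {
        // Assumption: chords missing from the array count as 0.
        $sum += $percent[$chord] ?? 0.0;
    }
    return round($sum, 4);
}

echo coverage($percent, ['C', 'G', 'D', 'Am', 'Em']), "%\n";      // the standard five
echo coverage($percent, ['C', 'G', 'D', 'Am', 'Em', 'F']), "%\n"; // plus F
echo coverage($percent, array_keys($percent)), "%\n";              // the whole top 10
```

With these figures, the five standard chords account for roughly 46.5% of all chord occurrences in the set, adding F brings that to about 52.4%, and the full top 10 to about 69.4%.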

Disclosures

Now, looking through the list of results, I find several potential issues with the data set used.

As an amateur-transcribed list, it may be unfairly weighted toward the chords that amateurs are told to learn first.

The choice of chord name is not always consistent. For example, some chord sheets write A# and some write Bb to describe the same chord, and a lowercase “em” is counted separately from Em.

There are some incorrectly formatted files and typos that added a few false entries to the set, but they are too rare to affect the rankings.

Some transcribers seem intent on using the German/Scandinavian naming system for notes, where there is an H chord instead of B, and B is used for A#/Bb. Surprisingly, this is common enough in my set that Hm and H7 appear in the top 100 chords. This would clearly be different for a more standardized data set.
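If you rerun the experiment, one way to reduce these naming inconsistencies is to normalize each chord name before counting it. The sketch below was not part of the script that produced the results above. The helper name `normalizeChord`, the `$germanNaming` flag, and the choice of canonical spellings (the more common name of each enharmonic pair in my results) are all assumptions for illustration. It also leaves the bass note of slash chords untouched.

```php
<?php
// Hypothetical helper, not part of the original script.
// Set $germanNaming to true only for files known to use H for B natural.
function normalizeChord(string $chord, bool $germanNaming = false): string
{
    // Split into root letter, optional accidental, and the rest (m, 7, /F#, ...).
    if (!preg_match('/^([A-Ha-h])([#b]?)(.*)$/', $chord, $m)) {
        return $chord; // e.g. "N.C." is returned unchanged
    }
    $root = strtoupper($m[1]); // "em" becomes "Em"
    $accidental = $m[2];
    $rest = $m[3];

    if ($germanNaming) {
        if ($root === 'H') {
            $root = 'B';        // German H is B natural
        } elseif ($root === 'B' && $accidental === '') {
            $accidental = 'b';  // German B is B flat
        }
    }

    // Assumed canonical spellings for enharmonic roots.
    $enharmonic = ['A#' => 'Bb', 'D#' => 'Eb', 'G#' => 'Ab', 'Db' => 'C#', 'Gb' => 'F#'];
    $name = $root . $accidental;
    return ($enharmonic[$name] ?? $name) . $rest;
}

echo normalizeChord('A#'), "\n";       // enharmonic spelling folded into Bb
echo normalizeChord('em'), "\n";       // root capitalized
echo normalizeChord('Hm', true), "\n"; // German H read as B
echo normalizeChord('B7', true), "\n"; // German B read as Bb
```

Because a single chord name cannot tell you which naming system a file uses, the flag has to be set per file, for example by checking whether the file contains any H chords at all.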

Conclusions

Overall, I’d say the results seem pretty solid. Essentially, they say that C, G, D, Am, & Em are a fairly good starting point, but you’re still gonna have to work on that damned F chord.