We investigate the association between musical chords and lyrics by analysing a large dataset of user-contributed guitar tablatures. Motivated by the idea that the emotional content of chords is reflected in the words used in corresponding lyrics, we analyse associations between lyrics and chord categories. We also examine the usage patterns of chords and lyrics in different musical genres, historical eras and geographical regions. Our overall results confirm a previously known association between Major chords and positive valence. We also report a wide variation in this association across regions, genres and eras. Our results suggest the possible existence of different emotional associations for other types of chords.

1. Introduction

The power of music to evoke strong feelings has long been admired and explored [1–5]. Although music has accompanied humanity since the dawn of culture [6] and its underlying mathematical structure has been studied for many years [7–10], understanding the link between music and emotion remains a challenge [1,11–13].

The study of music perception has been dominated by methods that directly measure emotional responses, such as self-reports, physiological and cognitive measurements, and developmental observations [11,13]. Such methods may produce high-quality data, but the data collection process involved is both labour- and resource-intensive. As a result, creating large datasets and discovering statistical regularities has been a challenge.

Meanwhile, the growth of music databases [14–19] as well as the advancement of the field of Music Information Retrieval (MIR) [20–22] has opened new avenues for data-driven studies of music. For instance, sentiment analysis [23–26] has been applied to uncover a long-term trend of declining valence in popular song lyrics [27,28]. It has been shown that the lexical features from lyrics [29–34], metadata [35], social tags [36,37] and audio-based features can be used to predict the mood of a song. There has also been an attempt to examine the associations between lyrics and individual chords using a machine translation approach, which confirmed the notion that Major and Minor chords are associated with happy and sad words respectively [38].

Here, we propose a novel method to study the associations between chord types and emotional valence. In particular, we use sentiment analysis to analyse chord categories (e.g. Major and Minor) and their associations with sentiment and words across genres, regions and eras.

To do so, we adopt a sentiment analysis method that uses the crowd-sourced LabMT 1.0 valence lexicon [24,39]. Valence is one of the two basic emotional axes [13], with higher valence corresponding to more attractive/positive emotion. The lexicon contains valence scores ranging from 0.0 (saddest) to 9.0 (happiest) for 10 222 common English words obtained by surveying Amazon’s Mechanical Turk workers. The overall valence of a piece of text, such as a sentence or document, is measured by averaging the valence score of individual words within the text. This method has been successfully used to obtain insight into a wide variety of corpora [40–43].

Here, we apply this sentiment analysis method to a dataset of guitar tablatures, which contain both lyrics and chords, extracted from ultimate-guitar.com. We collect all words that appear with a specific chord and create a large ‘bag of words’ (a frequency list of words) for each chord (figure 1). We perform our analysis by aggregating chords based on their ‘chord category’ (e.g. Major chords, Minor chords, Dominant 7th chords, etc.). In addition, we also acquire metadata from the Gracenote API regarding the genre of albums, as well as era and geographical region of musical artists. We then perform our analysis of associations between lyrics sentiment and chords within the categories of genre, era and region. Details of our methodology are described in the next section.

Figure 1. A schematic of our process of collecting guitar tablatures and metadata, parsing chord–word associations and analysing the results using data mining and sentiment analysis. Note that ‘genre’ is an album-specific label, while ‘era’ and ‘region’ are artist-specific labels (rather than song-specific).

2. Material and methods

Guitar tabs were obtained from ultimate-guitar.com [18], a large online user-generated database of tabs, while information about album genre, artist era and artist region was obtained from Gracenote API [19], an online musical metadata service.

2.1. Chords–lyrics association

ultimate-guitar.com is one of the largest user-contributed tab archives, hosting more than 800 000 songs. We examined 123 837 songs that passed the following criteria: (i) we only kept guitar tabs and ignored those for other instruments such as the ukulele; (ii) we ignored non-English songs (those having less than half of their words in an English word list [44] or identified as non-English by the langdetect library [45]); (iii) when multiple tabs were available for a song, we kept only the one with the highest user-assigned rating. We then cleaned the raw HTML sources and extracted chords and lyrics transcriptions. As an example, figure 1 shows how the tablature of Leonard Cohen’s ‘Hallelujah’ [46] is processed to produce a chord-lyrics table.
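The three selection criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`instrument`, `artist`, `title`, `rating`, `words`) are hypothetical names, not the actual schema of the scraped data, and the langdetect check mentioned above is omitted so that the sketch depends only on the word-list criterion.

```python
def english_fraction(words, english_words):
    """Fraction of tokens that appear in an English word list."""
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.lower() in english_words)
    return hits / len(words)


def select_tabs(tabs, english_words):
    """Keep English guitar tabs, retaining one tab (the highest rated) per song.

    `tabs` is a list of dicts with hypothetical keys:
    instrument, artist, title, rating, words.
    """
    best = {}
    for tab in tabs:
        # (i) guitar tabs only
        if tab["instrument"] != "guitar":
            continue
        # (ii) drop songs with less than half of their words in the word list
        if english_fraction(tab["words"], english_words) < 0.5:
            continue
        # (iii) keep the highest-rated tab for each (artist, title) pair
        key = (tab["artist"], tab["title"])
        if key not in best or tab["rating"] > best[key]["rating"]:
            best[key] = tab
    return list(best.values())
```

Keying songs by the (artist, title) pair is itself an assumption; the text does not say how duplicate tabs of the same song were identified.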

Sometimes, chord symbols appeared in the middle of words; in such cases, we associated the entire word with the chord that appears in its middle, rather than with the previous chord. In addition, chords that could not be successfully parsed or that had no associated lyrics were dropped.
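As a rough illustration of this alignment rule, the sketch below assumes the common tab layout in which a line of chord symbols sits directly above a line of lyrics and chords are positioned by character column. The function name `align_chords` and that layout are assumptions made for the example; the actual parser may differ.

```python
import re


def align_chords(chord_line, lyric_line, prev_chord=None):
    """Pair each word with the most recent chord that starts at or before the
    word's last character, so a chord placed mid-word claims the whole word.

    `prev_chord` is the chord carried over from the preceding line, if any.
    Returns a list of (chord, word) pairs.
    """
    # Column and name of every chord symbol on the chord line, left to right.
    chords = [(m.start(), m.group()) for m in re.finditer(r"\S+", chord_line)]
    pairs = []
    for m in re.finditer(r"\S+", lyric_line):
        last_col = m.end() - 1
        current = prev_chord
        for col, name in chords:
            if col <= last_col:
                current = name
        # Words that precede any chord (and have no carried-over chord) are skipped.
        if current is not None:
            pairs.append((current, m.group()))
    return pairs
```

With this rule, a chord whose symbol starts anywhere inside a word takes that word and every following word up to the next chord, which matches the behaviour described above.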

2.2. Metadata collection using Gracenote API

We used the Gracenote API (http://gracenote.com) to obtain metadata about artists and song albums. We queried the title and the artist name of the 124 101 songs that were initially obtained from ultimate-guitar.com, successfully retrieving Gracenote records for 89 821 songs. Songs that did not match a Gracenote record were dropped. For each song, we extracted the following metadata fields:

— The geographic region from which the artist originated (e.g. North America). This was extracted from the highest-level geographic labels provided by Gracenote.

— The musical genre (e.g. 50s Rock). This was extracted from the second-level genre labels assigned to each album by Gracenote.

— The historical era at the level of decades (e.g. 1970s). This was extracted from the first-level era labels assigned to each artist by Gracenote. Approximately 6000 songs were not assigned to an era, in which case they were assigned to the decade of the album release year as specified by Gracenote.

In our analysis, we reported individual statistics only for the most popular regions (Asia, Australia/Oceania, North America, Scandinavia, Western Europe), genres (top 20 most popular genres), and eras (1950s through 2010s).

2.3. Determining chord categories

We normalized chord names and classified them into chord categories according to chord notation rules from an online resource [47]. All valid chord names begin with one or two characters indicating the root note (e.g. G or Bb), followed by characters that indicate the chord category. We considered the following chord categories [48]:

— Major: A Major chord is a triad with a root, a Major third and a perfect fifth. Major chords are indicated using either only the root note, or the root note followed by M or maj. For instance, F, FM, G and Gmaj were considered Major chords.

— Minor: A Minor chord is also a triad, containing a root, a Minor third and a perfect fifth. Minor chords are notated as the root note followed by m or min. For example, Emin, F#m and Bbm were considered Minor chords.

— 7th: A seventh chord has a seventh interval in addition to a Major or Minor triad. A Major 7th consists of a Major triad and an additional Major seventh, and is indicated by the root note followed by M7 or maj7 (e.g. GM7). A Minor 7th consists of a Minor triad with an additional Minor seventh, and is indicated by the root note followed by m7 or min7 (e.g. Fm7). A Dominant 7th is a diatonic seventh chord that consists of a Major triad with an additional Minor seventh, and is indicated by the root note followed by the numeral 7 or dom7 (e.g. D7, Gdom7).

— Special chords with ‘*’: In tab notation, the asterisk ‘*’ is used to indicate special instructions and can have many different meanings. For instance, G* may indicate that the G should be played with a palm mute, with a single strum, or some other special instruction usually indicated in free text in the tablature. Because in most cases the underlying chord is still played, in this study we map chords with asterisks to their respective non-asterisk versions. For instance, we consider G* to be the same as G and C7* to be the same as C7.

— Other chords: There were several other categories of chords that we do not analyse individually in this study. One of these is Power chords, which are dyads consisting of a root and a perfect fifth. Because Power chords are highly genre specific, and because they sometimes function musically as Minor or Major chords, we eliminated them from our study. For reasons of simplicity and statistical significance, we also eliminated several other categories of chords, such as Augmented and Diminished chords, which appeared infrequently in our dataset.
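The notation rules listed above translate fairly directly into a lookup on the suffix that follows the root note. The following is a minimal sketch, not the normalization code used in the study: the function name `chord_category` and the choice to return `None` for every excluded chord type (Power, Augmented, Diminished and so on) are assumptions.

```python
import re

# Suffix (the part after the root note) mapped to the chord category,
# following the notation rules listed above. Case matters: M is Major, m is Minor.
_SUFFIX_TO_CATEGORY = {
    "": "Major", "M": "Major", "maj": "Major",
    "m": "Minor", "min": "Minor",
    "M7": "Major 7th", "maj7": "Major 7th",
    "m7": "Minor 7th", "min7": "Minor 7th",
    "7": "Dominant 7th", "dom7": "Dominant 7th",
}

# Root note: a letter A-G, optionally followed by a sharp (#) or flat (b).
_CHORD_RE = re.compile(r"^([A-G][#b]?)(.*)$")


def chord_category(name):
    """Return the chord category for a chord name, or None if it is not analysed."""
    # Asterisked chords are mapped to their non-asterisk versions.
    name = name.strip().rstrip("*")
    match = _CHORD_RE.match(name)
    if match is None:
        return None
    return _SUFFIX_TO_CATEGORY.get(match.group(2))
```

For example, `chord_category("Bbm")` reads `Bb` as the root and `m` as the suffix, while `chord_category("A5")` (a Power chord) falls outside the table and returns `None`.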

In total, we analysed 924 418 chords (see the next subsection). Figure 2 shows the prevalence of different chord categories among these chords.

Figure 2. Prevalence of different chord categories within the dataset (note logarithmic scale).

2.4. Sentiment analysis

Sentiment analysis was used to measure the valence (happiness versus unhappiness) of chord lyrics. We employed a simple methodology based on a crowd-sourced valence lexicon (LabMT 1.0) [24,39]. This method was chosen because (i) it is simple and scalable, (ii) it is transparent, allowing us to calculate the contribution from each word to the final valence score and (iii) it has been shown to be useful in many studies [40–43]. The LabMT 1.0 lexicon contains valence scores ranging from 0.0 (saddest) to 9.0 (happiest) for 10 222 common English words as obtained by surveying Amazon’s Mechanical Turk workers. The valence assigned to some sequence of words (e.g. words in the lyrics corresponding to Major chords) was computed by mapping each word to its corresponding valence score and then computing the mean. Words not found in the LabMT lexicon were ignored; in addition, following recommended practices for increasing sentiment signal [24], we ignored emotionally neutral words having a valence strictly between 3.0 and 7.0. Chords that were not associated with any sentiment-mapped words were ignored. The final dataset contained 924 418 chords from 86 627 songs.
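The scoring step described above amounts to a filtered mean. A minimal sketch follows, assuming the lexicon is available as a plain dictionary from lower-case word to valence score; the function name and the toy scores in the usage note are illustrative and are not real LabMT values.

```python
def mean_valence(words, lexicon, neutral_low=3.0, neutral_high=7.0):
    """Mean valence of a word sequence.

    Words missing from the lexicon are skipped, as are emotionally neutral
    words whose valence lies strictly between neutral_low and neutral_high.
    Returns None when no sentiment-mapped word remains.
    """
    scores = []
    for word in words:
        valence = lexicon.get(word.lower())
        if valence is None:
            continue  # not in the lexicon
        if neutral_low < valence < neutral_high:
            continue  # emotionally neutral
        scores.append(valence)
    if not scores:
        return None
    return sum(scores) / len(scores)
```

With a made-up lexicon `{"love": 8.0, "pain": 2.0, "the": 5.0}`, the word list `["love", "pain", "the"]` is scored on `love` and `pain` only. A chord whose words all return `None` corresponds to the case in which the chord is dropped from the analysis.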

2.5. Word shift graphs

In order to show how a set of lyrics (e.g. lyrics corresponding to songs in the Punk genre) differs from the overall lyrics dataset, we use word shift graphs [27,41]. We designate the whole dataset as the reference (baseline) corpus and call the set of lyrics that we want to compare the comparison corpus. The difference in their overall valence can now be broken down into the contribution from each individual word. Increased valence can result from either having a higher prevalence (frequency of occurrence) of high-valence words or a lower prevalence of low-valence words. Conversely, lower valence can result from having a higher prevalence of low-valence words or a lower prevalence of high-valence words. The percentage contribution of an individual word $i$ to the valence difference between a comparison and reference corpus can be expressed as:

$$
100 \times \frac{\overbrace{\left(h_i - h^{(\mathrm{ref})}\right)}^{+/-}\ \overbrace{\left(p_i - p_i^{(\mathrm{ref})}\right)}^{\uparrow/\downarrow}}{\left| h^{(\mathrm{comp})} - h^{(\mathrm{ref})} \right|},
$$

where $h_i$ is the valence score of word $i$ in the lexicon, $h^{(\mathrm{ref})}$ and $h^{(\mathrm{comp})}$ are the mean valences of the words in the reference corpus and comparison corpus respectively, $p_i$ is the normalized frequency of word $i$ in the comparison corpus, and $p_i^{(\mathrm{ref})}$ is the normalized frequency of word $i$ in the reference corpus (normalized frequencies are computed as $p_i = n_i / \sum_{i'} n_{i'}$, where $n_i$ is the number of occurrences of word $i$). The first term (indicated by ‘+/−’) measures the difference in word valence between word $i$ and the mean valence of the reference corpus, while the second term (indicated by ↑/↓) looks at the difference in word prevalence between the comparison and reference corpus. In plotting the word shift graphs, for each word we use +/− signs and blue/orange bar colours to indicate the (positive or negative) sign of the valence term and ↑/↓ arrows to indicate the sign of the prevalence term.

2.6. Model comparison

In the Results section, we evaluate what explanatory factors (chord category, genre, era and region) best account for differences in valence scores. Using the statsmodels toolbox [49], we estimated linear regression models where the mean valence of each chord served as the response variable and the most popular chord categories, genres, eras and regions served as the categorical predictor variables. The variance of the residuals was used to compute the proportion of variance explained when using each factor in turn.
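For a single categorical predictor, an ordinary least-squares fit with an intercept predicts each observation by the mean of its group, so the proportion of variance explained can be computed without a regression library. The sketch below uses that shortcut instead of statsmodels; the function name is an assumption, and it is meant only to illustrate the single-factor case.

```python
from collections import defaultdict


def variance_explained(values, labels):
    """Proportion of variance in `values` explained by one categorical factor.

    Equivalent to 1 - (residual sum of squares / total sum of squares) for a
    linear model whose only predictor is the category label, because the
    fitted value of each observation is then its group mean.
    """
    n = len(values)
    grand_mean = sum(values) / n

    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variance to explain

    ss_resid = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        ss_resid += sum((v - group_mean) ** 2 for v in members)

    return 1.0 - ss_resid / ss_total
```

Calling this once per factor (for example with `labels` set to the genre, era, region or chord category of each chord) gives one number per factor that can be compared directly.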

We also compared models that used combinations of factors. As before, we fit linear models that predicted valence. Now, however, explanatory factors were added in a greedy fashion, with each additional factor chosen to minimize the Akaike information criterion (AIC) of the overall model.
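The greedy procedure can be sketched independently of how the models are fitted. In the illustration below, `aic_of` is a hypothetical callable that fits a linear model on a given set of factors and returns its AIC (for instance by wrapping a statsmodels fit); only the selection loop is shown.

```python
def greedy_forward_selection(factors, aic_of):
    """Add factors one at a time, each time picking the factor whose
    inclusion gives the lowest AIC for the enlarged model.

    `aic_of` takes a frozenset of factor names and returns the AIC of the
    model fitted with exactly those factors. Returns the selection path as
    a list of (factors_so_far, aic) tuples, one entry per model size.
    """
    selected = []
    remaining = list(factors)
    path = []
    while remaining:
        best = min(remaining, key=lambda f: aic_of(frozenset(selected + [f])))
        selected.append(best)
        remaining.remove(best)
        path.append((tuple(selected), aic_of(frozenset(selected))))
    return path
```

The returned path lists the AIC of each successively larger model, so one can check whether the score keeps decreasing as factors are added or whether a smaller model is preferred.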

3. Results

3.1. Valence of chord categories

We measure the mean valence of lyrics associated with different chord categories (figure 3a). We find that Major chords have higher valence than Minor chords, concurring with numerous studies which argue that human subjects perceive Major chords as more emotionally positive than Minor chords [50–52]. However, our results suggest that Major chords are not the happiest: all three categories of 7th chords considered here (Minor 7th, Major 7th and Dominant 7th) exhibit higher valence than Major chords. This effect holds with high significance ($p \ll 10^{-3}$ for all, one-tailed Mann–Whitney tests).

Figure 3. (a) Mean valence for chord categories. Error bars indicate 95% CI of the mean (error bars for Minor and Major chords are smaller than the symbols). (b–f) Word shift plots for different chord categories. High-valence words are indicated using ‘+’ symbol and orange colour, while low-valence words are indicated by ‘−’ symbol and blue colour. Words that are overexpressed in the lyrics corresponding to a given chord category are indicated by ‘↑’, while underexpressed words are indicated by ‘↓’.

In figure 3b–f, we use the word shift graphs [27] to identify words that contribute most to the difference between the valence of each chord category and baseline (mean valence of all lyrics in the dataset). For instance, ‘lost’ is a lower-valence word (blue colour and ‘−’ sign) that is underexpressed in Major chords (‘↓’ sign), increasing the mean valence of Major chords above baseline. Many negative words, such as ‘pain’, ‘fight’, ‘die’ and ‘lost’ are overexpressed in Minor chord lyrics and under-represented in Major chord lyrics.
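The per-word contributions plotted in a word shift graph can be computed as in the sketch below. For each word it multiplies a valence term (word valence minus the mean valence of the reference corpus) by a prevalence term (normalized frequency in the comparison corpus minus that in the reference corpus), then expresses the product as a percentage of the absolute valence gap between the two corpora. The function name is an assumption, and the lexicon passed in is assumed to be already restricted to the words used for scoring.

```python
from collections import Counter


def word_shift(comp_words, ref_words, lexicon):
    """Percentage contribution of each word to the valence difference
    between a comparison corpus and a reference corpus."""
    comp = Counter(w for w in comp_words if w in lexicon)
    ref = Counter(w for w in ref_words if w in lexicon)
    n_comp = sum(comp.values())
    n_ref = sum(ref.values())

    # Mean valence of each corpus, weighted by word counts.
    h_comp = sum(lexicon[w] * c for w, c in comp.items()) / n_comp
    h_ref = sum(lexicon[w] * c for w, c in ref.items()) / n_ref
    gap = abs(h_comp - h_ref)
    if gap == 0:
        raise ValueError("corpora have identical mean valence")

    contributions = {}
    for word in set(comp) | set(ref):
        valence_term = lexicon[word] - h_ref              # sign shown as +/-
        prevalence_term = comp[word] / n_comp - ref[word] / n_ref  # up/down arrow
        contributions[word] = 100.0 * valence_term * prevalence_term / gap
    return contributions
```

A useful sanity check is that the contributions sum to +100 when the comparison corpus has the higher mean valence and to −100 when it has the lower one, because the prevalence terms sum to zero across words.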

Although the three types of 7th chords have similar valence scores (figure 3a), word shift graphs reveal that they may have different ‘meanings’ in terms of associated words. Overexpressed high-valence words for Dominant 7th chords include terms of endearment or affection, such as ‘baby’, ‘sweet’ and ‘good’, while for Minor 7th chords they are ‘life’ and ‘god’. Lyrics associated with Major 7th chords, on the other hand, have a stronger under-representation of negative words (e.g. ‘die’ and ‘hell’).

3.2. Genres

We analyse the emotional content and meaning of lyrics from albums in different musical genres. Figure 4a shows the mean valence of different genres, demonstrating that an intuitive ranking emerges when genres are sorted by valence: Religious and 60s Rock lyrics reside at the positive end of the spectrum while Metal and Punk lyrics appear at the most negative.

Figure 4. (a) Mean valence of lyrics by album genre. (b) Major versus Minor valence differences for lyrics by album genre. (c) Word shift plot for the Religious genre. (d) Word shift plot for the Punk genre. See caption of figure 3 for explanation of word shift plot symbols.

As mentioned in the previous section, Minor chords have a lower mean valence than Major chords. We computed the numerical differences in valence between Major and Minor chords for albums in different genres (figure 4b). All considered genres, with the exception of Contemporary R&B/Soul, had a mean valence of Major chords higher than that of Minor chords. Some of the genres with the largest Major versus Minor valence differences include Classic R&B/Soul, Classic Country, Religious and 60s Rock.
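The Major versus Minor comparison within each group can be sketched as a difference of two group means. The example assumes one row per chord occurrence, given as a (group, chord category, valence) tuple, where the group can be a genre, an era or a region; this row layout and the function name are assumptions rather than a description of the actual analysis code.

```python
from collections import defaultdict


def major_minor_gap(rows):
    """Mean Major valence minus mean Minor valence for each group.

    `rows` is an iterable of (group, chord_category, valence) tuples.
    Groups lacking either Major or Minor chords are left out of the result.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, category, valence in rows:
        if category in ("Major", "Minor"):
            totals[(group, category)] += valence
            counts[(group, category)] += 1

    gaps = {}
    for group in {g for g, _ in counts}:
        if (group, "Major") in counts and (group, "Minor") in counts:
            major = totals[(group, "Major")] / counts[(group, "Major")]
            minor = totals[(group, "Minor")] / counts[(group, "Minor")]
            gaps[group] = major - minor
    return gaps
```

A positive value means Major chords carry higher-valence lyrics than Minor chords in that group; a value near zero or below it marks the kind of exception discussed above.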

We show word shift graphs for two musical genres: Religious (figure 4c) and Punk (figure 4d). The highest contributions to the Religious genre come from the overexpression of high-valence words having to do with worship (‘love’, ‘praise’, ‘glory’, ‘sing’). Conversely, the highest contributions to the Punk genre come from the overexpression of low-valence words (e.g. ‘dead’, ‘sick’, ‘hell’). Some exceptions exist: for example, Religious lyrics underexpress high-valence words such as ‘baby’ and ‘like’, while Punk lyrics underexpress the low-valence word ‘cry’.

3.3. Era

In this section, we explore sentiment trends for artists across different historical eras. Figure 5a shows the mean valence of lyrics in different eras. We find that valence has steadily decreased since the 1950s, confirming results of previous sentiment analysis of lyrics [27], which attributed the decline to the recent emergence of ‘dark’ genres such as metal and punk. However, our results demonstrate that this trend has recently undergone a reversal: lyrics have higher valence in the 2010s era than in the 2000s era.

Figure 5. (a) Mean valence of lyrics by artist era. (b) Major versus Minor valence differences by artist era. (c) Proportion of chords in each chord category in different eras (note logarithmic scale).

As in the last section, we analyse differences between Major and Minor chords for lyrics belonging to different eras (figure 5b). Although Major chords have a generally higher valence than Minor chords, surprisingly this distinction does not hold in the 1980s era, in which Minor and Major chord valences are similar. The genres in the 1980s that had Minor chords with higher mean valence than Major chords—in other words, which had an ‘inverted’ Major/Minor valence pattern—include Alternative Folk, Indie Rock and Punk (data not shown).

Finally, we report changes in chord usage patterns across time. Figure 5c shows the proportion of chords belonging to each chord category in different eras (note the logarithmic scale). Since the 1950s, Major chord usage has been stable while Minor chords usage has been steadily growing. Dominant 7th chords have become less prevalent, while Major 7th and Minor 7th chords had an increase in usage during the 1970s.

The finding that Minor chords have become more prevalent while Dominant 7th chords have become rarer agrees with a recent data-driven study of the evolution of popular music genres [53]. The authors attribute the latter effect to the decline in the popularity of blues and jazz, which frequently use Dominant 7th chords. However, we find that this effect holds widely, with Dominant 7th chords diminishing in prevalence even when we exclude genres associated with Blues and Jazz (data not shown). More qualitatively, musicologists have argued that many popular music styles in the 1970s exhibited a decline in the use of Dominant 7th chords and a growth in the use of Major 7th and Minor 7th chords [54]—clearly seen in the corresponding increases in figure 5c.

3.4. Region

In this section, we evaluate the emotional content of lyrics from artists in different geographical regions. Figure 6a shows that artists from Asia have the highest-valence lyrics, followed by artists from Australia/Oceania, Western Europe, North America and finally Scandinavia, the lowest valence geographical region. The latter region’s low valence is probably due to the over-representation of ‘dark’ genres (such as metal) among Scandinavian artists [55].

Figure 6. (a) Mean valence of lyrics by artist region. (b) Major versus Minor valence differences by artist region.

As in previous sections, we compare differences in valence of Major and Minor chords for different regions (figure 6b). All regions except Asia have a higher mean valence for Major chords than Minor chords, while for the Asian region there is no significant difference.

There are several important caveats to our geographical analysis. In particular, our dataset consisted of only English-language songs, and is thus unlikely to be representative of overall musical trends in non-English speaking countries. This bias, along with others, is discussed in more depth in the Discussion section.

3.5. Model comparison

We have shown that mean valence varies as a function of chord category, genre, era and region (which are here called explanatory factors). We evaluate which explanatory factors best account for differences in valence scores. Figure 7a shows the proportion of variance explained when using each factor in turn as a predictor of valence. This shows that genre explains most variation in valence, followed by era, chord category and finally region.

Figure 7. (a) Per cent of variance in valence explained when using chord category, genre, era and region as categorical predictor variables. (b) AIC model selection scores for models that predict valence using different explanatory factors. The model that includes all explanatory factors is preferred.

It is possible that variation in valence associated with some explanatory factors is in fact ‘mediated’ by other factors. For example, we found that mean valence declined from the 1950s era through the 2000s, confirming previous work [27] that explained this decline by the growing popularity of ‘dark’ genres like Metal and Punk over time; this is an example in which valence variation over historical eras is argued to actually be attributable to variation in the popularities of different genres. As another example, it is possible that Minor chords are lower valence than Major chords because they are overexpressed in dark genres, rather than due to their inherent emotional content.

We investigate this effect using statistical model selection. For instance, if the valence variation over chord categories can be entirely attributed to genre (i.e. darker genres have more Minor chords), then model selection should prefer a model that contains only the genre explanatory factor to the one that contains both the genre and chord category explanatory factors.

We fit increasingly large models while computing their Akaike information criterion (AIC) scores, a model selection score (lower is better). As figure 7b shows, the model that includes all four explanatory factors has the lowest AIC, suggesting that chord category, genre, era and region are all important factors for explaining valence variation.

4. Discussion

In this paper, we propose a novel data-driven method to uncover emotional valence associated with different chords as well as different geographical regions, historical eras and musical genres. We then apply it to a novel dataset of guitar tablatures extracted from ultimate-guitar.com along with musical metadata provided by the Gracenote API. We use word shift graphs to characterize the meaning of chord categories as well as categories of lyrics.

We find that Major chords are associated with higher valence lyrics than Minor chords, consistent with the previous music perception studies that showed that Major chords evoke more positive emotional responses than Minor chords [50–52,56]. For an intuition regarding the magnitude of the difference, the mean valence of Minor chord lyrics is approximately 6.16 (e.g. the valence of the word ‘people’ in our sentiment dictionary), while the mean valence of Major chord lyrics is approximately 6.28 (e.g. the valence of the word ‘community’ in our sentiment dictionary). Interestingly, we also uncover that three types of 7th chords—Dominant 7ths, Major 7ths and Minor 7ths—have even higher valence than Major chords. This effect has not been deeply researched, except for one music perception study which reported that, in contrast to our findings, 7th chords evoke emotional responses intermediate in valence between Major and Minor chords [57].

Significant variation exists in the lyrics associated with different geographical regions, musical genres and historical eras. For example, musical genres demonstrate an intuitive ranking when ordered by mean valence, ranging from low-valence Punk and Metal genres to high-valence Religious and 60s Rock genres. We also found that sentiment declined over time from the 1950s until the 2000s. Both of these findings are consistent with the results of a previous study conducted using a different dataset and lexicon [27]. At the same time, we report a new finding that the trend in declining valence has reversed itself, with lyrics from the 2010s era having higher valence than those from the 2000s era. Finally, we perform an analysis of the variation of valence among geographical regions. We find that Asia has the highest valence while Scandinavia has the lowest (probably due to prevalence of ‘dark’ genres in that region).

We perform a novel data-driven analysis of the Major versus Minor distinction by measuring the difference between Major and Minor valence for different regions, genres and eras. All examined genres except Contemporary R&B/Soul exhibited higher Major chord valence than Minor chord valence. Interestingly, the largest differences of Major above Minor may indicate genres (Classic R&B/Soul, Classic Country, Religious, 60s Rock) that are more musically ‘traditional’. In terms of historical periods, we find that, unlike other eras, the 1980s era did not have a significant Major–Minor valence difference. This phenomenon calls for further investigation; one possibility is that it may be related to an important period of musical innovation in the 1980s, which was recently reported in a data-driven study of musical evolution [53]. Finally, analysis of geographic variation indicates that songs from the Asian region, unlike those from other regions, do not show a significant difference in the valence of Major versus Minor chords. In fact, it is known in the musicological literature that the association of positive emotions with Major chords and negative emotions with Minor chords is culture-dependent, and that some Asian cultures do not display this association [58]. Our results may provide new supporting evidence of cultural variation in the emotional connotations of the Major/Minor distinction.

Finally, we evaluate how much of the variation in valence in our dataset is attributable to chord category, genre, era and region (we call these four types of attributes ‘explanatory factors’). We find that genre is the most explanatory, followed by era, chord category and region. We use statistical model selection to evaluate whether certain explanatory factors ‘mediate’ the influence of others (an example of mediation would be if variation in valence of different eras is actually due to variation in the prevalence of different genres during those eras). We find that all four explanatory factors are important for explaining variation in valence; no explanatory factors totally mediate the effect of others.

Our approach has several limitations. First, the accuracy of tablature chord annotations may be limited because users are not generally professional musicians; for instance, more complex chords (e.g. dim or 11th) may be mis-transcribed as simpler chords. To deal with this, we analyse relatively basic chords (Major, Minor and 7ths) and, when there are multiple versions of a song, use tabs with the highest user-assigned rating. Our manual inspection of a small sample of parsed tabs indicated acceptable quality, although a more systematic evaluation of the error rate can be performed using more extensive manual inspection of tabs by professional musicians.

There are also significant biases in our dataset. We only consider tablatures for songs entered by users of ultimate-guitar.com, which is likely to be heavily biased towards North American and European users and songs. In addition, this dataset is restricted to songs playable by guitar, which selects for guitar-oriented musical genres and may be biased toward emotional meanings tied to guitar-specific acoustic properties, such as the instrument’s timbre. Furthermore, our dataset includes only English-language songs and is not likely to be a representative sample of popular music from non-English speaking regions. Thus, for example, the high valence of songs from Asia should not be taken as conclusive evidence that Asian popular music is overall happier than popular music in English-speaking countries. For this reason, the absence of a Major versus Minor chord distinction in Asia is speculative and requires significant further investigation.

Despite these limitations, we believe that our results reveal meaningful patterns of association between music and emotion at least in guitar-based English-language popular music, and show the potential of our novel data-driven methods. At the same time, applying these methods to other datasets—in particular those representative of other geographical regions, historical eras, instruments and musical styles—is of great interest for future work. Another promising direction for future work is to move the analysis of emotional content beyond single chords, since emotional meaning is likely to be more closely associated with melodies rather than individual chords. For this reason, we hope to extend our methodology to study chord progressions.

Data accessibility

The datasets generated during and/or analysed during this study are available in the Figshare repository, https://doi.org/10.6084/m9.figshare.5413060.v1. Code for performing the analysis and generating plots in this manuscript is available at https://github.com/artemyk/chordsentiment.

Authors' contributions

N.D. and Y.Y.A. conceived the idea of analysing the association between lyric sentiment and chord categories, while A.K. contributed the idea of analysing the variation of this association across genres, eras and regions. N.D. and A.K. downloaded and processed the dataset. All authors analysed the dataset and results. A.K. and N.D. prepared the initial draft of the manuscript. All authors reviewed and edited the manuscript.

Competing interests

The authors declare no competing financial interests.

Funding

Y.Y.A. thanks Microsoft Research for the MSR Faculty Fellowship.

Acknowledgements

We thank Rob Goldstone, Christopher Raphael, Peter Miksza, Daphne Tan and Gretchen Horlacher for helpful discussions and comments.
