Are popular songs today happier or sadder than they were 50 years ago? In recent years, the availability of large digital datasets online and the relative ease of processing them mean that we can now give precise and informed answers to questions such as this. A straightforward way to measure the emotional content of a text is just to count how many emotion words are present. How many times are negative-emotion words – ‘pain’, ‘hate’ or ‘sorrow’ – used? How many times are words associated with positive emotions – ‘love’, ‘joy’ or ‘happy’ – used? As simple as it sounds, this method works pretty well, given certain conditions (eg, the longer the available text is, the better the estimate of mood). This is one technique for what is called ‘sentiment analysis’. Sentiment analysis is often applied to social media posts, or contemporary political messages, but it can also be applied to longer timescales, such as decades of newspaper articles or centuries of literary works.
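
To make the counting idea concrete, here is a minimal sketch in Python. It is an illustration only, not the procedure used in the study: the two word lists contain just the six example words mentioned above (a real analysis would rely on a much larger emotion lexicon), and the function name and the sample lyric are invented for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons: only the example words from the text.
# A real analysis would use a far larger emotion-word list.
NEGATIVE_WORDS = {"pain", "hate", "sorrow"}
POSITIVE_WORDS = {"love", "joy", "happy"}


def emotion_word_counts(text):
    """Return (total words, negative-emotion words, positive-emotion words)."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    negative = sum(counts[word] for word in NEGATIVE_WORDS)
    positive = sum(counts[word] for word in POSITIVE_WORDS)
    return len(tokens), negative, positive


# An invented line, used only to exercise the function.
lyric = "All my love turned to pain, but joy will come again and love will stay"
total, negative, positive = emotion_word_counts(lyric)
print(f"{total} words, {negative} negative, {positive} positive")
```

Dividing each count by the total number of words gives a share that can be compared across texts of different lengths, which is one reason longer texts yield steadier estimates.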

The same technique can be applied to song lyrics. For our analysis, we used two different datasets. One contained the songs included in the year-end Billboard Hot 100 charts. These are songs that reached wide success, at least in the United States, from The Rolling Stones’ ‘(I Can’t Get No) Satisfaction’ (in 1965, the first year we considered) to Mark Ronson’s ‘Uptown Funk’ (in 2015, the last year we considered). The second dataset was based on the lyrics voluntarily provided to the website Musixmatch. With this dataset, we were able to analyse the lyrics of more than 150,000 English-language songs. These include worldwide examples, and therefore provide a wider, more diverse, sample. Here we found the same trends that we found in the Billboard dataset, so we can be confident that they can be generalised beyond top hits.

English-language popular songs have become more negative. The use of words related to negative emotions has increased by more than one third. Let’s take the example of the Billboard dataset. If we assume an average of 300 words per song, every year there are 30,000 words in the lyrics of the top-100 hits. In 1965, around 450 of these words were associated with negative emotions, whereas in 2015 their number was above 700. Meanwhile, words associated with positive emotions decreased in the same time period. There were more than 1,750 positive-emotion words in the songs of 1965, and only around 1,150 in 2015. Notice that, in absolute numbers, there are always more words associated with positive emotions than there are words associated with negative ones. This is a universal feature of human language, also known as the Pollyanna principle (from the flawlessly optimistic protagonist of the eponymous novel), and we would hardly expect this to reverse: what does matter, though, is the direction of the trends.
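
The arithmetic behind these figures can be laid out in a few lines of Python. This is a rough sketch that takes the rounded numbers quoted above at face value (the text says ‘around’, ‘above’ and ‘more than’, so the results are approximate), and the variable and function names are chosen for this illustration only.

```python
# Rounded figures quoted in the text; the 300-word average is an assumption
# made there for illustration, so every result below is approximate.
WORDS_PER_SONG = 300
SONGS_PER_YEAR = 100
total_words = WORDS_PER_SONG * SONGS_PER_YEAR

counts = {
    "negative": {1965: 450, 2015: 700},
    "positive": {1965: 1750, 2015: 1150},
}


def share(count):
    """Fraction of a year's lyric words that the count represents."""
    return count / total_words


def relative_change(start, end):
    """Change from start to end, as a fraction of the starting value."""
    return (end - start) / start


for label, by_year in counts.items():
    start, end = by_year[1965], by_year[2015]
    print(
        f"{label}: {share(start):.1%} of words in 1965, "
        f"{share(end):.1%} in 2015, change {relative_change(start, end):+.0%}"
    )
```

With these rounded inputs, negative-emotion words go from 1.5 per cent of all words to about 2.3 per cent, and positive-emotion words from about 5.8 per cent to about 3.8 per cent.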

The effect can be seen even when we look at single words: the usage of ‘love’, for example, practically halved in 50 years, going from around 400 to 200 instances. The word ‘hate’, by contrast, which until the 1990s was not even mentioned in any of the top-100 songs, is now used between 20 and 30 times each year.

Our results are consistent with other, independent analyses of song moods, some of which used completely different methodologies, and focused on other characteristics of the songs. For example, researchers analysed a dataset of 500,000 songs released in the UK between 1985 and 2015 and found a similar decrease in what they define as ‘happiness’ and ‘brightness’, coupled with a slight increase in ‘sadness’. These labels resulted from algorithms analysing low-level acoustic features, such as the tempo or the tonality. The tempo and the tonality of the top-100 Billboard songs were also examined: Billboard hits have become slower, and minor tonalities have become more frequent. Minor tonalities are perceived as gloomier than major tonalities. You can try this for yourself by listening to any of the YouTube examples of songs that have been digitally shifted from major to minor, or vice versa, and see how it feels: an unsettlingly happy major-shifted version of REM’s ‘Losing My Religion’ (1991) surfaces periodically on social media.

What is going on here? Discovering and describing trends is important and satisfying, but we also need to try to understand and explain them. In other words, big data needs big theory. One such big theory is cultural evolution. As the name implies, the theory holds that culture evolves over time, partly following the same principles as Darwinian natural selection: if there is variation, selection and reproduction, then we can expect more successful cultural traits to reach fixation in the population, and others to go extinct.

By culture, we mean any trait that is socially transmitted as opposed to genetically transmitted. Examples include the language that we speak depending on where we are born, the recipes we use when cooking and, in fact, the music we enjoy. These traits are transmitted socially, in that one individual learns them from observing and imitating other individuals. In contrast, hair colour and eye colour are genetically transmitted from parent to offspring.

The fact that many behaviours are socially learnt is not too surprising. However, for social learning to be adaptive – that is, for it to increase the likelihood of the individual surviving to reproduce – learning has to be selective. It is better to learn from an adult who knows how to cook well, than from siblings who are themselves still learning to cook. Preferentially copying the behaviour of successful individuals is called ‘success-biased transmission’ in cultural evolution lingo. Similarly, there are many other learning biases that might come into play, such as conformity bias, prestige bias or content bias. Learning biases have been used to understand a multitude of cultural traits in both human and nonhuman animal populations over the years, and are proving a fruitful avenue for understanding complex cultural patterns. To try to understand why song lyrics have increased in negativity and decreased in positivity over time, we employed the theory of cultural evolution to see if the pattern can be explained via social learning biases.

We checked for success bias by testing whether songs had more negative lyrics if the top-10 songs of the previous few years had negative lyrics: in other words, were songwriters predominantly influenced by the content of previously successful songs? Similarly, we tested for prestige bias by checking whether the songs of prestigious artists of the previous few years also had more negative lyrics. Prestigious artists were defined as those who appeared in the Billboard charts a disproportionate number of times, such as Madonna, who has 36 songs in the Billboard Hot 100. We checked for content bias by looking at whether songs with more negative lyrics also happened to do better in the charts. If this were the case, it would suggest that there was something about the content of negative lyrics that made the songs more appealing, and thus more popular.
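
As a rough illustration of the kind of quantity a success-bias check needs, the sketch below computes the average negativity of the top-10 songs over the previous few years. It is not the statistical model used in the study: the data layout, the function name, the three-year window and the toy numbers are all assumptions made for this example.

```python
def lagged_top10_negativity(charts, year, window=3):
    """Mean negative-word share of top-10 songs in the `window` years before `year`.

    `charts` maps a year to a list of (chart_rank, negative_share) pairs.
    Returns None when no top-10 song is available in the window.
    """
    shares = [
        negative_share
        for past_year in range(year - window, year)
        for rank, negative_share in charts.get(past_year, [])
        if rank <= 10
    ]
    if not shares:
        return None
    return sum(shares) / len(shares)


# Invented toy data: (chart rank, share of negative-emotion words).
toy_charts = {
    2000: [(1, 0.02), (5, 0.04), (40, 0.01)],
    2001: [(3, 0.03), (12, 0.05)],
    2002: [(2, 0.06), (8, 0.02), (11, 0.09)],
}

predictor_2003 = lagged_top10_negativity(toy_charts, 2003)
print(predictor_2003)
```

A value like this, computed for every year, could then be compared with the negativity of the songs released that year. A prestige-bias check could follow the same pattern, filtering by artist rather than by chart rank.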

Although we found some evidence for success and prestige bias operating in the datasets, content bias was the most reliable effect of the three in explaining the rise of negative lyrics. This is consistent with other findings in cultural evolution, in which negative information appears to be remembered and transmitted more than neutral or positive information. However, we also found that including unbiased transmission in our analytical models greatly reduced the appearance of success and prestige effects, and seemed to hold the most weight in explaining the patterns. ‘Unbiased transmission’ here can be thought of in a similar way to genetic drift, in which traits appear to drift to fixation through random fluctuations, and in the apparent absence of any selection pressure. This process has been found to explain the popularity of other cultural traits, from decorations in Neolithic pottery to contemporary baby names and dog breeds. Importantly, finding evidence of unbiased transmission does not mean that the patterns have no explanation or are predominantly random, but that there is likely a whole multitude of processes explaining the pattern, and that none of the processes we checked for is strong enough to dominate the explanation.
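
The drift analogy can be illustrated with a toy simulation of random copying. This is a generic sketch of unbiased transmission, not the analytical model used in the study: each year, every new song copies the trait (‘negative’ or not) of a song picked at random from the previous year, with no preference of any kind. The function name, population size and starting share are invented for the illustration.

```python
import random


def simulate_unbiased_copying(n_songs=100, start_share=0.3, n_years=50, seed=0):
    """Return the yearly share of 'negative' songs under pure random copying."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    n_negative = round(n_songs * start_share)
    # True marks a 'negative' song, False any other song.
    population = [True] * n_negative + [False] * (n_songs - n_negative)
    shares = [sum(population) / n_songs]
    for _ in range(n_years):
        # Every new song copies a randomly chosen song from the year before.
        population = [rng.choice(population) for _ in range(n_songs)]
        shares.append(sum(population) / n_songs)
    return shares


for seed in range(3):
    trajectory = simulate_unbiased_copying(seed=seed)
    print(f"seed {seed}: start {trajectory[0]:.2f}, end {trajectory[-1]:.2f}")
```

Different seeds give different trajectories even though nothing favours either kind of song, which is the sense in which a trait can drift in the apparent absence of any selection pressure.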

The rise of negative lyrics in popular English-language songs is a fascinating phenomenon, and we showed that this can be due to a widespread preference for negative content plus some other, yet to be discovered, causes. Given this preference, what we need to explain is why pop-song lyrics before the 1980s were more positive than today. It could be that a more centralised record industry had more control over the songs that were produced and sold. A similar effect could have been brought about by the diffusion of more personalised distribution channels (from blank cassette tapes to Spotify’s ‘Made For You’ algorithmic tailoring). And other, broader, societal changes could have contributed to making it more acceptable, or even rewarded, to explicitly express negative feelings. All these hypotheses could be tested using the data described here as a starting point. Realising that there’s more work to be done to better understand the pattern is always a good sign in science. It leaves room for fine-tuning theories, improving analysis methods, or sometimes going back to the drawing board to ask different questions.