A massive language study, spanning Google Books, Twitter, popular song lyrics and The New York Times, has found that English tends to look on the bright side of things. Positive words outnumber the negative.

The findings are preliminary, but offer a glimpse of the origins and fundamental nature of English, and perhaps of language itself.

"In taking the view that humans are in part storytellers – Homo narrativus – we can look to language itself for quantifiable evidence of our social nature," wrote mathematicians from Cornell University and the University of Vermont in an Aug. 29 arxiv paper.

While traditional explanations for the exceptionally rich evolution of human language have involved explicitly goal-directed behaviors like coordinating a hunt, some anthropologists see language as a vehicle for humanity's essential social characteristics, especially our capacities for sharing, altruism and other "pro-social" behavior. From this perspective, language should reflect underlying social imperatives.

However, earlier research into emotional and social architectures underlying English has returned conflicting results. Relatively small-scale analyses find that frequently used words tend to have positive rather than negative emotional connotations: the so-called Pollyanna hypothesis, which states that pleasant, optimistic concepts spread more easily than negative, pessimistic sentiments. But in experimental settings, people prompted to convey emotion have tended to be negative.

Led by the University of Vermont's Isabel Kloumann, the researchers decided to approach the question with overwhelming mathematical force. They analyzed four enormous textual databases – 361 billion words in 3.29 million books on Google Books, 9 billion words in 821 million tweets issued between 2008 and 2010, 1 billion words in 1.8 million New York Times articles published from 1987 to 2007, and 58.6 million words from the lyrics of 295,000 popular songs – and compiled for each a list of the 5,000 most-used words.

Merging the four lists produced a combined list of 10,122 words. The researchers then used Amazon's Mechanical Turk labor-outsourcing service to obtain 50 separate evaluations of each word, which were scored from negative to positive on a scale of 1 to 9. ("Terrorist," for example, received an average score of 1.30, while "laughter" merited an 8.50, the highest of any word.)
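The procedure described above (take the most-used words from each source, merge the lists, then average many human ratings per word) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' code: the function names, the tiny toy corpora and the toy ratings are all invented here, and the real study used the 5,000 most-used words per source and 50 ratings per word.

```python
from collections import Counter
from statistics import mean


def top_words(tokens, n):
    """Return the n most frequent words in a list of tokens."""
    return [word for word, _ in Counter(tokens).most_common(n)]


def merge_word_lists(word_lists):
    """Combine several word lists into one, dropping duplicates
    while keeping first-seen order."""
    merged = {}
    for words in word_lists:
        for word in words:
            merged.setdefault(word, None)
    return list(merged)


def average_ratings(ratings):
    """Map each word to the mean of its 1-to-9 ratings."""
    return {word: mean(scores) for word, scores in ratings.items()}


# Toy corpora (hypothetical); the study used four sources and n = 5,000.
corpora = {
    "books": "the happy dog the happy cat the".split(),
    "tweets": "happy happy happy war war day".split(),
}
merged = merge_word_lists(top_words(tokens, 2) for tokens in corpora.values())

# Toy ratings (hypothetical); the study collected 50 ratings per word.
toy_ratings = {"the": [5, 5, 5], "happy": [8, 9, 8], "war": [1, 2, 1]}
averages = average_ratings(toy_ratings)
```

Because the most-used words overlap heavily between sources, the merged list is much shorter than the sum of its parts, which is consistent with four lists of 5,000 words yielding 10,122 distinct entries.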

Altogether, positive-inflected words outnumbered the negative, and were used more frequently. The findings "suggest that a positivity bias is universal," wrote Kloumann and colleagues. "In our stories and writings we tend toward pro-social communication."
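The two claims in that result, that positive words are more numerous and that they are used more often, correspond to two different measurements: a simple count of words on each side of neutral, and an average score weighted by how often each word appears. The sketch below illustrates the distinction. It assumes a neutral midpoint of 5 on the 1-to-9 scale, which the article does not state; the scores for "laughter" and "terrorist" are the ones reported above, while the other scores and all of the usage counts are made up for illustration.

```python
NEUTRAL = 5.0  # assumed midpoint of the 1-to-9 scale, not taken from the study


def positivity_summary(scores, counts):
    """Count positive and negative words, and compute the
    usage-weighted mean score across all words."""
    n_positive = sum(1 for s in scores.values() if s > NEUTRAL)
    n_negative = sum(1 for s in scores.values() if s < NEUTRAL)
    total_uses = sum(counts[word] for word in scores)
    weighted_mean = sum(scores[word] * counts[word] for word in scores) / total_uses
    return {
        "n_positive": n_positive,
        "n_negative": n_negative,
        "weighted_mean": weighted_mean,
    }


# "laughter" and "terrorist" use the averages quoted in the article;
# "table", "friend" and every usage count are hypothetical.
scores = {"laughter": 8.5, "terrorist": 1.3, "table": 5.0, "friend": 7.5}
counts = {"laughter": 40, "terrorist": 10, "table": 30, "friend": 20}

summary = positivity_summary(scores, counts)
```

A usage-weighted mean above the neutral point indicates that the words people actually reach for skew positive, independent of how many distinct positive words exist.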

The study raises many questions: What would it be like to live and learn in a language that skewed negative? How might the emotional charge of a language correlate with cultural norms and social features? Under the evolutionary pressures that shape languages as well as organisms, would a negative language be any more or less fit? How do other languages stack up?

Other languages and dialects need to be studied, wrote Kloumann's team, and phrases and sentences should be analyzed as precisely as individual words.

Image: Scott & Elaine van der Chijs

Citation: "Positivity of the English language." By Isabel M. Kloumann, Christopher M. Danforth, Kameron Decker Harris, Catherine A. Bliss, Peter Sheridan Dodds. arXiv, August 29, 2011.