What do two Twitter users who live halfway around the world from each other have in common? They might speak the same “super-dialect”. An analysis of millions of Spanish tweets found two popular speaking styles: one favoured by people living in cities, another by those in small rural towns.

Bruno Gonçalves at Aix-Marseille University in France and David Sánchez at the Institute for Cross-Disciplinary Physics and Complex Systems in Palma, Majorca, Spain, analysed more than 50 million tweets sent over a two-year period. Each tweet was tagged with a GPS marker showing whether the message came from a user somewhere in Spain, Latin America, or Spanish-speaking pockets of Europe and the US.

The team then searched the tweets for variations on common words. Someone tweeting about their socks might use the word calcetas, medias, or soquetes, for example. Another person referring to their car might call it their coche, auto, movi, or one of three other variations with roughly the same meaning. By comparing these word choices to where they came from, the researchers were able to map preferences across continents (arxiv.org/abs/1407.7094).
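The counting step described above can be illustrated with a minimal sketch. It is not the researchers' code: the function name, the region labels and the sample tweets are invented for illustration, and the variant table holds only the words named in this article.

```python
from collections import Counter, defaultdict

# Hypothetical concept-to-variant table, limited to words named in the article.
VARIANTS = {
    "socks": {"calcetas", "medias", "soquetes"},
    "car": {"coche", "auto", "movi"},
}


def preferred_variants(tweets):
    """Return {region: {concept: most frequent variant}}.

    `tweets` is an iterable of (region, text) pairs. In the study the
    location came from each tweet's GPS tag; here it is a plain label.
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for region, text in tweets:
        for word in text.lower().split():
            for concept, variants in VARIANTS.items():
                if word in variants:
                    counts[region][concept][word] += 1
    return {
        region: {
            concept: counter.most_common(1)[0][0]
            for concept, counter in by_concept.items()
        }
        for region, by_concept in counts.items()
    }


# Made-up sample data, not taken from the study.
sample = [
    ("region_a", "mi coche nuevo"),
    ("region_a", "el coche rojo"),
    ("region_b", "mi auto nuevo"),
    ("region_b", "no encuentro mis medias"),
]
result = preferred_variants(sample)
print(result)
```

Mapping the preferred variant for each region, concept by concept, gives the kind of picture from which broader groupings of similar regions could then be drawn.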

According to their data, Twitter users in major cities thousands of miles apart, like Quito in Ecuador and San Diego in California, tend to have more language in common with each other than with a person tweeting from the nearby countryside, probably due to the influence of mass media.


Studies like these may allow us to dig deeper into how language varies across place, time and culture, says Eric Holt at the University of South Carolina in Columbia.

This article appeared in print under the headline “Super-dialects exposed via millions of tweets”