Cell phone tower data predict which parts of London can expect a spike in crime (1). Google searches for polling place information on the day of an election reveal the consequences of different voter registration laws (2). Mathematical models explain how interactions among financial investors produce better yields, and even how they generate economic bubbles (3).

Using cell-phone and taxi GPS data, researchers classified people in San Francisco into “tribal networks,” clustering them according to their behavioral patterns. Students, tourists, and businesspeople all travel through the city in different ways, congregating and socializing in different neighborhoods. Image courtesy of Alex Pentland (Massachusetts Institute of Technology, Cambridge, MA).

These are just a few examples of how a suite of technologies is helping bring sociology, political science, and economics into the digital age. Such social science fields have historically relied on interviews and survey data, as well as censuses and other government databases, to answer important questions about human behavior. These tools often produce results based on individuals—showing, for example, that a wealthy, well-educated, white person is statistically more likely to vote (4)—but struggle to deal with complex situations involving the interactions of many different people.

A growing field called “computational social science” is now using digital tools to analyze the rich and interactive lives we lead. The discipline uses powerful computer simulations of networks, data collected from cell phones and online social networks, and online experiments involving hundreds of thousands of individuals to answer questions that were previously impossible to investigate. Humans are fundamentally social creatures, and these new tools and huge datasets are giving social scientists insights into exactly how connections among people create societal trends or heretofore undetected patterns related to everything from crime to economic fortunes to political persuasions. Although the field provides powerful ways to study the world, it’s an ongoing challenge to ensure that researchers collect and store the requisite information safely, and that they and others use that information ethically.

Society in High Resolution

Although it builds on traditional methods, computational social science is a young discipline. In February 2009, 15 researchers published a paper in Science announcing the emergence of the field (5). Computer scientist Alex Pentland of the Massachusetts Institute of Technology, one of the paper’s coauthors, admits that declaring the birth of a new field was “a bit cheeky.” But the article made a splash and has since been cited more than 500 times, according to the Web of Science.

New technology has made possible the types of observations driving the field’s growth. A social scientist in the 1930s had to go door to door asking people how much money they had spent the previous year. Today, researchers can follow transactions across an entire city, on millisecond timescales, through credit card data. This incredible abundance of data is allowing computational social science practitioners to tease out much more subtle, high-resolution results than older methods could ever have provided. “It’s like having an electron microscope versus a light microscope,” says sociologist Michael Macy of Cornell University in Ithaca, New York.

Powerful computer simulations have been a particular boon to the field. Starting in the mid- to late-2000s, researchers showed that as cities add more residents, many of their traits, from gross domestic products to patents per head to crime and sexually transmitted disease transmission rates, increase exponentially. For example, a doubling in population led to an average 130% increase in economic productivity. But nobody could figure out exactly why this should be.

Pentland and a team of colleagues investigated this phenomenon with a computer model that simulated social ties in virtual cities of 10 thousand to 10 million residents (6). They found that, as the population density grew, the number of interactions each individual could have increased by an exponential factor. From their model, they derived a mathematical curve that almost perfectly predicted the observational data from cities around the globe.

The work (6) suggested possible ways to improve real-world cities that didn’t seem to be living up to their potential; for example, in third-world countries the exponential increase in productivity didn’t materialize, despite increasing populations. The team believes it’s because the transportation networks in these places are usually underdeveloped, meaning that people can’t get around and interact with one another easily. “So if you want to make a richer city, make transportation better,” says Pentland.

Where people hail from in the Mexico City area, here indicated by different colors, feeds into a crime-prediction model devised by Alex Pentland and colleagues (6). Image courtesy of Alex Pentland (Massachusetts Institute of Technology, Cambridge, MA).
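
The doubling figure quoted in this section can be turned into a single scaling number with a little arithmetic. The sketch below assumes, as the urban-scaling literature commonly does but this article does not state, that a city's output follows a power law Y = c * N**beta in population N. Under that assumption, a 130% gain per doubling means 2**beta = 2.3. The function names are illustrative only, and this is not the model from reference (6).

```python
import math


def scaling_exponent(pct_increase_per_doubling: float) -> float:
    """Exponent beta in Y = c * N**beta implied by the gain per doubling.

    Assumes a power-law relationship, which is an assumption of this sketch.
    """
    growth_factor = 1.0 + pct_increase_per_doubling / 100.0
    return math.log2(growth_factor)


def output_ratio(population_ratio: float, beta: float) -> float:
    """Factor by which output grows when population grows by population_ratio."""
    return population_ratio ** beta


beta = scaling_exponent(130.0)  # the 130% figure quoted in the text
print(f"implied exponent: {beta:.2f}")
print(f"output ratio for a city 10x larger: {output_ratio(10, beta):.1f}")
```

An exponent above 1 is what "more than proportional" growth means here: output per resident rises as the city grows, rather than staying flat.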

Mining the Social Network

As digital means have come to dominate how we communicate, social scientists have also discovered a great deal more about our real-life interactions. Every day we share links on Facebook, publish pictures on Instagram, and listen to music on Spotify. “Each time we express our views, send an email, or post something online, we generate breadcrumbs of behavior,” notes political scientist Solomon Messing of Stanford University in California.

Cell phone data in particular have become a valuable computational social science tool. Research from David Lazer and his colleagues has shown how mobile phones can lead to better predictions of unemployment rates (7). On the surface, the two don’t seem to have anything in common. But cell towers provide a proxy for people’s movements, and the employed have different movement patterns than the unemployed. The most obvious disparity: the employed tend to regularly travel back and forth between two points on weekdays.

“The thing you have to remember about unemployment statistics is they’re very slow and noisy,” says Lazer, who teaches political science and communication at Northeastern University in Boston. It takes months to collect and publish such conventional unemployment data, which can contain errors simply because people sometimes don’t immediately admit that they’re unemployed. Cell phone towers provided Lazer’s team with information that was both more up to date and more fine-grained than that gathered via traditional means. With these data in hand, they were able to accurately forecast unemployment rates up to four months before the release of official reports.
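
The commuting signal described in this section can be made concrete with a toy feature. The sketch below is not the method used by Lazer's team (7). It assumes a hypothetical, simplified record format of (date, tower ID) sightings for one phone, and measures the share of observed weekdays on which the phone was seen at both of its two most frequently visited towers. The function name, record format, and sample data are all invented for illustration.

```python
from collections import Counter, defaultdict
from datetime import date


def weekday_commute_share(pings):
    """Share of observed weekdays on which a phone visits both of its two most-used towers.

    pings: iterable of (date, tower_id) pairs (a hypothetical record format).
    """
    towers_by_day = defaultdict(set)
    for day, tower in pings:
        if day.weekday() < 5:  # Monday is 0, Friday is 4; weekends are skipped
            towers_by_day[day].add(tower)
    if not towers_by_day:
        return 0.0

    # For each tower, count the number of weekdays on which it was seen.
    day_counts = Counter(t for towers in towers_by_day.values() for t in towers)
    if len(day_counts) < 2:
        return 0.0

    # Treat the two most regularly seen towers as the candidate commute endpoints.
    anchors = {tower for tower, _ in day_counts.most_common(2)}
    commute_days = sum(1 for towers in towers_by_day.values() if anchors <= towers)
    return commute_days / len(towers_by_day)


# Invented sample: one working week (2 to 6 March 2015, Monday to Friday).
week = [date(2015, 3, d) for d in range(2, 7)]
commuter = [(day, tower) for day in week for tower in ("home", "office")]
irregular = [(day, "home") for day in week] + [
    (week[1], "mall"),
    (week[3], "park"),
]

print(weekday_commute_share(commuter))
print(weekday_commute_share(irregular))
```

A real analysis would need far more than this single number (time of day, distance between towers, aggregation across many phones), but the example shows how a regular two-point weekday pattern separates cleanly from irregular movement.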