by Muhammad Aurangzeb Ahmad

Science fiction literature is full of historical what-ifs that speculate on how the world would have looked if certain events had gone differently: if the Confederates had won the American Civil War, if the Western Roman Empire had not fallen, if Islam had made inroads into the imperial household in China, and so on. At best these are speculations that we entertain to shed light on our own world. But imagine if there were a way to gauge how societies react under particular environmental constraints, social structures and stresses. Simulation is often described as the Third Paradigm of science, and the field of Social Simulation seeks to model social phenomena that cannot otherwise be studied because of practical and ethical constraints. Isaac Asimov envisioned such a science of predicting the future, which he called psychohistory, in his Foundation series of science fiction novels.

The history of social simulation can be traced back to the idea of Cellular Automata developed by Stanisław Ulam and John von Neumann. A cellular automaton is a system of cells, each of which interacts with its neighbors according to a set of rules. The most famous example is Conway’s Game of Life, a very simple simulation that generates self-organizing patterns which one could not really have predicted just by knowing the rules.

To illustrate the concept of Social Simulation, consider Schelling’s model of how racial segregation happens. Take a two-dimensional grid where each cell represents an individual. The cells are divided into two groups, represented by different colors. Initially the cells are seeded randomly in the grid, representing an integrated neighborhood. The cells, however, have a preference regarding what percentage of their neighbors should belong to the same group (color). The simulation is run for a large number of steps. At each step a person (cell) checks whether the share of same-group neighbors falls below a predefined threshold; if it does, the person moves by a single cell. If the share meets the threshold, the person remains at the current position. Even with such a simple setup we observe that the integrated neighborhood slowly becomes segregated, so that after some iterations the neighborhood is completely segregated. The evolution of the simulation can be observed in Figure 1. The main lesson is that even without overt racism, merely having a preference about one’s neighbors can lead to a segregated neighborhood.
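The segregation model described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Schelling's original formulation: the grid size, the share of empty cells, the 40% threshold, and all function names are hypothetical choices made for this example. It also uses a common simplification in which an unhappy person relocates to a randomly chosen empty cell rather than moving by a single cell.

```python
import random

EMPTY = 0  # groups are labelled 1 and 2; 0 marks a vacant cell (an assumption)


def make_grid(size, empty_fraction, rng):
    """Randomly seed a size x size grid with two equal groups and some vacancies."""
    n_cells = size * size
    n_empty = int(n_cells * empty_fraction)
    n_agents = n_cells - n_empty
    cells = [EMPTY] * n_empty + [1] * (n_agents // 2) + [2] * (n_agents - n_agents // 2)
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def same_group_share(grid, r, c):
    """Fraction of occupied neighbors (8-cell neighborhood) in the same group."""
    size = len(grid)
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < size and 0 <= nc < size and grid[nr][nc] != EMPTY:
                total += 1
                same += grid[nr][nc] == grid[r][c]
    # A person with no occupied neighbors is treated as content.
    return same / total if total else 1.0


def occupied(grid):
    """Coordinates of every occupied cell."""
    size = len(grid)
    return [(r, c) for r in range(size) for c in range(size) if grid[r][c] != EMPTY]


def step(grid, threshold, rng):
    """Move each currently unhappy person to a random empty cell; return how many moved."""
    size = len(grid)
    unhappy = [(r, c) for r, c in occupied(grid) if same_group_share(grid, r, c) < threshold]
    empties = [(r, c) for r in range(size) for c in range(size) if grid[r][c] == EMPTY]
    if not empties:
        return 0  # nowhere to move
    rng.shuffle(unhappy)
    for r, c in unhappy:
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], EMPTY
        empties[i] = (r, c)  # the vacated cell becomes available
    return len(unhappy)


def mean_same_share(grid):
    """Average same-group share over all people: a rough segregation index."""
    agents = occupied(grid)
    if not agents:
        return 0.0
    return sum(same_group_share(grid, r, c) for r, c in agents) / len(agents)


def run(size=20, empty_fraction=0.1, threshold=0.4, steps=50, seed=0):
    """Run the sketch and return the segregation index before and after."""
    rng = random.Random(seed)
    grid = make_grid(size, empty_fraction, rng)
    before = mean_same_share(grid)
    for _ in range(steps):
        if step(grid, threshold, rng) == 0:
            break  # everyone is content
    return before, mean_same_share(grid), grid


if __name__ == "__main__":
    before, after, _ = run()
    print(f"mean same-group share: {before:.2f} -> {after:.2f}")
```

Running the script prints the average share of same-group neighbors at the start and at the end. Varying `threshold` is a simple way to explore how strongly a mild individual preference shapes the aggregate outcome.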

Schelling’s model of segregation is almost half a century old, and much more sophisticated models of social and economic phenomena have been created since then. One such pioneering simulation, which came out in the early 1990s, was Sugarscape, which simulated things like how populations respond to changes in the environment and the availability of resources. Using data about rainfall, soil fertility and Native American settlements, scientists from the then newly formed Santa Fe Institute tried to simulate the settlement patterns of the Anasazi Indians in the American Southwest over the course of centuries. By changing the parameters of the simulations, anthropologists were able for the first time to simulate how a group of people could have responded to their changing environment given certain constraints. Thus the field of simulated archeology was born. There was, however, one problem, which to this day has not really been solved: although many parameters of the simulation could be varied, only within a very narrow range of parameter values did the simulation reproduce what the historical data showed. Thus the question arises: are we really simulating the behaviors of people in this case, or are we forcing a mathematical model to fit a given set of data? The answer is not clear, and researchers have taken strong positions on either side of the debate.

The virtual Anasazi of the simulations had a number of real-world characteristics such as procreation, food consumption, resource exploitation and migration, but there is only so much data that one can collect about the past, especially about a civilization that existed a thousand years ago. As described previously, the virtual Anasazi simulation is too sensitive to its parameters. Another failure of the model was that, even though it approximated the rise and fall of the Anasazi fairly well, it predicted that the Anasazi should still have had a substantial population when in fact they abandoned their dwellings for good. An interesting question arises here: could one do better with access to more detailed data about the Anasazi, or about any other group of people? This question is no longer hypothetical, given that we now routinely collect data about hundreds of millions of people. Could such data be used to better model modern societies? Some people may scoff at the idea and state that, given the complexity of human societies, such an endeavor is impossible. That said, it is possible to predict aggregate behaviors even while recognizing that individuals are unique. Large masses of people do exhibit certain behaviors that can be described by statistical properties. It is the interaction of people, and not just their individual behaviors, that one has to get right. Recent advances in Big Data and the science of simulation move us one step closer to such a possibility. One might not be able to predict the behavior of all the people all the time, but one might predict most of the people most of the time. This might be sufficient for something approximating Asimov’s psychohistory in the real world.

It may be the case that such a project is unfeasible, because in order to make it work one would have to severely violate people’s privacy to collect sufficiently rich data. In the end there might be a trade-off that has to be made between collecting data for sufficiently rich simulations and preserving people’s privacy. There is a second tension as well. On one hand, one wants to keep the simulations simple so that one can study the effect of a particular phenomenon, e.g., racial or national preference in the case of neighborhoods. On the other hand, this runs the risk of oversimplifying a human phenomenon where multiple factors may be at play. After all, humans are complex creatures with multiple, often contradictory, proclivities that can yield unexpected results.

A word of caution should be added here: even more than half a century after its inception, simulation does not enjoy near-universal acceptance in the Social Sciences. Historical data collected on urbanization patterns, climate change, land usage, conflicts and the like could also be used to get a better understanding of the local history of different civilizations; imagine, for example, what one could learn about the history of Europe and the Middle East by applying such modeling techniques to the Black Death. We are living at the beginning of the age of Big Data. Marrying Big Data and simulations can do a great deal of social good or evil, depending upon how one uses this technology. Predictability and algorithmic control may become facts of life for our descendants.