It was bound to happen. Someone was inevitably going to have to write this book, entitled Social Physics, and now someone has just up and done it. Namely, Alex “Sandy” Pentland, data scientist evangelist, director of MIT’s Human Dynamics Laboratory, and co-founder of the MIT Media Lab.

A review by Nicholas Carr

This article entitled The Limits of Social Engineering, published in MIT’s Technology Review and written by Nicholas Carr (hat tip Billy Kaos) is more or less a review of the book. From the article:

Pentland argues that our greatly expanded ability to gather behavioral data will allow scientists to develop “a causal theory of social structure” and ultimately establish “a mathematical explanation for why society reacts as it does” in all manner of circumstances. As the book’s title makes clear, Pentland thinks that the social world, no less than the material world, operates according to rules. There are “statistical regularities within human movement and communication,” he writes, and once we fully understand those regularities, we’ll discover “the basic mechanisms of social interactions.”

By collecting all the data – credit card, sensor, cell phones that can pick up your moods, etc. – Pentland seems to think we can put the science into social sciences. He thinks we can predict a person like we now predict planetary motion.

OK, let’s just take a pause here to say: eeeew. How invasive does that sound? And how insulting is its premise? But wait, it gets way worse.

The next think Pentland wants to do is use micro-nudges to affect people’s actions. Like paying them to act a certain way, and exerting social and peer pressure. It’s like Nudge in overdrive.

Vomit. But also not the worst part.

Here’s the worst part about Pentland’s book, from the article:

Ultimately, Pentland argues, looking at people’s interactions through a mathematical lens will free us of time-worn notions about class and class struggle. Political and economic classes, he contends, are “oversimplified stereotypes of a fluid and overlapping matrix of peer groups.” Peer groups, unlike classes, are defined by “shared norms” rather than just “standard features such as income” or “their relationship to the means of production.” Armed with exhaustive information about individuals’ habits and associations, civic planners will be able to trace the full flow of influences that shape personal behavior. Abandoning general categories like “rich” and “poor” or “haves” and “have-nots,” we’ll be able to understand people as individuals—even if those individuals are no more than the sums of all the peer pressures and other social influences that affect them.

Kill. Me. Now.

The good news is that the author of the article, Nicholas Carr, doesn’t buy it, and makes all sorts of reasonable complaints about this theory, like privacy concerns, and structural sources of society’s ills. In fact Carr absolutely nails it (emphasis mine):

Pentland may be right that our behavior is determined largely by social norms and the influences of our peers, but what he fails to see is that those norms and influences are themselves shaped by history, politics, and economics, not to mention power and prejudice. People don’t have complete freedom in choosing their peer groups. Their choices are constrained by where they live, where they come from, how much money they have, and what they look like. A statistical model of society that ignores issues of class, that takes patterns of influence as givens rather than as historical contingencies, will tend to perpetuate existing social structures and dynamics. It will encourage us to optimize the status quo rather than challenge it.

How to see how dumb this is in two examples

This brings to mind examples of models that do or do not combat sexism.

First, the orchestra audition example: in order to avoid nepotism, they started making auditioners sit behind a sheet. The result has been way more women in orchestras.

This is a model, even if it’s not a big data model. It is the “orchestra audition” model, and the most important thing about this example is that they defined success very carefully and made it all about one thing: sound. They decided to define the requirements for the job to be “makes good sounding music” and they decided that other information, like how they look, would be by definition not used. It is explicitly non-discriminatory.

By contrast, let’s think about how most big data models work. They take historical information about successes and failures and automate them – rather than challenging their past definition of success, and making it deliberately fair, they are if anything codifying their discriminatory practices in code.

My standard made-up example of this is close to the kind of thing actually happening and being evangelized in big data. Namely, a resume sorting model that helps out HR. But, using historical training data, this model notices that women don’t fare so well historically at a the made-up company as computer programmers – they often leave after only 6 months and they never get promoted. A model will interpret that to mean they are bad employees and never look into structural causes. And moreover, as a result of this historical data, it will discard women’s resumes. Yay, big data!

Thanks, Pentland

I’m kind of glad Pentland has written such an awful book, because it gives me an enemy to rail against in this big data hype world. I don’t think most people are as far on the “big data will solve all our problems” spectrum as he is, but he and his book present a convenient target. And it honestly cannot surprise anyone that he is a successful white dude as well when he talks about how big data is going to optimize the status quo if we’d just all wear sensors to work and to bed.