At this point in the deafening media cycle around the story, it’s probably unnecessary to summarize the ongoing Facebook/Cambridge Analytica scandal, but briefly and just in case: Facebook recently announced the suspension of a marketing data company called Cambridge Analytica from its platform after a whistleblower confirmed it had misused ill-gotten Facebook data to construct so-called “psychographic” models and help Trump win the presidency.

For the impatient, my fundamental thesis is this: Cambridge Analytica’s data theft and targeting efforts probably didn’t even work, but Facebook should be embarrassed anyhow.

For the more patient: What on earth is the sinister-sounding “psychographics” about, and how is your Facebook data involved?

Antonio García Martínez (@antoniogm) is an Ideas contributor for WIRED. Before turning to writing, he dropped out of a doctoral program in physics to work on Goldman Sachs’ credit trading desk, then joined the Silicon Valley startup world, where he founded his own startup (acquired by Twitter in 2011), and finally joined Facebook’s early monetization team, where he headed its targeting efforts. His 2016 memoir, Chaos Monkeys, was a New York Times best seller and NPR Best Book of the Year, and his writing has appeared in Vanity Fair, The Guardian, and The Washington Post. He splits his time between a sailboat on the SF Bay and a yurt in Washington’s San Juan Islands.

The awkward portmanteau coinage of “psychographics” is meant to be a riff on “demographics” (i.e., age, gender, geography), which are the usual parameters marketers use to talk about advertising audiences. The difference here is that the marketer attempts to capture some essential psychological state, or some particular combination of values and lifestyle, that implies a proclivity for the product in question. If it sounds nebulous, not to say somewhat astrological, it is. As a great example of the type of cartoonish zodiac that emerges from this approach, take the age-old classic, the Claritas PRIZM segments (now owned and marketed by Nielsen), which have been around since the ’90s. One sample segment:

Kids & Cul-de-Sacs: Upscale, suburban, married couples with children - that's the skinny on Kids & Cul-de-Sacs, an enviable lifestyle of large families in recently built subdivisions. […] Their nexus of education, affluence and children translates into large outlays for child-centered products and services.

This sort of caricature of a consumer segment was created as much for potential targeting as for populating ad agency pitches to clients. It took a complex and bewildering world of consumer data and preferences and reduced them to a neat mythology of just-so stories that got ad budgets approved. (“Aspirational Annie wants a starter car!” “Gregarious Greg spends over $400 per month on entertainment!”)

With the rise of programmatic, software-driven advertising in the late aughts, these truthy marketing fairy tales have taken on a more quantitative tinge. Which, in the context of Facebook and Cambridge Analytica, is where the psychometricians at Cambridge University come in. Two researchers at the Department of Psychology there, Michal Kosinski and David Stillwell, had endeavored to craft completely algorithmic approaches to human psychological evaluation. Those efforts included a popular 2007 Facebook app called myPersonality that allowed Facebook users to take a psychometric test and see themselves ranked against the ‘Big Five’ personality traits of openness, conscientiousness, extraversion, agreeableness, and neuroticism (often shortened to OCEAN). According to the report in The Guardian, which first ran the whistleblower’s claims, Cambridge Analytica had approached the authors of the myPersonality app for help with its ads targeting campaign. On being rebuffed, another researcher associated with Cambridge’s psychology faculty, Aleksandr Kogan, offered to step in and reproduce the model.

(Interestingly, you can still take some of their psychometric personality tests here. Don’t worry! No Facebook login required!)

Academic research centers with experimental volunteers and small sample sizes are one thing, but how do you study psychographics at Facebook scale? With an app, of course. Kogan wrote a Facebook app that asked Facebook users to walk through the researchers’ computer-driven rating criteria with the specific view of ranking their ‘OCEAN’ characteristics, plus political inclinations.

Here is where the skullduggery comes in: Let’s assume you build a model that can actually predict a voter’s likelihood of voting for Trump or Brexit based on some set of polled psychological traits. For it to be more than a research paper, you need to somehow exploit the model for actual ads targeting. But the problem is that Facebook doesn’t actually give you the tools to target a psychological state of mind (not yet, anyway); it only offers pieces of user data such as Likes. To effectively target an ad, Kogan would need to peg diffuse characteristics like neuroticism and openness to a series of probable Facebook Likes, and for Cambridge Analytica, he had to do it at a large scale.
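As a rough illustration of what pegging a trait to Likes could look like, here is a minimal Python sketch. It is not Kogan’s or Cambridge Analytica’s actual method, which hasn’t been made public in this detail. The data is invented, and the function name `rank_likes_by_lift`, the `min_support` parameter, and the choice of a simple “lift” ratio are all assumptions made for the example.

```python
from collections import defaultdict

# Invented toy data: each entry is (set of Likes, scored high on some trait?).
# In the scenario described above, the trait label would come from a quiz.
TOY_USERS = [
    ({"hiking", "jazz"}, True),
    ({"hiking", "poetry"}, True),
    ({"jazz", "trucks"}, False),
    ({"trucks", "grilling"}, False),
    ({"hiking", "grilling"}, True),
    ({"jazz", "poetry"}, False),
]


def rank_likes_by_lift(users, min_support=2):
    """Rank Likes by how over-represented the trait is among their likers.

    lift = (trait rate among people with this Like) / (overall trait rate).
    A lift above 1.0 means the Like is a better-than-random proxy.
    Likes held by fewer than `min_support` users are ignored as too noisy.
    """
    if not users:
        return []
    base_rate = sum(1 for _, has_trait in users if has_trait) / len(users)
    if base_rate == 0:
        return []  # nobody has the trait, so lift is undefined

    likers = defaultdict(int)        # Like -> number of users with it
    trait_likers = defaultdict(int)  # Like -> number of those with the trait
    for likes, has_trait in users:
        for like in likes:
            likers[like] += 1
            if has_trait:
                trait_likers[like] += 1

    ranked = [
        (like, (trait_likers[like] / count) / base_rate)
        for like, count in likers.items()
        if count >= min_support
    ]
    # Highest lift first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


ranked = rank_likes_by_lift(TOY_USERS)
for like, lift in ranked:
    print(f"{like}: lift {lift:.2f}")
```

On this toy data, “hiking” comes out on top with a lift of 2.0 and “trucks” at the bottom with 0.0. The Likes at the top of such a list are the ones an advertiser could plug into an interest-based targeting tool as stand-ins for the trait. With six fake users the ranking means nothing, of course, and a noisy ranking of this kind is part of the reason to doubt how well the approach works.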

Whether Kogan’s subjects realized it or not when they opted in to his Facebook app, they allowed him to read some of their Facebook profile data. And for his collaboration with Cambridge Analytica, Kogan then hoovered up those users’ data, plus their friends’ data as well. (Facebook’s platform rules allowed this until mid-2015.) That’s how the number of compromised users got as high as a reported 50 million. Kogan and Cambridge Analytica didn’t lure that many test subjects. They simply paid for or attracted hundreds of thousands, and pulling data from their subjects’ friends got them something like a third of the US electorate.

With the Facebook police asleep, and data theft pulled off, what was Cambridge Analytica’s next step?

They had to train a predictive model that guessed what sorts of Likes or Facebook profile data their targeted political archetypes possessed. In other words, now that Cambridge Analytica had a test set of people likely to vote for Trump, and knew their profile data, how could it turn around and create a set of profile data the Trump campaign could input into the Facebook targeting system to reach more people like them?