Please enjoy this guest post by Sean Nixon!

The heart of science is measurement. Seconds came out of the Middle Ages as a measure of time, meters were created in 1793 to standardize the measure of distance, amperes were developed in the 1880s to handle the study of electrical currents, and in the early 1900s, scientists (or rather statisticians), drunk on their previous success, invented p-values to measure truth. More precisely, a p-value measures how likely it is that pure chance would produce results at least as striking as the experiment’s if the effect the scientist is looking for did not exist at all. A p-value near 0 means the results would be very hard to explain away as a fluke, and a p-value near 1 means the experiment gave no support for the hypothesis (all p-values lie between 0 and 1). Ronald Fisher popularized the use of p-values in the 1920s and 30s and set the standard of p<0.05 as the arbitrary cutoff for truth (usually referred to as statistical significance). This had the predictable corollary of making p-values less than 0.05 the litmus test for publishable research. Searching through data for publishable p-values is called p-hacking.

(If you are unfamiliar with p-hacking, the article “Science Isn’t Broken” has an absolutely fantastic interactive illustration that lets you try your hand at p-hacking in real time.)

Briefly, p-hacking occurs when researchers attempt to bend the numbers to fit their narrative. In an idealized situation, statistics aims to divine truth from the tea leaves of chaos, but translating this analysis into something comprehensible for mere mortals requires constructing a narrative around the results. It’s the difference between saying “subjects who added twenty grams of dark chocolate to their weekly diet over the course of twelve months showed an average decrease in blood pressure of 5 ± 2.43 mmHg,” and saying “dark chocolate is good for combating the effects of stress.” Now imagine that eating the dark chocolate also drove up cholesterol. Or imagine that the effect was only found in people with low-sugar diets. Or imagine fifteen different diets were tried and this was the only one that produced a change. Even the most fastidious scientist must make judgement calls about which data is relevant.
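To put a rough number on the “fifteen different diets” scenario, here is a minimal Python sketch. It assumes that none of the diets actually does anything, so each study’s p-value is just a uniform random number between 0 and 1 (which is how p-values behave when there is no real effect), and it assumes the fifteen studies are independent. The function names, the trial count, and the seed are all invented for illustration.

```python
import random

ALPHA = 0.05      # Fisher's conventional cutoff
NUM_DIETS = 15    # the "fifteen different diets" from the text

def chance_of_false_positive(num_tests, alpha=ALPHA):
    """Probability that at least one of num_tests independent null tests has p < alpha."""
    return 1 - (1 - alpha) ** num_tests

def simulate_p_hacking(num_tests, trials=20_000, alpha=ALPHA, seed=137):
    """Fraction of simulated research programs that find at least one 'significant' diet."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: under the null hypothesis, each p-value is uniform on [0, 1).
        p_values = [rng.random() for _ in range(num_tests)]
        if min(p_values) < alpha:
            hits += 1
    return hits / trials

exact = chance_of_false_positive(NUM_DIETS)
simulated = simulate_p_hacking(NUM_DIETS)
print(f"exact: {exact:.3f}, simulated: {simulated:.3f}")
```

Under these assumptions the exact figure works out to roughly 0.54. In other words, a researcher who tests fifteen useless diets has a better-than-even chance of stumbling onto at least one “publishable” result.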

Of course, there’s another side to science. In the deep, dark recesses of theoretical physics lie theories and conjectures that utterly defy measurement. The many-worlds interpretation of quantum mechanics makes sense of the underlying mathematics without producing any predictions. Or rather, the predictions literally exist in another dimension. With no observable difference between living in a single lone universe and living in one of an infinite collection of universes, the theory of the multiverse is more philosophy than science.

To understand the branching of the multiverse, imagine two dice: one Newtonian die and one Quantum die. When you roll the Newtonian die it seems like the results are random, but this is an illusion. With enough information about the initial state of the die (the density of the die, the trajectory of the throw, air resistance, etc.) there’s only one solution to Newton’s governing equations of motion. The result of the die roll is uniquely determined. In fact, Stanford mathematician Persi Diaconis has made a career out of showing when things like this aren’t really random. When you roll the Quantum die, however, something strange happens. The governing equations admit six possible solutions. And, when the underlying math has multiple solutions, scientists expect them all to exist somewhere out there. For example, the equations Dirac used to define electrons also predicted the existence of positrons, as a second solution, years before their discovery by Carl Anderson. So, when the Quantum die appears to come up three, where do the other five possible solutions exist? The branching theory suggests that five other universes must come into being to house the other possible solutions: one universe for each solution. (If you’re a Rick and Morty fan, you might remember this as the premise of the Community episode “Remedial Chaos Theory.”)

Science fiction writers have been enamored of alternate realities since long before they were a trendy scientific theory. There are parallel worlds which take the form of otherworldly dimensions, such as in Alice in Wonderland or the Narnia series. There are alternate histories where the consequences of a single changed moment in time are teased out, like in The Man in the High Castle or its spiritual predecessor Bring the Jubilee. Countless TV shows have visited a second earth where the characters’ relationships have been scrambled. And finally, there are franchises that canonically include a full multiverse like Sliders, the DC Comics continuity, or more recently Rick and Morty, Dan Harmon’s wildly popular, animated brainchild, now entering its third season. It is in this last case that the immeasurable strangeness of quantum branching and parallel realities circles back around to the practical calculations of statistics.

Selection Bias of the Rick Kind

In the penultimate episode of Rick and Morty’s first season, the show goes full multiverse. The concept had previously been introduced briefly when, after turning everyone on earth into a Cronenberg-esque horror, the titular duo literally abandon planet and set up shop in someone else’s universe. In the episode “Close Rick-counters of the Rick Kind,” our heroes visit the “Citadel of Ricks,” an entire city populated by versions of Rick Sanchez and his grandson Morty Smith from across the multiverse. Cowboy Rick, insurance salesman Rick, and an entire governing council of Ricks with avant garde haircuts. For purposes of identification, the primary Rick is designated C-137 (the serial number for his home universe).

The Council of Ricks considers Rick C-137 a troublemaker, stating: “…of all the Ricks in the central finite curve, you’re the malcontent. The rogue.”

It’s a condemnation that Rick C-137 wears as a badge of honor, since part of being a Rick means being anti-authority. Rick C-137 believes himself to be the Rick-est Rick, and he later cheers up his grandson by musing that the Rick-est Rick would naturally have the Morty-est Morty. Like an SAT question, Ricks are to the general population what Rick C-137 is to other Ricks. Now, a reasonable person might accept this at face value, smile at the touching moment the relentlessly acerbic Rick (C-137) manages to share with Morty, and move on to more important things.

However, that’s not why you read Overthinking It.

Imagine that each Rick is a datum in the enormous dataset of all possible universes. Furthermore, imagine a person’s “Rick-ness” can be distilled into a single constant, R, with large negative R values representing nice, law-abiding dimwits, and large positive R values representing brilliant, anarchist asshats. This gives a mathematical framework to transform the existential assertion of being the Rick-est Rick into a statistical question about the distribution of R values. For instance, looking at the distribution of “Rick-ness” among the general population, we would expect a big lump of average folks with a Rick-ness value near zero and any Rick (even doofus Rick, J19-zeta-7) would appear in the right tail of the distribution for large R values. See Figure 1. The threshold between regular R values and Rick level R values is indicated by a color change from yellow to blue in both distributions.

Similarly, when looking at just the population of Ricks there would be a lump of average Ricks centered around some large R value and then Rick C-137 would fall somewhere in the right-hand tail of this already right-skewed distribution. See Figure 2. For the moment, we’ll set aside the argument that the Rick-est Rick might instead be the Rick whose R value is exactly the mean (average) among Ricks.
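For readers who like their thought experiments executable, the two distributions can be sketched in a few lines of Python. Every number here is invented: the post never assigns actual R values, so the means, spreads, threshold, and Rick C-137’s score are all placeholder assumptions. The sketch also uses plain bell curves for simplicity, which means it ignores any skew in the Ricks-only distribution.

```python
from statistics import NormalDist

# All numbers below are invented for illustration; the post gives no actual values.
general_population = NormalDist(mu=0.0, sigma=1.0)   # stand-in for Figure 1: everyone
ricks_only = NormalDist(mu=4.5, sigma=0.5)           # stand-in for Figure 2: just the Ricks
RICK_THRESHOLD = 3.0    # hypothetical cutoff between regular and Rick-level R values
R_C137 = 5.5            # hypothetical R value for Rick C-137

def fraction_above(dist, r):
    """Share of a distribution lying to the right of the value r."""
    return 1.0 - dist.cdf(r)

rick_level_share = fraction_above(general_population, RICK_THRESHOLD)
rickier_than_c137 = fraction_above(ricks_only, R_C137)

print(f"Share of the general population at Rick level: {rick_level_share:.2%}")
print(f"Share of Ricks who out-Rick C-137: {rickier_than_c137:.2%}")
```

With these made-up parameters, about 0.13% of the general population clears the Rick threshold, and roughly 2.3% of Ricks land even further out in the tail than C-137. Change the placeholders and the percentages change, but the SAT-style analogy keeps its shape: a tail within a tail.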

Interdimensional data collection probably presents a great many challenges. Even mere terrestrial endeavors are far from perfect. For example, the United States census tends to undercount minorities; in 2010 it undercounted African-Americans by 2.1% and Hispanics by 1.5% (for a total of about 1.5 million people). Also, that Twitter poll you saw a while ago? Not super accurate.

The problem is selection bias. The impossibility of collecting exactly the data you require means that scientists, statisticians, and pollsters rely on samples (or subsets) of the total population. For example, out of a city of a few thousand Ricks, only about a hundred Ricks appear – maybe ten with actual speaking roles. The audience is left to infer that the rest of the Ricks are similar to the ones that we’re shown. You notice that one-in-twenty Rick/Morty pairs exhibit some visually striking difference (cyborg, cyclops, sea creature, Cronenbergian horror), and you presume that this holds true of the whole population. Ideally, individuals are chosen randomly to get a representative sample. And, when this ultimately fails, other techniques are deployed to try to fake how a completely random sample should look. With a few notable exceptions (like the scientist formerly known as Rick), all of the Ricks that we see in this episode have voluntarily decided to join the Rick Collective. In other words, Ricks who live and work at the Citadel are oversampled.
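The effect of oversampling Citadel Ricks can be mocked up with a toy simulation. This is only a sketch under loud assumptions: each Rick gets a made-up “rebelliousness” score between 0 and 1, and the more rebellious a Rick is, the less likely he is to sign up with the Citadel. Neither the score nor the joining rule comes from the show.

```python
import random
from statistics import mean

rng = random.Random(137)  # fixed seed so the sketch is repeatable

# Hypothetical multiverse: each Rick gets a "rebelliousness" score from 0 to 1.
all_ricks = [rng.random() for _ in range(100_000)]

# Assumption: a Rick with score r joins the Citadel with probability 1 - r.
citadel_ricks = [r for r in all_ricks if rng.random() > r]

random_sample = rng.sample(all_ricks, 100)       # what a pollster would want
citadel_sample = rng.sample(citadel_ricks, 100)  # roughly what the episode shows us

print(f"True average rebelliousness:  {mean(all_ricks):.2f}")
print(f"Random sample of 100 Ricks:   {mean(random_sample):.2f}")
print(f"Citadel sample of 100 Ricks:  {mean(citadel_sample):.2f}")
```

Under this joining rule, the random sample should hover near the true average of about 0.5, while the Citadel sample should come out noticeably tamer. Anyone judging all Ricks by the ones hanging around the Citadel would underestimate how rebellious the typical Rick is.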

In a scientific study, this sort of selection bias can taint your results, and there are numerous methods to adjust for oversampling. A notably extreme example of selection bias comes from World War II: the U.S. military was deciding how to most efficiently reinforce its bombers in order to reduce losses without making the planes too heavy or costly. They had been collecting data on all the returning bombers and made plans to reinforce the spots on the planes that most commonly returned with bullet holes. Luckily, before this plan could be implemented, they consulted with the Hungarian-born mathematician Abraham Wald for a second opinion. Wald rightly realized that the data had been tainted by selection bias; the only planes in the data set were those that had successfully returned from their mission. If enemy fire were to bring the plane down, those bullet holes would be conspicuously missing from the data. Thus, the plan was reconsidered. Eventually, the military decided to reinforce the airplane parts displaying the least amount of damage in the data, figuring that the damaged areas of returned planes were of least concern.
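Wald’s insight is easy to reproduce with a toy simulation. The sketch below is not his actual analysis or data. It assumes every plane takes exactly three hits, that hits land on each section equally often, and that each section has an invented chance of bringing the plane down when hit.

```python
import random
from collections import Counter

rng = random.Random(137)  # fixed seed so the sketch is repeatable

# Hypothetical odds that a single hit to each section brings the plane down.
# These numbers are invented; they are not Wald's data.
LETHALITY = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.1, "wings": 0.05}
SECTIONS = list(LETHALITY)

holes_on_all_planes = Counter()
holes_on_returned_planes = Counter()

for _ in range(10_000):
    # Assumption: each plane takes three hits, spread evenly at random across sections.
    hits = [rng.choice(SECTIONS) for _ in range(3)]
    # The plane returns only if it survives every one of its hits.
    survived = all(rng.random() > LETHALITY[section] for section in hits)
    holes_on_all_planes.update(hits)
    if survived:
        holes_on_returned_planes.update(hits)

for section in SECTIONS:
    print(f"{section:9s} all planes: {holes_on_all_planes[section]:5d}   "
          f"returned planes: {holes_on_returned_planes[section]:5d}")
```

Every section gets hit about equally often across the whole fleet, yet among the planes that make it home, engine holes should be the rarest and wing holes the most common. Counting holes only on the survivors points the armor at exactly the wrong place.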

In Rick and Morty, the selection bias serves a narrative purpose. Whether or not the original Rick truly represents the epitome of Rick-ness is immaterial; the ocean of Ricks (all geniuses, all sarcastic jerks) creates a background against which Rick C-137’s other characteristics come into stark relief. Artificially skewing the R distribution by only showing Ricks who choose to play by the council’s rules (see Figure 3) seems perfectly reasonable with this goal in mind. As a statistical outlier, Rick C-137 naturally appears to be non-conformist. But, beyond just the anti-council stance, Rick C-137 also shows a greater degree of compassion for Morty, relative to your average Rick.

Throughout the series, Rick C-137 belittles, endangers, neglects, traumatizes, and alienates Morty, so it may seem strange to talk of his compassion. Which is exactly why it takes an entire city of soulless bastards for Rick C-137 to look warm and fuzzy by comparison. The Ricks of the Citadel treat Mortys like pets (at best) or bits of fancy tech, complete with accessories, insurance plans, and coupons for replacement; the plot of the episode where Rick and Morty visit the Citadel involves an enormous engineering project fueled by the suffering of stolen Mortys. Rick C-137, by contrast, spends the episode doing an admittedly terrible job of apologizing to Morty and wraps things up by not quite complimenting Morty for a job well done. Baby steps, as it were. While this doesn’t make Rick C-137’s usual treatment of Morty any less comically horrifying, it does illustrate how Rick C-137 is reaching out (even if he’s failing at it) and establishes a theme that will be expanded on going forward. It’s Rick nature to be a dick, but our Rick is trying to improve.

Narrative p-hacking

It is not entirely surprising that statistics and the multiverse would find themselves so intertwined, as both derive from the underlying study of probability. Where statisticians seek to study the distribution of heights or the likelihood of disease across a population of similar people, the multiverse can extend this idea ad absurdum to a population consisting of duplicates of a single person. Not just how likely you are to go bald based on your age, race, sex, etc., but how likely you are to go bald based on the fact that you are Jerry Fling living in Nashville with the stressful job of grizzly bear psychiatrist. Where statisticians seek to assess risk and predict the future, the multiverse confers reality upon all possible futures. A scenario where counterfactuals can be investigated not just hypothetically, but concretely. The only caveat? Our inability to access this data.

The many-worlds interpretation of quantum mechanics is like having data points from parallel universes. Picking and choosing data to build a narrative constitutes p-hacking. Thus, constructing a narrative set across the multiverse represents the fiction author’s version of p-hacking – or, as I like to think of it, narrative p-hacking. A far less pernicious analog, since tailoring a world (or many worlds) that conveys the writer’s ideas to the audience is, if not the whole point of fiction, at least strongly encouraged.

As a final thought, I’ve been, up to this point, fairly positive about the application of “narrative p-hacking.” However, there’s also a downside to using these techniques. Once you know that when people say “That’s a good question,” they are just stalling for time, once you know advertisers put “Fat Free” on things that never had fat to begin with, once you’re told how the magic is performed, the trick loses its potency. Similarly, once you start seeing the storytelling analogs of p-hacking at work, the curated multiverses lose authenticity. Rick’s unique standing among the other Ricks is really just a failure to take a random sampling. In a half-hour late-night comedy like Rick and Morty, perhaps it doesn’t matter. But what about Sliders? What about Groundhog Day, or Edge of Tomorrow? The Long Earth series? His Dark Materials? At some point, does narrative p-hacking become lazy storytelling?

Got another example of narrative p-hacking? Think the writer’s a nutcase? Leave it in the comments, Morty.