[Note: This post contains spoilers for Black Mirror]

A few days ago I finished the last available episode of Black Mirror. If you haven’t seen the show, this post will make very little sense to you and I recommend closing this tab, watching the series (ONE EPISODE PER DAY, MAX. DO NOT BINGE.), and then coming back after that approximate fortnight to read this.

For those who have seen the show but don’t remember every episode title, here is a reminder (plot descriptions from IMDb.com):

Season 1

The National Anthem

Prime Minister Michael Callow faces a shocking dilemma when Princess Susannah, a much-loved member of the Royal Family, is kidnapped.

Fifteen Million Merits

After failing to impress the judges on a singing competition show, a woman must either perform degrading acts or return to a slave-like existence.

The Entire History of You

In the near future, everyone has access to a memory implant that records everything they do, see and hear. You need never forget a face again – but is that always a good thing?

Season 2

Be Right Back

After losing her husband in a car crash, a grieving woman uses a computer software that allows you to “talk” to the deceased.

White Bear

A woman wakes up in a strange dystopian world with no memory, where everyone is glued to their phones and there are hunters out to kill her.

The Waldo Moment

A failed comedian who voices a popular cartoon bear named Waldo finds himself mixing in politics when TV executives want Waldo to run for office.

White Christmas

In a mysterious and remote snowy outpost, Matt and Potter share an interesting Christmas meal together, swapping creepy tales of their earlier lives in the outside world.

Season 3

Nosedive

In a future entirely controlled by how people evaluate others on social media, a girl is trying to keep her “score” high while preparing for her oldest childhood friend’s wedding.

Playtest

An American traveler short on cash signs up to test a revolutionary new gaming system, but soon can’t tell where the hot game ends and reality begins.

Shut Up and Dance

When withdrawn Kenny stumbles headlong into an online trap, he is quickly forced into an uneasy alliance with shifty Hector – both at the mercy of persons unknown.

San Junipero

In a seaside town in 1987, a shy young woman and an outgoing party girl strike up a powerful bond that seems to defy the laws of space and time.

Men Against Fire

Future soldiers Stripe and Raiman must protect frightened villagers from an infestation of vicious feral mutants. Technologically, they have the edge – but will that help them survive?

Hated in the Nation

In near-future London, police detective Karin Parke and her tech-savvy sidekick Blue investigate a string of mysterious deaths with a sinister link to social media.

•

I like Black Mirror, for some strange, masochistic, Stockholm-syndromic meaning of “like”. Most things on tv are disappointing in that they don’t surprise you; they work firmly within particular frameworks of genre and style. Some frameworks are more narrow and cramped than others (police procedurals or teen dramas barely leave room to stand), while others are more open. You can be ambitious and constantly work to break out of the previously established frames, but if you keep doing that over and over again you’re likely to end up with an incoherent mess of a story (looking at you, “Lost”) unless you’re really awesome and have planned everything perfectly.

Black Mirror does it by cheating. It cheats by being an anthology, meaning you start over with a clean slate every time, never having any idea what you’re going to get. The episodes are all separate stories tied together by a common ethos and viewpoint more than anything else. They vary considerably in tone, feel, subject matter, aesthetics and genre. This has certain consequences for the fanbase: while everybody obviously likes the show, taste in individual episodes is all over the place and nearly every story manages to be divisive among fans, who nonetheless are united in their appreciation for the show as a whole.

This offers an interesting opportunity for some data-driven erisology. I’ve mentioned here a few times before that I’m interested in people’s differing taste in art and stories and what it is, psychologically, that makes us prefer different things. “There is no accounting for taste”, they say. My reaction to that is something like: “Well why the hell not? Have you even tried? There is obviously some kind of explanation”. I won’t try to offer any explanations now; I’m not in a position to do that. But some exploratory work is possible, and Black Mirror is a great subject for it.

•

Black Mirror episodes are all different and the fans react differently to them, which means the set of 13 episodes is a small sample with quite a lot of variation in “story-DNA” terms. Divisive things can be used as clues to what defines people’s taste, but a single divisive movie, tv show or book doesn’t offer a lot of data: you only get one thing people differ on. What if you could get the same set of people to watch many divisive stories, and these stories were divisive in different ways, splitting people along different dimensions?

Usually this is difficult or impossible because people’s taste doesn’t just determine how they react to stories but also which ones they seek out in the first place, and which ones they care to rate, making such data biased. That’s why it’s good to have a quite small set of relatively diverse stories that you know everyone in the group has seen and can remember separately. Black Mirror offers not a perfect but a pretty good example.

I wanted to examine Black Mirror episode preferences to see if there was any interesting structure to it. When you browse threads on Reddit’s r/blackmirror where fans rank their favorites from top to bottom there is a remarkable amount of disagreement and with a few exceptions the rankings look almost random.

I copy-pasted data from a few of those threads, comparing usernames to avoid duplicates, and ended up with 89 full ranking lists. Ideally I’d have more but I didn’t want to start a new thread when there already were several of them. What did I find? Let’s dig in.

First of all, which is the best episode? Here they all are, ordered from top to bottom by average rank (lower figures indicating higher positions).

Episode title                 Average rank
White Christmas               3.9
Fifteen Million Merits        4.4
Shut Up and Dance             5.0
San Junipero                  5.4
White Bear                    5.9
The Entire History of You     6.0
Be Right Back                 7.2
Hated in the Nation           7.6
Playtest                      7.8
Nosedive                      8.4
The National Anthem           8.5
Men Against Fire              9.4
The Waldo Moment              11.5

Seems like, among Black Mirror fans hardcore enough to post their full lists on Reddit, “White Christmas” is the favorite. Note however that its average rank is only about 4th place, far from a consensus. The community is far more in agreement about the outlier “The Waldo Moment” being the weakest episode, ranking 11.5 out of 13 on average. Interestingly, all three seasons were about equally popular, with averages of 6.3, 7.1 and 7.3, respectively (taking the mean of the per-episode averages listed above for each season).

Since I’m interested in divisiveness and disagreement I also checked which episodes were the most controversial. Based on the discussions on Reddit, I suspected “San Junipero” would top the list, being hailed by many as the best of the series but strongly disliked by others for deviating from the show’s usual ethos of pessimism and grimness. I also expected “The National Anthem” to be controversial, considering that fans often advise newcomers not to start with it even though it’s the first episode, because its primeministerial pig sex puts many people off for some reason.

I was right. Here is the full list of standard deviations in rank, from most to least divisive.

Episode title                 Standard deviation
San Junipero                  3.76
The National Anthem           3.52
Playtest                      3.27
Fifteen Million Merits        3.25
Shut Up and Dance             3.22
Be Right Back                 3.21
Hated in the Nation           3.12
White Bear                    3.07
The Entire History of You     3.07
Nosedive                      2.98
White Christmas               2.80
Men Against Fire              2.46
The Waldo Moment              2.29
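For anyone who wants to build the same kind of tables from their own ranking lists, here is a minimal sketch in Python. The four lists in it are made up for illustration and are not the Reddit data, and the function names `rank_table` and `summarize` are hypothetical. It also assumes the sample standard deviation (`statistics.stdev`), since the post does not say which variant was used.

```python
from statistics import mean, stdev

# Made-up example lists, NOT the Reddit data.
# Each list is one fan's ranking, best episode first.
rankings = [
    ["White Christmas", "San Junipero", "White Bear", "The Waldo Moment"],
    ["San Junipero", "White Christmas", "White Bear", "The Waldo Moment"],
    ["White Christmas", "White Bear", "The Waldo Moment", "San Junipero"],
    ["White Bear", "White Christmas", "San Junipero", "The Waldo Moment"],
]

def rank_table(rankings):
    """Map each title to the list of positions (1 = best) it received."""
    positions = {}
    for ranking in rankings:
        for place, title in enumerate(ranking, start=1):
            positions.setdefault(title, []).append(place)
    return positions

def summarize(rankings):
    """Return (title, mean rank, sample std dev) tuples sorted by mean rank."""
    rows = [(title, mean(places), stdev(places))
            for title, places in rank_table(rankings).items()]
    return sorted(rows, key=lambda row: row[1])

for title, avg, sd in summarize(rankings):
    print(f"{title:<20} {avg:5.2f} {sd:5.2f}")
```

Sorting the same rows by the third element instead of the second gives the "most divisive first" ordering.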

Just looking at the standard deviations doesn’t quite do the data justice because we don’t really have intuitions for standard deviations the way we have for averages. What does 3.76 mean? Here is a figure showing the distribution of rankings for each episode, from best to worst. Note that every single episode has people ranking it in the top three (green) and bottom three (red), and a full 9 out of the 13 are both someone’s favorite and someone’s least favorite.

Ok, so there is a lot of variety. But is it all random and inscrutable or is there some kind of sense to the variation? Does liking one particular episode or set of episodes make you more likely to like another? I’d presume so. All kinds of recommendation systems for movies, books and whatever are built on that principle, and I don’t think people’s preferences are random: there is accounting for taste.

Recommendation systems generally work with spotty and flawed data for reasons I described before, and Black Mirror episodes are an unusually clean data set (hopefully compensating for the small size of my sample) so odds are good we can find something.

I could look at correlations between the rankings of different episodes, but that would only give pairwise relationships. Instead I ran a statistical procedure called Principal Component Analysis, or PCA. What PCA does is to take a multidimensional data set (this set has 13 dimensions, one for each episode) and retain as much as possible of the variation in the set while reducing the number of dimensions by creating complex properties (“principal components”) that consist of weighted combinations of the raw dimensions. It’s technical, requires quite a bit of “data analysis literacy” to really get, and is hard to explain properly without pictures and way more than one paragraph. What it does, in layman’s terms, is look at all relationships at once and try to find the underlying axes along which the data varies the most.

I ran the analysis and found three components with eigenvalues significantly over 1 (that just means three strong dimensions that very probably are not random noise). By using these three combined properties instead of the full 13-dimensional rankings we can keep about 45% of the total variation, which is decent but not spectacular. Maybe more data would offer better results.
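To make the procedure a little more concrete, here is a rough sketch of one common way to do it, using only the Python standard library. It builds the correlation matrix of the 13 rank columns, pulls out the largest eigenvalues by power iteration with deflation, and applies the "eigenvalue over 1" rule. It is not necessarily what the software used for this post does: the function names are hypothetical, power iteration is just a convenient stand-in for a proper eigensolver, and the input is 89 random rankings generated on the spot, not the Reddit data.

```python
import math
import random

def correlation_matrix(rows):
    """Pearson correlations between the columns of `rows` (one row per fan)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(d)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(d)]
            for a in range(d)]

def top_eigenpairs(matrix, k, iters=500):
    """Largest k eigenpairs of a symmetric positive semi-definite matrix,
    found by power iteration with deflation (a simple stand-in solver)."""
    d = len(matrix)
    m = [row[:] for row in matrix]          # work on a copy
    rng = random.Random(0)
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:                   # nothing left to extract
                break
            v = [x / norm for x in w]
        mv = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        lam = sum(v[i] * mv[i] for i in range(d))   # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(d):                  # deflate: remove this component
            for j in range(d):
                m[i][j] -= lam * v[i] * v[j]
    return pairs

# Toy input: 89 random permutations of 1..13 stand in for the ranking lists.
rng = random.Random(1)
fans = [rng.sample(range(1, 14), 13) for _ in range(89)]
corr = correlation_matrix(fans)
pairs = top_eigenpairs(corr, k=5)
for lam, _ in pairs:
    # The eigenvalues of a correlation matrix sum to the number of variables,
    # so lam / 13 is that component's share of the total variation.
    print(f"eigenvalue {lam:.2f}  share {lam / len(corr):.1%}  over 1: {lam > 1}")
```

Working from the correlation matrix puts every episode on the same scale, which is what makes "eigenvalue over 1" mean "carries more variation than a single episode does". Random rankings have no real structure, so whatever this toy run prints is only a noise baseline to compare real results against.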

So without further ado, here is the first and strongest axis along which tastes vary. The numbers refer to how strongly each episode defines this dimension (1.000 is the theoretical maximum, 0.0 means complete irrelevance). The next issue is the interpretation of what the axes actually mean, and while PCA is “scientific”, interpreting the resulting axes is an art.

Shut Up and Dance              0.776
The Waldo Moment               0.482
The National Anthem            0.377
White Christmas                0.323
White Bear                     0.272
Men Against Fire               0.131
Hated in the Nation            0.089
Playtest                       0.044
The Entire History of You     -0.136
Fifteen Million Merits        -0.243
Nosedive                      -0.554
Be Right Back                 -0.589
San Junipero                  -0.708

So the most powerful pattern is that people who like “Shut Up and Dance”, “The Waldo Moment” and “White Christmas” more than others tend to dislike “San Junipero”, “Be Right Back” and “Nosedive”. This makes sense to me. The top 5 here are kind of grim and shocking, while the bottom 3 (and to a lesser extent the next 2) are gentler and softer, more relationship-oriented. “The Waldo Moment” sticks out among the top 5, but it’s quite cynical, which I guess fits. It is also a bit wonky statistically: its outlier status makes the data highly asymmetrical, which isn’t ideal for PCA. If I were interested in opening up a jar of angry bees I might also guess that there could be something male vs. female about this axis.

This first axis explains 18% of the variation, more than either of the other two. The second and third are about equally strong and together make up the rest of the roughly 45% mentioned above. Here is the second:

Playtest                       0.696
White Christmas                0.617
White Bear                     0.286
The Entire History of You      0.148
Men Against Fire               0.143
Nosedive                       0.026
San Junipero                  -0.003
Shut Up and Dance             -0.008
Be Right Back                 -0.212
Fifteen Million Merits        -0.234
Hated in the Nation           -0.331
The Waldo Moment              -0.389
The National Anthem           -0.67

What this means is less obvious, but what stands out to me is that the top two, and to a decreasing degree numbers 3, 4 and 5, all deal with mind games and terrifying, freakish mental experiences. “The National Anthem” and the others near the bottom are more about society and politics, more “extroverted”, you might say.

The third and final dimension is even harder for me to interpret.

Men Against Fire               0.67
Hated in the Nation            0.63
Be Right Back                  0.219
White Christmas                0.13
San Junipero                   0.13
Shut Up and Dance              0.065
White Bear                     0.039
Nosedive                      -0.051
The National Anthem           -0.193
Playtest                      -0.22
The Waldo Moment              -0.309
Fifteen Million Merits        -0.333
The Entire History of You     -0.703

Ok, the top two are both kind of suspense-based. But so are the middle ones… They’re critical of society in a broad sense, but so are “Fifteen Million Merits” and “The Waldo Moment”. Could there be something about season 1 vs season 3? The top two are the last of season 3 and the bottom two are from season 1. Could a certain group of people rate older episodes lower because they haven’t seen them for a while and the impression has faded? Another possibility is genre-conformity. “Men Against Fire” is like an action movie while “Hated in the Nation” is like a police procedural, both down-to-earth style-wise. The bottom three are a bit more mixed up and more difficult to parse. But I’m grasping now. Suggestions welcome.
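The three loading tables can be checked against the percentages quoted earlier. Assuming the numbers are ordinary component loadings from a PCA on the correlation matrix (the post does not spell this out), the squared loadings of one axis add up to that axis’s eigenvalue, and dividing by 13 gives its share of the total variation. The sketch below does that arithmetic in Python, with the values copied from the three tables above. The name `explained_share` is made up for this example.

```python
# Loadings copied from the three tables above, in table order.
axis1 = [0.776, 0.482, 0.377, 0.323, 0.272, 0.131, 0.089, 0.044,
         -0.136, -0.243, -0.554, -0.589, -0.708]
axis2 = [0.696, 0.617, 0.286, 0.148, 0.143, 0.026, -0.003, -0.008,
         -0.212, -0.234, -0.331, -0.389, -0.67]
axis3 = [0.67, 0.63, 0.219, 0.13, 0.13, 0.065, 0.039, -0.051,
         -0.193, -0.22, -0.309, -0.333, -0.703]

def explained_share(loadings):
    """Sum of squared loadings divided by the number of variables.

    Assumes the loadings are correlations between each episode's rank
    and the component, as in a correlation-matrix PCA.
    """
    return sum(x * x for x in loadings) / len(loadings)

shares = [explained_share(axis) for axis in (axis1, axis2, axis3)]
for number, share in enumerate(shares, start=1):
    print(f"axis {number}: implied eigenvalue {share * 13:.2f}, share {share:.1%}")
print(f"all three axes together: {sum(shares):.1%}")
```

Under that assumption the three axes come out at roughly 18.6%, 13.8% and 13.2%, or about 45.7% together, in line with the 18% and 45% quoted in the text.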

•

So there is a pattern behind who likes what. But few if any people will recognize their own taste perfectly in any of the dimensions. I know I don’t. I’ll end this post with my own list:

1. San Junipero
2. The Entire History of You
3. Fifteen Million Merits
4. White Bear
5. Nosedive
6. White Christmas
7. Shut Up and Dance
8. Hated in the Nation
9. The National Anthem
10. Be Right Back
11. Playtest
12. The Waldo Moment
13. Men Against Fire

Yes, in the for-or-against “San Junipero” controversy, I come down on the “pro” side. It was such a wonderful catharsis after so much grimness (and the show surprised me yet again). But note that “SJ” would not be that high up on its own; its place at the top (for me) depends entirely on other episodes in the series being so disturbing [1]. We earned that happy ending, especially after the double gut-punch of the two preceding chapters “Playtest” and “Shut Up and Dance”. After the end of the latter my first words were: “Sometimes I wonder why we’re even watching this show”.

I noticed that ranking all the episodes is really hard. No order seems fair because the episodes are so different as to be incomparable. And how do you rate the episodes that are extremely effective but make you feel like shit? “White Christmas”[2], “White Bear” and “Shut Up and Dance” are outstandingly well put together but I don’t want to watch them again. It makes me think of Funny Games, a masterpiece that made me want to throw up my intestines.

•••

[1] This is a common problem for any long story. Often the very best and most appreciated elements are the ones that stand out and deviate from expectations (especially in comedy, since subversion of expectation is kind of what humor is). But you want to do more of what works, so those things become less and less outstanding as you do them more and more, and a feedback mechanism that amounts to “do more of whatever you do the least” is ultimately self-defeating. It’s the mechanism behind Flanderization (Warning, tvtropes link) and the reason Family Guy cutscenes stopped being funny.

[2] I rank the fan favorite “White Christmas” somewhat low. Not because it isn’t powerful or didn’t affect me; it is and it did. I think it’s because it contains three separate stories and I find that kind of messy. In general my personal preference is for focused, highly cohesive stories.