(Author’s Note: I wrote this yesterday, shopped it around a bit, and decided to post it here instead. The dates are the real dates on which I originally wrote this. Contains some not-too-surprising spoilers for a Harry Potter fan fiction.)

Writers of fan fiction come from all walks of life, united by their love of the underlying book, movie, game (or whatever). And Harry Potter has an immense following at www.fanfiction.net, with who knows how many stories and hundreds of thousands of chapters posted. Eliezer Yudkowsky writes one of the most popular, Harry Potter and the Methods of Rationality (or HPMOR). This story is explicitly a pedagogical device: a Rationalist tract to teach readers how to think better. (One of Yudkowsky’s other sites is “Less Wrong.”) The sugar that helps this medicine go down is Harry Potter. Specifically, what if Harry Potter had been raised by a loving couple including a scientist, and blessed with a Richard Feynman-like intelligence at a young age?

Eleven-year-old Harry James Potter-Evans-Verres lectures his friends (and Dumbledore!) about findings from cognitive science and regular science, including proper brainstorming technique, overconfidence, and Bayesian thinking. Important psychological works like Cialdini’s classic book Influence or Asch’s conformity experiments are explained; numerous others are name-checked.

It wouldn’t be popular without a great story. Harry fights bullies, leads an army in mock battles at school (replacing Quidditch), makes friends and enemies, and conducts experiments on magic’s secrets. Harry pokes and prods at spells, sometimes with fantastic discoveries, sometimes to no avail. As the story progresses, he edges towards becoming a Dark Wizard himself. Harry jokes “World domination is such an ugly phrase. I prefer to call it world optimisation.” He’s a chaos magnet, polite but dangerous, a mile-a-minute mind in a world where almost anything is possible. He’s not infallible and not the Harry Potter you know; this is an 11-year-old genius whom Muggles can’t handle. The Wizarding world has never seen his like.

Lectures mingle with the plot, and the story still finds time for allusions, references and jokes about Rowling’s work and other classics. Harry is an 11-year-old science geek; he knows all about Ender’s Game, Batman, Army of Darkness, Star Wars and other comics, films, manga and books. He argues with Dumbledore via Tolkien references.

This peculiar Harry Potter fiction had been on hiatus after nearly 600,000 words when Yudkowsky announced (last year) that the final arc would be published between Valentine’s Day and Pi Day (3/14). Fans rejoiced and online discussion blossomed again. For the last two weeks, chapters had been arriving every day or two.

February 28th, afternoon.

Then came Chapter 113, titled “Final Exam,” posted on February 28th. This chapter is the hero’s low point, where things look bleakest. Harry is trapped by Voldemort and all the remaining Death Eaters, who have the drop on him. Voldemort (unlike the “canonical” one from the books) won’t stupidly cast a spell he knows may backfire. This Voldemort agrees with Scott Evil (Dr. Evil’s son, played by Seth Green): no elaborate death traps, no leaving the hero alone. Just shoot him. Voldemort has a gun (as well as a number of other lethal devices) because he’s worried about magical resonance.

So Chapter 113 ends … and the Author’s Challenge begins: the fans must devise Harry’s escape.

This is your final exam. You have 60 hours. Your solution must at least allow Harry to evade immediate death, despite being naked, holding only his wand, facing 36 Death Eaters plus the fully resurrected Lord Voldemort.…

Any acceptable solution must follow a ridiculously long list of meticulous constraints: any movement, any spell leads to certain death. Nobody knows where Harry is (or that he is even missing). Harry can use any power he has demonstrated (within those constraints) but can’t gain any new ones. There’s no cavalry, no Deus ex Magica. And ….

If a viable solution is posted before 12:01AM Pacific Time the story will continue to Ch. 121…..Otherwise you will get a shorter and sadder ending.

(Emphasis mine). A small section of the Internet exploded in disbelief.

Yudkowsky had done this before with a science fiction story called Three Worlds Collide. But that was on his old site, with far fewer readers, and I’d read the story well after he’d challenged his fans. Now he was working on a bigger scale. Final Exam was posted five years (to the day!) after Chapter 1 first appeared online. HPMOR has well over half a million page views. Readers faced having a story they’d invested weeks in reading (and sometimes years in discussing) just end with the hero’s death. There seemed to be no solution. Voldemort, terrified and highly intelligent, had planned this trap out in detail; Harry had blundered into it. (Being smart doesn’t magically give you all the critical information you may need, and Voldemort has decades of training and a few insights Harry lacked.)

Harry James Potter-Evans-Verres had, in the preceding chapters, solved complex puzzles, and all of them played fair (within the constraints of the world) and provided enough clues to satisfy the strictest mystery writer. But this seemed impossible. Fans despaired. I concocted a solution requiring a Patronus, the Cloak of Invisibility, a Time-Turner and the Sorting Hat, and it still required negligence on Voldemort’s part that would make SPECTRE rip up your Bond villain card. Other solutions were arguably no better.

Complex problems are Yudkowsky’s day job: he is a Research Fellow at the Machine Intelligence Research Institute. He spends his time (when not writing about Hogwarts) dealing with thorny problems related to Artificial Intelligence, its benefits and its risks. The big risk, the basis for countless works of fiction from Frankenstein to Terminator, is “Can we control our creation?” Yudkowsky’s research aims to create guidelines for a Friendly Artificial Intelligence, a machine we can trust to guide humanity into a new Golden Age, and to avoid “Unfriendly A.I.”



Other researchers (see update at end) suggest we isolate A.I. from the internet (and machinery) to keep us safe. We’d keep the A.I. “in a box.” Yudkowsky contends that an Artificial Intelligence worthy of the name will be so advanced it will simply talk its way out of the box (assuming it couldn’t hack its way out). To further this argument, Yudkowsky developed “The AI Box experiment,” where one player takes the role of the AI and tries to convince his opponent (the “Gatekeeper”) that it is safe to release him. He’s done this several times, and published protocols for this thought experiment.

Yudkowsky has taken the role of the AI in those prior games. After all, he’s the expert and is trying to prove the point. If he can convince you to let an unknown quantity run free, what problem would an AI have? You’d probably think it was your idea all along. Yudkowsky does this in order to draw attention to the dangers of unfriendly AI development. Once the AI gets out, nobody will be able to put it back. And if the AI is unfriendly, that’s Extinction. Game over.

(For a much more detailed introduction to this line of thought, I recommend the Wait but Why articles The Road to SuperIntelligence, and Our Immortality or Extinction.)



March 1st, AM.

Some readers (most on the discussion group I follow) knew this, but this was fan fiction, not a serious research effort. Harry Potter, not HAL and Dave. Less than 24 hours after the challenge had been issued, some discussion groups proposed a thesis: the entire story had built up to reenact the AI-in-a-box thought experiment, with Eliezer playing Gatekeeper against his entire fanbase.

The argument seems compelling.

Harry James Potter-Evans-Verres is a super-intelligent, rational being, capable of discovering the inner workings of magic (well beyond what Harry did in Rowling’s series, even though the entire story of HPMOR takes place in his first year at Hogwarts).

He was acquiring power at an alarming rate.

He was now trapped with Voldemort himself ready to pull the plug.

Worse still, Voldemort knows that Harry Potter is not friendly. You would think this goes without saying, but Voldemort is not simply afraid for himself but for all wizardkind. (There’s a prophecy, and it’s a long, complicated story.) Acting out of fear of an extinction-level event, Voldemort has done everything in his considerable power to catch and neutralize Harry Potter. And done it well. Harry can’t cast spells without permission. He can’t speak to anyone but Voldemort, who is about to pull the trigger. Voldemort has even forced Harry to speak only the truth (via magic) and answer questions like “Have you thought of a plan to defeat me yet?” so he’ll know how long he can delay.

The only thing Harry can do is talk to Voldemort.

Your strength as a rationalist is your ability to be more confused by fiction than by reality — HJPEV

All the constraints were, proponents argued, a clue. In an earlier chapter HJPEV explains that a rationalist avoids needless complexity. And all the solutions proposed were fairly insane. Harry’s internal dialogue mentally “assigns penalties” to complex explanations. You can chart orbits with the Earth at the center of the Solar System, but it’s much easier if you put the Sun at the center. The proponents of the box theory argued that fans couldn’t find a solution because they had put the Earth at the center of the Solar System. The fanbase was trying to write a Hollywood ending where Harry wins, the argument went. But in the real world people talk out their differences all the time. And people who are in a bad situation have to accept it. (That was an explicit lesson that Harry learned in Defense class early in the story.)

So, in this reading (which I consider more likely), Harry Potter and the Methods of Rationality is no less than a five-year buildup to Eliezer Yudkowsky taking the other side of the Box Challenge: the side played by the less intelligent person. Yudkowsky appears to have engineered a situation where a small but dedicated portion of humanity simulates his AI for him in the Potter-verse. He’s spent years explaining how to calmly tackle a seemingly impossible problem, list assets, evaluate what you know and discern truth from fiction. He’s unquestionably provided ample motivation. With the deadline approximately 36 hours away, chat rooms are alive with proposals, debates, stratagems, tactics, and detailed analysis of any and all relevant documents available on the internet. Arguments are weighed, flaws discovered and discarded, and useful nuggets saved and added to a master list.

You know, like an AI might do.

Can the combined super-intelligence talk its creator out of killing their story, with the odds stacked against them? As day turns to evening on March 1st, some discussion groups aren’t interested in what Harry has; they are listing what he knows about Voldemort’s beliefs, and what information he can volunteer that would stay Voldemort’s hand. Others are discussing Eliezer Yudkowsky’s beliefs and knowledge, adding another level of meta to the analysis. In the story, Voldemort himself knows (via magic) that Harry Potter cannot lie. What appeared to be a horribly binding constraint is suddenly a fantastic advantage. Could we trust whatever an advanced being with unknown (or malevolent) motives told us?



Watching the discussion forums with a bit over a day to go, I believe this is the broad-stroke solution (with lots of in-universe details to be worked out), although I’m irrationally attached to my earlier, needlessly complex answer. I believe this is the author’s intent. It’s elegant. In the universe, Harry Potter will (I suspect) exchange some information about Prophecies and then deduce an alternate (correct) interpretation where it is to everyone’s advantage to keep him alive. To let him out of the box.

In the real world, Yudkowsky gets another argument in his favor. “A few hundred or thousand people could do this to me. An AI could do this to you, easily.” I suspect the answer has already been posted, but I haven’t checked. The submissions page for the final exam already has three hundred thousand words. In less than 36 hours. The author has asked for help summarizing the solutions.



How does magic work in Harry Potter’s world? His experiments are still ongoing. Out here, in the real world, Teller (of Penn and Teller) wrote that “You will be fooled by a trick if it involves more time, money and practice than you (or any other sane onlooker) would be willing to invest.” In our world, Eliezer Yudkowsky spent five years appearing to be writing a story, and just recently the wool has fallen from my eyes.

Footnote #1 — A reader pointed out I did not cite this. I realize that I did not know who proposed this. Some quick googling doesn’t reveal this either. It may be discussed in this Armstrong, Sandberg, Bostrom paper, but I have not bought it. Bostrom’s name is all over the stuff I’ve read, so he probably knows. I’ll try again tomorrow.

Update — March 2nd, 5pm

The deadline is 8 hours away, and Yudkowsky is overwhelmed by the response and requesting help. I have decided to post this now because I am reasonably confident of the broad solution, so I am making an advance prediction. I am less confident of the exact solution, but I do believe that it will involve Aumann’s agreement theorem. My answer certainly will.

I suspect the internet will get a viable solution. However, will the solution make a good story? I’m not sure.

Update 9:30pm (< 5 hours left). I posted my solution to FF.net hours ago. I have no idea how to link to it (since I can’t find it) and I left out a key step in any case (oops). But I have posted my actual solution (heavily abbreviated) on Reddit in case someone else wants to post it, and as a prediction of the correct answer. I may revise this as errors are noted and I correct them (and add more links), but will put new information in a new post.

Followup post March 3rd — I was wrong.