I enjoyed this story by Dylan Matthews on Vox about effective altruism, an idea I’m all in favor of and want to say more about.

EA is the philosophy that we should use science, rather than warm fuzzy feelings or guesswork, to direct our charitable giving where it will do the most good. Compared to the enormous need in the world, the amount of money and energy available for charity is small to begin with, and too many of those scarce and precious dollars and volunteer hours have been squandered on feel-good projects that made no lasting difference. (Celebrity-run charities seem especially susceptible to this problem.) Meanwhile, humbler and cheaper interventions, like bed nets or deworming pills or iodine supplements – or even just giving money to the poor directly – can have far more of an impact, as measured by the standard yardstick of disability-adjusted life years (DALYs).

So far, so good; there’s nothing here I’d argue with. As a universal utilitarian, I want to do the most good I can with the finite resources I have available. But effective altruism has a dark side, capably if unflatteringly showcased by the Vox article.

Specifically, many of EA’s most fervent advocates are wealthy, white, male, tech-obsessed futurists, and that shapes their view of what counts as a “pressing” problem. A large number of them argue that existential risk, or X-risk – extinction-level events for the human species, like meteor impact, alien invasion, or the emergence of evil artificial intelligence – ought to take precedence above all else. As Matthews’ article puts it:

The number of future humans who will never exist if humans go extinct is so great that reducing the risk of extinction by 0.00000000000000001 percent can be expected to save 100 billion more lives than, say, preventing the genocide of 1 billion people. That argues, in the judgment of Bostrom and others, for prioritizing efforts to prevent human extinction above other endeavors. This is what X-risk obsessives mean when they claim ending world poverty would be a “rounding error.”

These EAers believe that a threat which could kill not only every living human, but all the virtually unlimited number of humans who might have lived otherwise, is effectively infinitely bad, even if you consider it vanishingly unlikely. To them, averting this worst-possible outcome is so important that any amount of resources we pour into making it even a tiny bit less probable far outweighs anything we could do for actual, living humans who are suffering. What this often boils down to is that we should spend as much money as possible on AI research so we can guarantee the intelligent machines we’ll eventually build are friendly and helpful, rather than malevolent. One EA advocate went so far as to claim that every dollar spent on computer science research saves eight lives (source).
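The arithmetic behind this argument is an ordinary expected-value calculation, and it's worth seeing how much work one unverifiable input does. Here's a minimal sketch in Python. The risk reduction and the 1 billion lives come from the quote above; the function name and both future-population figures are my own assumptions, chosen purely for illustration and not taken from Bostrom or the Vox article.

```python
from fractions import Fraction

def expected_lives_saved(risk_reduction, future_population):
    """Expected lives saved = shift in probability times lives at stake."""
    return risk_reduction * future_population

# The risk reduction from the quote: 0.00000000000000001 percent,
# converted from a percentage to a probability (1 in 10**19).
RISK_REDUCTION = Fraction(1, 10**17) / 100

# The 1 billion lives from the quote's genocide comparison.
GENOCIDE_PREVENTED = 10**9

# Hypothetical counts of future humans (illustrative assumptions only).
large_guess = expected_lives_saved(RISK_REDUCTION, 10**30)
small_guess = expected_lives_saved(RISK_REDUCTION, 10**25)

print(large_guess > GENOCIDE_PREVENTED)  # True: X-risk "wins"
print(small_guess > GENOCIDE_PREVENTED)  # False: X-risk "loses"
```

Under these assumptions, the conclusion flips depending on which guess about the far future you feed in, and neither guess can be checked against anything.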

What all these smart, rational people somehow haven’t noticed is that all they’ve done is reinvent Pascal’s Wager. The classic logic of the Wager is: “No matter how unlikely you consider God’s existence, the punishment for nonbelief is infinitely bad!” Their logic is the same, just with “evil AI” substituted for God’s existence, and “human annihilation” for the punishment for nonbelief.

This is even more explicit in the strange case of Roko’s basilisk, a bizarre thought experiment about an all-powerful future AI that will eternally torture people who didn’t help bring it into existence. The thought experiment is isomorphic to Pascal’s Wager, right down to the idea that simply being told about the basilisk makes you, too, liable to be tortured in some future existence, if you don’t begin donating money to artificial-intelligence research right after reading this post. Allegedly, some posters on the rationalist Less Wrong message board, where Roko’s basilisk was conceived, suffered nightmares and other psychological harm from contemplating it.

Again, it amazes me that people who devote themselves to reason and logic couldn’t see the fatal contradiction in this. If they’d consulted atheists skilled in dealing with religious apologetics, we could have helped them out.

The flaw in Pascal’s Wager is that it becomes meaningless if there are competing possibilities for which god you should believe in, and the EA argument about X-risk eats its own tail in the same way. Suppose I claim that I have the power to annihilate the human species and will do so unless I’m given a billion dollars. By the X-risk logic, you should pay me the blackmail, no matter how implausible you think my threat is, because even a tiny reduction in the probability of human extinction is worth any price. But what if many people start making the same threat? What happens when there are too many blackmail demands to pay them all?

Similarly, there are an infinite number of theoretically conceivable future AIs that might choose to torture you unless you perform (or refrain from performing) an infinite number of incompatible actions. What if there’s a “Roko’s anti-basilisk” that hates its own existence and will eternally torment you for supporting the research that brought it into being? That’s completely conceivable and just as plausible as the opposite.
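To make the cancellation concrete, here's a toy expected-value sketch in Python. Everything in it is hypothetical: the two AIs, the equal probabilities I assign them, and the size of the penalty are assumptions for illustration, not estimates anyone has actually defended.

```python
from fractions import Fraction

def expected_payoff(hypotheses, action):
    """Sum the probability-weighted payoff of one action over all hypotheses."""
    return sum(prob * payoffs[action] for prob, payoffs in hypotheses)

# Hypothetical numbers: both AIs get the same tiny probability, and
# PENALTY stands in for an arbitrarily large punishment.
P = Fraction(1, 10**12)
PENALTY = -10**100

hypotheses = [
    (P, {"donate": 0, "abstain": PENALTY}),  # basilisk punishes non-donors
    (P, {"donate": PENALTY, "abstain": 0}),  # anti-basilisk punishes donors
]

donate = expected_payoff(hypotheses, "donate")
abstain = expected_payoff(hypotheses, "abstain")

print(donate == abstain)  # True
```

With the two threats weighted equally, donating and abstaining come out identical, so the calculation gives no guidance either way, however large the penalty is made.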

This is a classic example of how an insular group can start with a reasonable premise and then, step by step, back themselves into a ridiculous conclusion. There’s nothing wrong with taking the future into account in our moral calculus – in fact, it’s a necessity. But if a threat is so outside our experience that we have no way to even begin calculating the odds, that’s a sign that personal gut feelings may be getting smuggled in under the guise of science. We shouldn’t worry too much about such fantasy scenarios, much less prioritize them over genuine problems that are hurting real people today. A group of self-proclaimed rationalists, especially rationalists who claim to be devoted to human welfare, has no excuse for not understanding this.