From What I've Tasted of Desire

Tue 24 January 2017 in in commentary

Oh, we have to get this right

Yes, we have to make them see —"Ballad of the Crystal Empire", My Little Pony: Friendship Is Magic

(Epistemic status: somewhat tongue-in-cheek, but also far more plausible than it has any right to be. Assumes the correctness of Blanchard's transsexualism typology without arguing it here.)

So, not a lot of people understand this, but the end of the world is, in fact, nigh. Conditional on civilization not collapsing (which is itself a kind of end of the world), sometime in the next century or so, someone is going to invent better-than-human artificial general intelligence. And from that point on, humans are not really in control of what happens in this planet's future light cone.

This is a counterintuitive point. It's tempting to think that you could program the AI to just obey orders ("Write an adventure novel for my daughter's birthday", "Output the design of a nanofactory") and not otherwise intervene in (or take over) the universe. And maybe something like that could be made to work, but it's much harder than it looks.

Our simple framework for benchmarking how intelligence has to work is expected utility maximization: model the world, use your model to compute a probability distribution over outcomes conditional on choosing to perform an action for some set of actions, and then perform the action with the highest expected utility with respect to your utility function (a mapping from outcomes to ℝ). Any agent that behaves in a way that can't be shoved into this framework is in violation of the von Neumann–Morgenstern axioms, which look so "reasonable" that we expect any "reasonable" agent to self-modify to be in harmony with them.

So as AIs get more and more general, more like agents capable of autonomously solving new problems rather than unusually clever-looking ordinary computer programs, we should expect them to look more and more like expected utility maximizers, optimizing the universe with respect to some internal value criterion.

But humans are a mess of conflicting desires inherited from our evolutionary and sociocultural history; we don't have a utility function written down anywhere that we can just put in the AI. So if the systems that ultimately run the world end up with a utility function that's not in the incredibly specific class of those we would have wanted if we knew how to translate everything humans want or would-want into a utility function, then the machines disassemble us for spare atoms and tile the universe with something else. There's no reason for them to protect human life or forms of life that we would find valuable unless we specifically code that in.

This looks like a hard problem. This looks like a really hard problem with unimaginably high stakes: once the handoff of control of our civilization from humans to machines happens, we don't get a second chance to do it over. The ultimate fate of the human species rests on the competence of the AI research community: the inferential power and discipline to cut through to the correct answer and bet the world on it, rather than clinging to one's favorite pet hypothesis and leaving science to advance funeral by funeral.

Stereotypically at least, computer programming is the quintessential profession of autogynephilic trans women, although it's unclear how much of this is inherent to the work (a correlation between erotic target location erroneousness and general nerdiness) and how much is just a selection effect (well-to-do programmers with non-customer-facing jobs in Silicon Valley can afford to take the "publicly decide that this is my True Gender Identity" trajectory, whereas businessmen, lawyers, and poor people are trapped in the "secret, shameful crossdressing/dreaming" trajectory).

Thus, the bad epistemic hygiene habits of the trans community that are required to maintain the socially-acceptable alibi that transitioning is about expressing some innate "gender identity", are necessarily spread to the computer science community, as an intransigent minority of trans activist-types successfully enforce social norms mandating that everyone must pretend not to notice that trans women are eccentric men. With social reality placing such tight constraints on perception of actual reality, our chances of developing the advanced epistemology needed to rise to the occasion of solving the alignment problem seem slim at best. (If we can't put our weight down on the right answer to a really easy scientific question like the two-type taxonomy of MtF—which lots of people just notice without having to do careful research—then what hope do we have for hard problems?)

Essentially, we may be living in a scenario where the world is literally destroyed specifically because no one wants to talk about their masturbation fantasies.