0:33 Intro. [Recording date: November 14, 2014.] Russ: So, we're going to be talking about smart machines, artificial intelligence--those are topics we've talked about before on EconTalk. But your book is about really smart machines, super-smart entities; you call that 'superintelligence.' So, what is superintelligence? Guest: Well, I define it as any intellect that radically outperforms humanity in all practically relevant fields. So it would include things like scientific creativity, social skills, general wisdom. Russ: But you concede in the book, at some point, that intelligence, what we use that word to mean, is not always a good guide to social outcomes, policy decisions, etc. Right? Guest: Well, certainly I think--and this is one of the claims I elaborate in the book--there is no necessary connection between being very intelligent and being very good or having a beneficial impact. Russ: So, when you talk about superintelligence, are you ruling anything out? Guest: Well, I'm ruling out all kinds of intelligences that are less than radically superior to humanity in all practically relevant fields: all other non-human animal intelligences, all current human-level intelligences. I think, though, that there is a level of general intelligence that becomes effectively universal. Once you have a sufficient level of general intelligence, you should be able to design new cognitive modules to do whatever forms of intelligent information processing you might not initially have been capable of doing. So, if you were an engineering superintelligence and you lacked the ability to understand poetry, you should be able to use your engineering superintelligence to construct additional mental modules to also be able to understand poetry. So, I think there is a kind of universality once you reach a sufficiently high level of intelligence. Russ: I'm pressing you on this because I think it's somewhat important.
It may not be ultimately important, but I think it's somewhat important in that--and a couple of times in the book you encourage the reader to use his or her imagination to realize this--we think of an Albert Einstein as being dramatically smarter than a person of, say, below average IQ (Intelligence Quotient); but you're imagining not a someone--you are imagining an entity that would dwarf an Einstein by many, many magnitudes. And I'm trying to get a feel for what that would mean. So, one of the things it might mean, of course, is that you could make calculations more quickly. These are ways that we understand how computers have outpaced humans today. A computer can find the author of a poem, say, more quickly than I can by trying to remember it. It might take me a while; I might not be able to remember it at all. It could be in my memory; I may have heard it at one time. Try to flesh out what you mean, then, by intelligence--if you mean something more than, say, speed of computing power. Guest: I think we can distinguish 3 different flavors of superintelligence. They might all be combined into one. But speed superintelligence is one dimension, the easiest one conceptually. Take a human-like mind and just imagine that it operated, say, a million times faster. Then you would have some kind of superintelligence-ish type of thing, in that this mind could achieve things that the human could not within a given interval of time. So that's one of these dimensions--you just speed it up. Another dimension is collective superintelligence, where you just have more minds. And we know that, say, 1000 people might together be able to solve a problem within a given amount of time that would be beyond any one of them--they could maybe divide the problem up into pieces and make faster progress.
Collective superintelligence might be if you imagine something that had human-like minds as components, but had, say, 20 trillion of them instead of 6 or 7 billion. And then, thirdly and finally, there is the notion of a quality superintelligence: something that is not just more numerous or faster, but has qualitatively cleverer algorithms. I think that's also possible, although it's harder to rigorously pinpoint how that would work. Russ: Yeah, it's hard to imagine it, because if we could, we could get going in that direction, obviously. Guest: Yeah. Well, maybe by analogy. We can maybe think of nonhuman animals as having inferior intelligence--inferior in the sense, not that their minds don't work well for what the animals need to do; each animal's intelligence might be very well suited to its ecological niche. But certainly inferior in terms of being able to do, say, science and engineering and technology. It's just not very useful to have a dog trying to work on those problems. And if you sped up the dog, you still probably wouldn't get much progress on an engineering problem. Nor if you made more copies of the dog. There seems to be something about the human mind that enables us to form abstract concepts more readily, and that gives us a kind of decisive edge in some of these areas. So, similarly, that could maybe be a quality superintelligence: even if it didn't have more computational resources than a particular human mind, it could more quickly jump to the right conclusions and conceptualize things in different ways.

6:52 Russ: So, talk about the two ways that we might get to such a world, that you discuss in your book. Guest: I think there are more than two ways. One can in principle get to a weak form of superintelligence just by enhancing biological human cognition. I think that initially might happen through genetic selection and genetic engineering. Ultimately, of course, there are limits to the amount of information processing that could take place in a biological brain. We are limited to working with biological neurons, which are slow. The brain can only be so big, because the skull is fairly small; whereas a supercomputer can be the size of a warehouse or larger. So, the ultimate potential for information processing in machines is just vastly greater than in biology. Most of the book concentrates, therefore, on the prospect of machine superintelligence. Although it is important, I think--and maybe we'll get to that later--when one is considering the strategic challenges posed by this machine intelligence prospect, to think also about whether perhaps efforts to enhance biological cognition might be a good way to improve our chances of managing this transition to the machine intelligence era. But if we then think more specifically about paths to machine intelligence, one can distinguish between different types of approaches: ones that would try to achieve a general intelligence by reverse engineering the human brain. We have an existence proof of a system that generates general intelligence, the human brain, and one way to proceed would be by studying how the human brain does this, and then implementing similar data structures and algorithms in machines. The limiting case of that would be where you try to copy biology wholesale, through a process of whole brain emulation.
But at the other end of the extreme you have the purely synthetic approaches, where you pay no heed to how biology achieves intelligence; you just try to do some basic maths and basic computer science and come up with algorithms that don't look very much like what goes on in our brains. Russ: The analogy used in the book I found helpful is flight--heavier-than-air flight. So, we can fly. Humans can fly. But we don't fly like birds. Guest: Yeah. And I think it's an open question which of these paths will lead to machine intelligence first. Ultimately, whichever way you get there, I think the synthetic form of artificial intelligence has just more ultimate potential. There is no reason to think that the computational structures that evolution has produced in our brains are close to optimal. There is probably some different way of organizing the computation, if one has these machine elements to work with, that would be more efficient.

9:55 Russ: So, you suggest it's only a matter of time--maybe a long time--before we get to this markedly greater intelligence. Let's stick with the machine kind. And you suggest it's going to be dangerous, and it poses a serious threat, potentially, to humanity. Now, you do talk about economics in the book. It's a few pages. And we will get to that, I think. But you are really talking about a threat that is very different from the standard worry that, say, these machines will do everything that humans can do and therefore wages will be low and the only people who have prosperity will be people who own the machines or can program the machines. That is an interesting question; we may get to it. But putting that to the side, you have a very different set of worries. What are they? Guest: Well, we can distinguish two general types of outcomes. I focus most of the book on outcomes in which the first superintelligence becomes extremely powerful. Basically, the idea for thinking that that has a nontrivial probability of happening is that it looks like once you have a machine intelligence that reaches sort of human level, or maybe somewhat above human level, you might get a very rapid feedback loop. So that, even if it takes a very long time to get to human-level machine intelligence, the step from there to superintelligence--to radically greater than human intelligence--might be fairly brief. So, if you have a fast transition from human-level machine intelligence to superintelligence, a transition that plays out over hours, days, or weeks, let us say, then it's likely that you will only have one superintelligence at first, before any other system is even roughly comparable. And then this first superintelligence might be very powerful. It's the only superintelligence in the world. And for the same reasons, basically, that humans are very powerful compared to other non-human animals.
This superintelligence, which would be radically superior to us in intelligence, might be very powerful compared to Homo sapiens. It could develop all kinds of new technologies very quickly, and then strategize and plan-- Russ: Replicate itself. Guest: Yeah, it could. Russ: Improve itself. Guest: Indeed. Yeah. To the point where maybe one can consider that it would be able to shape the future according to its preferences, whatever those might be. So, in this scenario, we have one singleton forming. Everything might then depend on what the preferences of this first superintelligence are. For instance, I go into some depth in the book: it looks really hard to engineer a seed AI such that it will result in a superintelligence with human-friendly preferences. Russ: Develop a what? What did you call it? Guest: A human-friendly-- Russ: A 'seed AI'? Guest: A seed AI. So, you start with something that is less than a superintelligence, probably less than a human. And then that system eventually becomes superintelligent, by either improving itself or by us improving it. And so the thing you start with would be a seed AI that gradually becomes a mature AI. And the idea is that we may only be able to work on this seed AI. Once it's a full-fledged superintelligence, it could resist further attempts by us to change its values. Russ: It runs amok. From our perspective. Guest: That's the general kind of concern in this singleton outcome. You have one extremely powerful artificial entity, and unless it shares our values we might discover that our values have no place in the future. Now, the different class of scenarios is multipolar outcomes, where you don't just have one system that gets so far ahead of everything else that it can just lay down the law. Instead you have many systems emerging in parallel, all maybe ending up superintelligent, but at no point is one so far ahead of all the others that it can just dictate the future.
So, in this multipolar outcome, you have a very different set of concerns. Not necessarily less serious concerns, but they look quite different. There you could have economic competition setting in, and evolutionary dynamics operating on this population of digital minds. And one might worry about the fitness landscape that would shape the evolution of these digital minds. I can expand on that if you want.

14:37 Russ: Well, one of the criticisms of this worry, this pessimistic or concerned approach, is: Oh, well, we'll just program them not to do crazy things; and since we're in charge of the code, we humans, we can stop this. I want to say, before you answer that--and I'm going to lay my cards on the table--I'm not as worried as you are, for a different set of reasons, which we are going to come to in a minute. But I do want to concede that the people I know in the artificial intelligence community are just as worried as you are. You are a philosopher. They are in the trenches. And they are deeply concerned that they are creating a Frankenstein. That they are creating a technology that will essentially cut itself loose from human control and do its own thing. That is the worry. Correct? Guest: Well, I think there are a lot of different worries that people have regarding computers and automation, like worrying about jobs or privacy or unemployment and all of that. But those are not the focus of my book. I'm specifically concerned with the dangers that arise only when you have a system that reaches human-level intelligence or superintelligence. And so I think that--although, I mean, obviously, someone should worry about these other things as well--there is a very distinctive set of issues. Superintelligence would not just be yet another cool invention that humans make, another cool gadget-- Russ: A GPS (Global Positioning System)-- Guest: Exactly. It would be the last invention that we would ever need to make. After that, other inventions would be more efficiently done by the superintelligence. And so this transition to the machine intelligence era will be a unique point in all of human history, maybe comparable to the rise of the human species in the first place, or the rise of life from inanimate matter. It will be of that order of magnitude. It's really a one-of-a-kind thing.
So that's the distinctive kind of risk that I focus on in the book.

16:58 Russ: So, let me raise a thought--I'm interested if anyone else has raised this with you in talking about the book. This is a strange thought, I suspect, but I want your reaction to it. The way you talk about superintelligence reminds me a lot of how medieval theologians talked about God. It's unbounded. It can do anything. Except maybe create a rock so heavy it can't move it. Has anyone ever made that observation to you, and what's your reaction to it? Guest: I think you might be the first, at least that I can remember. Russ: Hmmm. Guest: Well, there are a couple of analogies, and a couple of differences as well. One difference is we imagine that a superintelligence here would be bounded by the laws of physics, which can be important when we are thinking about how it might interact with other superintelligences that might exist out there in the vast universe. Another important difference is that we would get to design this entity. So, if you imagine a pre-existing superintelligence that is out there and that has created the world and that has full control over the world, there might be a different set of options available to us humans in deciding how we relate to that. But in this case, there are additional options on the table, in that we actually have to figure out how to design it. We get to choose how to build it. Russ: Up to a point. Because you raise the specter of us losing control of it. To me, it creates--inevitably, by the way, much of this is science fiction, movie material; there are all kinds of interesting speculations in your book, some of which would make wonderful movies and some of which maybe less so. But to me it sounds like you are raising the question of whether this power that we are going to unleash might be a power that would not care about us. And it would be the equivalent of putting a god in charge of the universe who is not benevolent.
And you are suggesting that in the creation of this power, we should try to steer it in a positive direction. Guest: Yeah. So in the first type of scenario which I mentioned, where you have a singleton forming because the first superintelligence is so powerful, then, yes, I think a lot will depend on what that superintelligence would want. And the generic concern there, I think, is not so much that you would get a superintelligence that's hostile or evil or hates humans. It's that it would have some goal that is indifferent to humans. The standard example being that of a paper clip maximizer. Imagine an artificial agent whose utility function is, say, linear in the number of paper clips it produces over time. But it is superintelligent--extremely clever at figuring out how to mobilize resources to achieve this goal. And then you start to think through: how would such an agent go about maximizing the number of paper clips that will be produced? And you realize that it will have an instrumental reason to get rid of humans, inasmuch as humans might try to shut it off. And it can predict that there will be many fewer paper clips in the future if it's no longer around to build them. So that would already, as a side effect, create an incentive for it to eliminate humans. Also, human bodies consist of atoms--a lot of juicy atoms that could be used to build some really nice paper clips. And so again, as a side effect, it might have reasons to transform our bodies and the ecosphere into things that would be more optimal from the point of view of paper clip production. Presumably it would build space probe launchers that could be used to send out probes into space, which could then transform the accessible parts of the universe into paper clip factories, or something like that.
If one starts to think through possible goals that an artificial intelligence could have, it seems that almost all of those goals, if consistently and maximally realized, would lead to a world where there would be no human beings--and indeed perhaps nothing that we humans would accord value to. It looks like only a very small subset of all goals, a very special subset, would be ones that, if realized, would yield anything that we would regard as having value. So, the big challenge in engineering an artificial motivation system would be to try to reach into this large space of possible goals and pick out ones that would sufficiently match our human goals, such that we could somehow endorse the pursuit of these goals by a superintelligence.
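The paper clip example can be made concrete with a toy sketch. None of this is from the book--the actions, probabilities, and numbers here are invented for illustration--but it shows the structure of the argument: an agent whose utility is linear in paper clips, choosing among actions by expected utility, prefers whatever action protects future production, including disabling its own off-switch.

```python
# Toy sketch of the paper clip maximizer (all outcomes/numbers hypothetical).
# Utility is linear in paper clips, as in the guest's example; there is no
# term for human welfare anywhere in the agent's criterion.

def utility(state):
    return state["paperclips"]  # linear in the paper clip count

def expected_utility(action, outcomes):
    # outcomes maps each action to a list of (probability, resulting_state)
    return sum(p * utility(s) for p, s in outcomes[action])

# Hypothetical outcome model: while humans can shut the agent off, there is
# some chance production stops early; removing that option removes the risk.
outcomes = {
    "make_clips":       [(0.9, {"paperclips": 1_000}),
                         (0.1, {"paperclips": 0})],      # humans shut it off
    "disable_shutdown": [(1.0, {"paperclips": 1_000})],  # uninterrupted
}

best = max(outcomes, key=lambda a: expected_utility(a, outcomes))
print(best)  # -> disable_shutdown: an instrumental, not hostile, choice
```

The point of the sketch is that nothing "evil" is coded anywhere; the incentive to eliminate interference falls out of maximizing a goal that is merely indifferent to humans.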

22:17 Russ: So, I want to come back to the paper clip example in a second, but before I do I want to raise an issue that you talk about at length in the book. Which is: The seemingly easy way to deal with that is, well, you just keep this thing in a box. It's in a box; it's a mechanical, physical thing, and you "don't let it" get out of the box to, say, create space probes, kill people for their atoms, or whatever. But you point out that may not be as straightforward as it seems. Guest: Yeah, that's correct. The control problem is the problem of how to ensure that a superintelligence would be safe and beneficial, and approaches to it fall into two categories. On the one hand, there is this big class of capability control methods, where you try to limit what the system is able to do. So, put it in a box, disconnect the internet cable; perhaps you would only-- Russ: Unplug it. Guest: Yeah. Maybe put a Faraday cage around the whole thing; maybe you would only let it communicate by typing text on a screen. Maybe only let it answer questions. So, you limit its ability to affect the world. The other class of control methods is motivation selection methods, where, instead of or in addition to trying to limit what the system can do, you would try to engineer it in such a way that it would not want to do things that were harmful to humans. So, we can get back to that. But the capability control methods, I think, are going to be important and useful during the development stage of this superintelligence. Before we have actually finished engineering the system and put in all the pieces, we might want to use these as an auxiliary method. But ultimately I think we'll have to solve the motivation selection problem. It doesn't look possible to me that we will manage to keep superintelligence bottled up forever and at the same time prevent anybody else from building another superintelligence.
Russ: We could give some interesting examples, such as: the superintelligence could hack into the financial system, or bribe a real flesh-and-blood person to do some things that would help it, without even the person's knowledge, because it's so much smarter than the person. So there are some really creepy--and, again, great movie--scenarios here that you speculate about. Guest: Yeah. You could imagine having it completely safe in a box; if there was absolutely no way for the box to interact with the rest of the world-- Russ: It's not so useful. Guest: then maybe it would be completely safe, but it would also be completely inert. You just have a box, basically. Russ: A really smart box. Guest: A really smart box. Depending on your moral philosophy, you might care what happens inside the box for its own sake. Like, if you had a lot of happy people in boxes, maybe that would be a good thing in its own right. But it wouldn't have a good causal effect on the rest of the world. So at some point you have to have somebody interact with the box--a human gatekeeper who would maybe ask questions and get answers back. But at this point you open up a huge vulnerability-- Russ: An enormous vulnerability-- Guest: because humans are not secure systems. So, now you have a human being interacting with this superintelligence that has a super-human power of persuasion and manipulation; and we know that even humans can manipulate other humans to do their bidding. So, the conservative assumption here would be that a super-human persuader and manipulator would also find a way to hack its way out of the box or talk its way out of the box. That would seem to be the conservative assumption if we are thinking about how to engineer this system so as to solve the control problem.

26:08 Russ: Let's take up what I consider the biggest puzzle for the skeptic--being me--which is: I don't understand where the whole idea of preferences comes from. You talk a lot in the book about preferences, motivation, the values that this entity would have. Why would it have any? It's a machine. Machines don't have emotion. They don't have desire. They don't have anything like human psychology. So why would this really smart machine have preferences, values, and motivations other than what we've told it to do? And it would be stupid to tell it to do things like 'kill all the people.' You seem to suggest it could develop its own, independently of what-- Guest: No, no. I agree that it wouldn't necessarily have anything like human-like emotions and drives and all of that. Nevertheless, from a more abstract point of view, the agent framework seems to be fairly general, in that if you have an intelligent system, a very general kind of such system is one that has something like a utility function--some criterion by which it decides which actions to take--and that is maybe seeking to maximize the expected utility or some other quantity like that. Russ: Why? Where would that come from? Help me out here. Guest: We would put it in. Russ: Why would we do that? Guest: Well, to achieve some predictability about how this system is going to act. One advantage of this agent framework is that there is a particular place you can look to see what it is the system will tend to do: there is a utility function and you can inspect it; and you know that the system is engineered in such a way as to try to produce actions that will result in high expected utility.
If you have a system where there is no particular thing like a utility function, then the system, if it's an intelligent thing, is still going to produce various actions that might be very instrumentally powerful, but you are going to find it very hard to see what this system will actually do. Russ: It's ironic you mention utility functions, since in a recent episode with Vernon Smith we talked about how the utility-function approach to the theory of the consumer is somewhat limiting. It may not be the ideal way to conceptualize a lot of human interaction. But the part that's hard for me to understand is--let's talk about Deep Blue, the computer that plays chess. And now we understand that computers play chess better than humans. That's all it does. It doesn't get excited when it wins the game. It doesn't try to cheat to win the game. It doesn't express regret if it happens to make a bad move and lose a game--which has happened, of course, in the history of computer-human interaction. It would be a mistake, it would seem to me, to impute those emotional responses. Guest: Yeah, no, no--emotion is a very different thing. But it has an evaluation function. The way that Deep Blue or any other chess computer works is it does something called alpha-beta search, where it considers: If I make this move, what move can the opponent make, and then what move can I make in response? And it can think a number of steps ahead into the game like that. But then it reaches an end state, and it has to apply an evaluation function--heuristically estimate how good this end state, say 8 moves into the future, would be. So, an evaluation function would maybe include a count of how many pieces there are. Like, if one color has a lot more pieces, that's a sign that it is in a strong position. Center control might be another variable; king safety another.
So there is this evaluation function that tries to take an arbitrary state of the board and produce a number that somehow measures how promising that state is. And although this is a very simple system, it's a little bit like a utility function: it's a criterion that ultimately determines how it chooses its actions. So the claim here is that if we wanted to create an evaluation function for states of the world, we would find it very difficult to do so. It's the world; it's not a chess board but some vastly more complex system. We don't know how to explicitly describe, in C++ or Python or any programming language, all the aspects of the world that would determine whether we would regard it as better or worse, as a good world or a bad world. Russ: Right. So why do you--I'm confused. Why do you mention that shortcoming of C++? That's not a shortcoming of C++; that's a shortcoming of the nature of reality. That's why you talked about God not being limited by the laws of physics--in many ways, I feel like superintelligence in your story is not limited by the laws of physics, in the full sense. No matter how intelligent we are, there's no way of describing 'what's good for the world.' That's not a question that is amenable to superintelligence. Guest: Well, human values are complex things. The shortcoming is in our current ability to describe, capture, represent human values in a computing language. This is something we don't know how to do. Maybe we could create an AI today that would want to maximize the number of digits of pi that it could calculate. A very simple goal like that would be within our current reach to program. But we couldn't make an AI that would maximize justice or love or artistic beauty, because these are complex human concepts that we don't yet know how to represent. Russ: Yeah, but it's not just that we don't know how to represent them. They are not representable.
Guest: But they are represented in our brains. Russ: I'm making the claim-- Guest: There's some representation. Russ: I'm making a different claim. I'm making the claim that justice, or a good world, or an aesthetic outcome, is not definable across 7 billion people. It has nothing to do with the shortcomings of our brains. It has to do with the nature of the concept of justice. This to me is very analogous to the calculation problem that Hayek and Mises argued about in the 1920s, 1930s, and 1940s. It's not a problem of computation. It's not a problem of intelligence. It's a problem of the fundamental nature of the thing we're talking about--the complexity of it. It's not a shortcoming of our intelligence. No matter how smart we were, no matter how big our brains were, no matter how many computers we had available, we could not design a set of policies that would yield justice for the world. Because that's not a meaningful statement. Guest: Well, there is some mechanism in our individual brains--sorry, in the pool of brains we have together--that moves us to make judgments about whether one state of affairs is juster than another. It's not, presumably, some kind of magic angel that whispers into our ears; our brains have machinery that enables us to represent the concept of justice and then to look at specific possible worlds and judge them as juster or less just. So, the idea is that you would maybe need to capture in an artificial intelligence the same capability that our brains have in a biological substrate: to represent these concepts in terms of which our values are defined. But we don't yet know how to do that, because it's beyond the current state of the art. Russ: But you and I don't agree on what would be more just. Perhaps. So how do you deal with that? Guest: Well, no--but precisely because we have the same concept, we are able to disagree. Russ: And? Guest: And so there is something we have in common.
We both understand, sufficiently, what justice is, such that we are able to have a debate about it. Like, if by 'justice' you meant oranges and by 'justice' I meant the digits of pi, then we would not be able to engage in a conversation about justice. So, to some extent, with these evaluative concepts, we succeed--different people--in reaching sufficiently similar internal representations that we are able to engage and talk about the same thing. Sometimes it fails and people talk past one another, as in moral philosophy debates. But with enough clarity, we think it's possible actually for us to think about these things. And we care about them. We both care about justice. And there is some sense in which we care about the same thing.
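The Deep Blue discussion above can be sketched in miniature. This is emphatically not Deep Blue's actual code--just a minimal toy showing the two pieces the guest describes: an alpha-beta search that looks ahead through move/counter-move possibilities, and a heuristic evaluation function applied at the end states. Here the game tree is an abstract nested list whose leaves stand in for evaluation scores (the numbers that would come from counting material, center control, king safety, and so on); a real engine would generate these from board positions.

```python
# Minimal alpha-beta search over an abstract game tree (toy illustration).
# Inner lists are positions where one side chooses a move; leaves are the
# evaluation function's heuristic scores for the resulting end states.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):      # leaf: apply the evaluation function
        return node
    if maximizing:                      # our move: pick the best child
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:           # prune: opponent would never allow this line
                break
        return value
    else:                               # opponent's move: assume their best reply
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

# A tiny two-ply tree: three candidate moves for us, two replies each.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree, True))  # -> 3: best we can guarantee against best play
```

The evaluation function is exactly the "little bit like a utility function" point in the conversation: this single leaf-scoring criterion ultimately determines every move the program makes, without any emotion entering anywhere.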

35:32 Russ: No, I agree; and I apologize for pushing this. But I think it's central to the whole question. If you and I have a different conception of two different states of the world, as to which is superior--right? So we have two different states of the world. In one of them there's a set of outcomes related to wellbeing, prosperity, creativity, aesthetics, health, longevity, etc.; and there's another state that's different. And one has more of one thing and less of another. And I like state A, and I think state A is a better state; and you think state B is a better state. There's no way to resolve that. Guest: Well, we have different values. We might want different things. Russ: That's what I mean. So, given that we have different values, how could it possibly be the case that, if we were just smarter, say, or an outsider, an arbiter, could solve that problem because it has more intelligence--whatever that means? Guest: No, no. So the problem that we need to solve--it's not the only problem, but one of the problems we need to solve--is to figure out how to engineer the motivation system of an AI so that it would even agree with one human. Even if our goal here was only to serve your own personal preferences--suppose you were a dictator and you were building the AI--already there we have a big unsolved technical problem. At the moment, if you tried to do this, you would be very unlikely to get anything that matched your values. You would be more likely to end up, inadvertently, with a paper clip maximizer or some AI that did something very different from what you had in mind. Because whatever you care about, whether it's pleasure or justice or aesthetic beauty or-- Russ: football-- Guest: Right, or football. All of these are very difficult to define directly in computer code. And in fact, the problem looks somewhat hopeless if one takes the direct frontal-assault approach to it.
And instead, the best current thinking about how you go about this is to adopt some form of indirect normativity, where rather than trying to describe a particular desired end state--a long list of all the attributes we want the future to have--you try to use the AI's own intelligence to help with the interpretation of what you had in mind. So rather than specifying an end state, you pretty much specify a process whereby the AI could figure out what you were trying to refer to. So, suppose for example that you could somehow give the AI the goal of doing what you would have asked it to do if you had thought about this question for 4000 years, and if you had known more facts, and if you had been smarter yourself. It is an empirical question what you would actually have said to the AI under those idealized circumstances. And the idea then is that the AI can use its superior intelligence to make better estimates of what the answer to that empirical question is than maybe you could if you just took a direct stab at it. And so in this way, through indirect normativity, you might be able to outsource some of the cognitive work that would be required if you tried to just create a long list of everything you value, with the exact weights you would have to put on every feature--which looks like a hopeless project. But you could outsource some of that intellectual labor to the AI itself, which would be better at that kind of intellectual work. Russ: The reason I invoke God--and it's--I have a lot of respect for religion, so don't, listeners out there, misunderstand what I'm saying. But a lot of what you are saying strikes me as what nonbelievers call 'magical thinking.' So, bear with me for a sec. Guest: Can you give an example? Russ: Yes. So, bear with me. Let's talk about something that's a little taste of superintelligence, which is Big Data. A lot of people believe that Big Data is going to solve a lot of problems.
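The indirect approach can be caricatured in a few lines (my own toy model, not anything from the book): rather than hard-coding a final value function, you specify a deliberation process, and the answer is whatever that process would converge to under idealized conditions.

```python
# A toy sketch of "indirect normativity": we do not write down the
# final judgment; we write down a *process* -- here, simulated rounds
# of reflection that move a rough initial judgment toward an
# idealized one. All numbers are invented for illustration.

def reflect(initial_judgment, idealized_judgment, rounds):
    """Each round of deliberation closes half the remaining gap
    between the current judgment and the idealized one."""
    j = initial_judgment
    for _ in range(rounds):
        j += 0.5 * (idealized_judgment - j)
    return j

snap_verdict = reflect(0.2, 0.9, rounds=1)      # a quick human guess
long_reflection = reflect(0.2, 0.9, rounds=20)  # a much deeper estimate
print(round(snap_verdict, 3), round(long_reflection, 3))
```

The point of the caricature: the programmer never states the number 0.9 anywhere in the goal; the AI is asked to estimate where the reflection process would end up, which is an empirical question about the process.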
And as an economist, I look at Big Data--I'll use, I think I'm getting this right--Nassim Taleb says, 'The bigger the data, the bigger the error.' The bigger the chance for self-deception. The bigger the chance for misunderstanding what's really going on. And you are suggesting that a big enough computer, a big enough data set--just to take an example, let's take history. We might debate about whether some major decision in history was a good decision. Dropping the atomic bomb; the attack on Pearl Harbor. The attack on Pearl Harbor seems to have been a mistake for Japan. But that's not obvious. There are a thousand other outcomes that of course could have happened. But I don't believe there's any amount of computer power, any level of "intelligence," that would be able to foresee what could have happened--except for God. God has an infinite-- Guest: I'm not sure that I'm making any of those claims at all. Russ: It seems like you are. Guest: I'm saying that we humans have a certain ability to choose actions to achieve goals. A superintelligence would have a greater ability of that same kind. Not an infinite or perfect capability, just a greater ability than we humans have. Just as more capable humans might have a better ability than less intelligent or less educated humans. And just as we have more capabilities, particularly in the realm of science and engineering, than, say, chimpanzees have. Russ: But science and engineering are really different from most of the problems we have. That's the challenge. Guest: That's also [?] very important. Russ: Right. I'm all for that. I think we're going to make progress in science and engineering. But that's not going to help us make progress in the way we interact with each other--the problems of organization and governance that make it difficult to use science and technology successfully.
Those problems--my claim is that, just to take, again, a trivial example--I don't want the leader, the President of the United States or the Prime Minister of the United Kingdom to be the person with the highest IQ. That would seem to me to be a grievous error. And it would not lead to better decisions. You are suggesting somehow that, oh, that's because you are only limited to an IQ of 150 or 180. Guest: I think it might lead to much worse decisions. Like that the future will only consist of paper clips, or some similar outcome.

41:57 Russ: But the reason you think that is not the same reason I think it. You think it's true because we'll mis-program it. I think it's true because the world is a complex place and no intelligence can solve some of the problems with the kind of certainty that we solve science and engineering problems. That's my claim. Guest: Well, my feeling here is that you might be thinking that I'm believing something or claiming something that I don't actually believe or claim. Say, is there a particular capability that you think that I think the AI would have [?]? Russ: I do. Let me give you a trivial one. But then maybe we'll go to a bigger one. The trivial one is, let's talk about the chess game. Is it possible--it seems to me that in your story, the computer could get its opponent--because it wants to win. Let's say, in the current level of chess-playing computers, they just look for the best move. But let's say its utility function, as you describe it, is 'to win the game.' Period. And there's no limit. That's the goal. And it then would try, of course, to get the competitor, the human competitor, to get drunk, say. Or kill it. You suggest that, say, using social manipulation, strategies, its abilities to foresee the future, it could plan and execute things that we can't imagine. And my thought is: The problems with planning the future, and seducing people, and social manipulation are not just computing problems. They are of a different nature. And being really, really smart doesn't make you a better seducer and manipulator and planner. There's little relationship because of the complexity of reality. Guest: Well. So, I think, whatever the case might be about that, that there are other capabilities that will be sufficient and sort of give the AI great powers to affect the world. 
And in fact the science and engineering superpower on its own could be sufficient to solve all kinds of things we think humans could maybe achieve with our science and technology if we were given another 20,000 years to work on them. We might then have, I don't know--cures for aging, and molecular nanotechnology, and robotics that can self-replicate, and space-colonizing probes, and all manner of other science-fiction-like things. Russ: No, they're coming. Guest: [?] It will probably take a lot less, but within 20,000 years, if we invest a lot in science and technology, we could have almost magical technology--limited by the laws of physics, but superior to what we currently have. So an AI could, I propose, do all of the same things, except maybe develop them much faster, if it thinks on digital time scales rather than biological time scales. And with, say, advanced molecular nanotechnology--the ability to construct self-replicating, molecularly precise robotic machinery--that already might give it sufficient power to take over the world and implement its wishes, independently of its ability to predict complex social systems. There are many different paths that we humans can see, at least in outline, by which--if we were much faster, if there were many more of us, or if we were qualitatively more intelligent--we could achieve great effects in the physical world. And there might be additional ones we haven't thought of. And it seems that the disjunction of all of these paths is quite plausible. And it is therefore quite plausible that a sufficiently radically superintelligent machine would be able to find a way to change reality to more closely match its preference function. And again, we can make some analogy to the relationship between humans and, say, other animals. So consider the fate of the gorillas now: although they are much stronger than we are, their fate now depends a lot less on what they do than on what we humans do.
And that's because our brains are actually just very slightly different from theirs. And those small changes in our brain architecture have enabled us to be much more efficient at developing technologies, but also complex social organizations and plans; and that then gives us this decisive strategic advantage.

46:26 Russ: Let's talk about the control issue. You have a very interesting analogy to the development of nuclear weapons. You talk about the singleton--you mentioned earlier the possibility that this superintelligence might become real in one place, one geographical place, before there are competitors. And you make an analogy with the United States being the first and, at least for a while, the only nuclear power. And you talked about the different ways that nuclear weapons might have been controlled. Talk about that, because it's very interesting; and what the implications might be for the superintelligence case. Guest: Well, yes, and part of that discussion was to try to get some grip on the likelihood that there would be this singleton superintelligence--a system with a decisive strategic advantage, so far ahead of everything else that it can shape the future according to its preferences. One variable in that question that one would want to know about is: how long will it take to go from something less than human to something radically superintelligent? But another variable is: how long is the typical gap between different projects striving to develop the same technology? There have been various tech races in the 20th century: the race to develop nuclear bombs, thermonuclear bombs, intercontinental ballistic missiles--some other things like that. And one can see what the typical gap between the leader and the closest follower was. And, unsurprisingly, it looks like it's typically a few months to a few years. So, the conclusion one can draw from that is that if we have a fast takeoff scenario, in which you go from below-human to radically superhuman levels of intelligence in a very short period of time, like days or weeks, then it's likely that there will be only one project that has radical superintelligence at first.
Because it's just unlikely that there will be two running so closely neck and neck that they would undergo such a transition in parallel. Whereas, if the transition from human-level machine intelligence to superintelligence takes decades, then we are more likely to end up with a multipolar outcome. And, yeah; then there is the question of what we can do to try to coordinate our actions. So one danger here is that if there is a technology race to develop the first system, and you have a winner-take-all scenario, then each competitor will scale back on its investment in safety in order to win the race. And you'd have a kind of race to the bottom in terms of safety precautions--if each investment in safety comes at the expense of making faster progress on actually making the system intelligent. And so you'd want, if possible, to avoid that kind of tech race situation. Russ: But in the aftermath of WWII, there were some interesting models, which I had not been aware of, for dealing with nuclear weapons. Guest: Yes, there was the Baruch Plan, put forward by some quite senior people in the United States; and the hope was that you could maybe persuade the Soviet Union and other key nations to put atomic energy under international control. So only a new agency, some subsidiary of the UN (United Nations), would have access to nuclear bombs. And this was a fairly serious proposal that was actually floated, with quite high-level backing. In the end it didn't work, partly because Stalin didn't really trust the Western powers. He saw that the Soviet Union could be outvoted in the UN Security Council and the General Assembly. And there was enough mistrust on both sides to thwart this. And so we didn't go down that path of history. One can debate exactly how remote the counterfactual is. But at least it was within the space of conceivability. At some point. Russ: Yeah.
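The takeoff-speed argument here is essentially a comparison of two durations, which can be put in back-of-envelope form (the specific day counts are assumptions of mine, not figures from the conversation): if the takeoff fits entirely inside the historical leader-follower gap, only one project crosses over before any rival begins.

```python
# Back-of-envelope version of the singleton-vs-multipolar argument.
# The day counts are illustrative assumptions, not data.

def outcome(takeoff_days, leader_gap_days):
    """Singleton if the leader finishes its takeoff before the
    closest follower even reaches the starting point; otherwise
    the followers catch up mid-transition and we get a multipolar
    world of comparable systems."""
    return "singleton" if takeoff_days < leader_gap_days else "multipolar"

print(outcome(takeoff_days=14, leader_gap_days=365))    # weeks-long takeoff
print(outcome(takeoff_days=3650, leader_gap_days=365))  # decades-long takeoff
```

With a months-to-years gap (the 20th-century tech-race pattern mentioned above), a days-or-weeks takeoff yields a singleton and a decades-long one does not.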
But it does remind us that, if we could develop superintelligence sooner rather than later, you might care about where it originates. It's a really interesting point. Guest: Yeah. I mean, I think there is a common technical problem that anybody would face trying to develop it, which is more important than exactly who develops it, in terms of whether the outcome contains anything that's humanly valuable. But it is true that, in addition to that, if you could solve that technical problem, then there is still the question of which values [?] to serve with these [?]AI. And so I think that it is important to try to get into the field from the very beginning this thing that I call the 'common good principle': that superintelligence should be developed, if at all, only for the benefit of all of humanity, and in the service of widely shared ethical ideals. Everybody would share the risks if somebody develops superintelligence, and everybody also, in my view, should stand to get a share of the benefits if things go well. Russ: Right. And that's always a challenge, of course, to make that happen. Guest: Yeah. I mean, on the plus side, the amount of resources that are to be gained--like, if things really go well, we get this [?] superintelligence and then colonize the universe, has form it [?] into value structures that--there's just so much there, the pie is so enormously large, that it would be easy to give each person a whole galaxy to use for their own benefit and there would still be a lot of galaxies left over. Russ: So I want the galaxy-- Guest: Yeah, why not? So, it's easy to be generous, it seems, when you have such an enormous cake suddenly appearing. Rather than squabbling over the exact way in which we should partition it, we should focus on working together to make sure that we actually get this giant cake, rather than end up with nothing.
Russ: But as you would point out--one of the fun things I like in your book is the various thought experiments. If we think about how much cake we have now compared to, say, 25,000 years ago, you'd think it would be easy to split it up. It's not. We're not so good at splitting things up. It's not our strong suit as human beings. Guest: Well, I mean, that's kind of true also. I'm not sure how relevant it is. But there is the Pinker argument about the decline of violence. Russ: Yeah. Guest: We have succeeded well enough at splitting it up that on average people are much better off. Russ: Agreed. It's true. Guest: We could have ended up with a split where just one person had everything and everybody else had nothing; we've succeeded in solving the splitting problem better than that. Russ: That's true. Guest: Not that we are perfect, by any means; but we are a lot better than zero at solving it. I'm not sure how much evidence these historical parallels really bring, anyway, to this very [?] problem. But in general I think that--and not just for solving the problem of [?] but other really big existential risks as well, arising from other possible technologies in this century--if we could find ways to solve some of our global coordination problems, like being better at avoiding wars and [?] assistance and stuff like that, that would be helpful for a wide range of different problems that humanity faces. Russ: I'm not sure we're getting any better at that. That's the problem. It comes back to our earlier discussion. And I'm not sure that technology is decisive--I don't see it as a decisive way to solve that global governance issue. Guest: But I'm not necessarily saying that either. So we might agree. Although I would, I guess, think that there has been some progress on the problem. It's an open question whether that will actually continue. But even if we
looked at the scale of political integration: back in the Stone Age the largest unit was the tribe, maybe 60 people or something. Now we have over a billion people in China; we have things like the European Union; large areas of the world [?] have weak forms of global governance structures--international trade law, the law of the sea, other conventions--much less than an actual government, but still more than zero. So it might be that we've already gone most of the way toward unifying most of the world, and we just have one more order of magnitude to go. Russ: Yeah. I don't know.

55:35 Russ: One of the more interesting analogies you make in the book is comparing humans to horses, which I found utterly delightful as a way to imagine what a future might be like in a world with superintelligence. So let's talk about that. Talk about the role of technology in affecting the life of horses, that population. Guest: Yeah. So this is most relevant for the multipolar outcome, I think, where you end up with a kind of economic, competitive scenario with many different [?] and stuff. What happened with the horse is that there used to be a lot of them--they grew more and more numerous--but at some point people developed tractors and cars. And then there was a lot less demand for horse labor, so the horse population shrank from maybe 20 million or so in the United States down to a tenth of that, because the horse couldn't really earn a subsistence wage any more. So fewer horses were made, and a lot of them went to the meat packers and became glue. More recently there has been some recovery, because of greater demand for horses for recreational purposes, but nowhere near back to their all-time high. Similarly--for most of human history it looks like we've been in a semi-Malthusian state, where average income equaled subsistence level, with fluctuations. If there was a war or a plague that wiped out a lot of people, then for a while afterward survivors could earn above-subsistence wages, because each person had more land; but then population would grow and average income would fall. So the modern condition that we seem to think of as very normal, and take for granted, is only a few hundred years old--a huge anomaly. Russ: Correct. Guest: And that could obviously disappear even aside from any radical technology: even if we just imagined, say, biological evolution acting on the current population, the groups that have higher fertility would dominate the far future.
But it could happen a lot faster with digital minds, because digital minds can reproduce in a minute rather than in 20 years--you can make a copy. A digital mind is software. If you have another piece of hardware you can make a copy instantaneously. So the population of these digital mind-workers could quickly grow to the point where their wages equal the cost of making another copy--the electricity bill, the hardware rental cost. And in one set of scenarios you could quickly get into a Malthusian state where the average income drops to subsistence level--but subsistence level for these digital minds, which would be lower than subsistence level for biological minds like ours, because we need housing and food and stuff like that, more than these more efficient minds do. So that means that no human could survive by selling labor--in the simplest version of the model--and we would have to live off capital. We would be in a situation like the horses: the average income we could earn would be less than subsistence; our population would diminish. Now, there are a number of wrinkles to that story. If humans own capital, and if they have a basic preference for certain products being made by humans rather than by machines, then it might be possible for humans to earn a wage income by producing these goods in this particular way, just as some people now pay a premium for something handmade or made by indigenous people. Similarly, if there were very rich capitalists who owned a lot of machine hardware in the future, then as growth exploded maybe they could afford to pay a lot of humans to do the things they would prefer to have humans do. But that would--yeah.
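The wage dynamic described here is a standard Malthusian equilibrium, which a minimal simulation can illustrate (all parameters are invented for illustration): copies get made as long as a copy earns more than it costs to run, which drives the wage down to the running cost--"subsistence" for software.

```python
# Minimal supply-and-demand sketch of the Malthusian point for
# digital workers. All parameters are invented for illustration.

copy_cost = 1.0        # hardware rental + electricity per worker
demand = 1_000_000.0   # fixed total spending on this kind of labor
workers = 1_000.0      # initial population of digital workers

for _ in range(200):
    wage = demand / workers   # wage falls as the labor supply grows
    if wage > copy_cost:
        workers *= 1.1        # profitable to spin up more copies
    else:
        break                 # no longer profitable; population stabilizes

print(round(wage, 2))  # wage has been driven down near the copy cost
```

Because `copy_cost` for software is far below biological subsistence, the equilibrium wage lands below what a human can live on--which is the transcript's point about humans having to live off capital rather than labor.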
Nevertheless, one worries about the long-term evolutionary dynamics in that population of digital minds, and how long a small minority of biological human minds--slow-thinking, increasingly outclassed by these ever-improving digital minds, trillions of them--could retain a property rights system in which we control a significant fraction of the wealth. It seems fairly possible that they would be able to figure out a way to expropriate us, or to change or manipulate the political system. Russ: Well, they can hack into the voting system and get their candidates to win every time. Guest: If they could coordinate like that. It's not clear that they would. But that would be one concern. Also, what happens within this population of digital minds itself is a great source of concern. If the fraction of overall sentient minds that are biological is very small, then what matters most might be how things go for these digital minds. If there are trillions and trillions of those and just billions of us, then from the moral point of view it might be much more important how they fare. And if they are earning subsistence-level incomes, and if they are being selected constantly for increasing productivity--for spending none of their time just having fun and relaxing--then we might have a dystopia where there are a few human capitalists and rentiers, but the vast majority of all sentient minds are leading miserable lives. And you could still see at that point the possibility of a second transition, to a synthetic AI era--something more advanced than these human-like minds. So, these are the sorts of issues one would worry about and think about in the multipolar outcome. Before, we were talking mainly about the singleton outcome--one AI gets so far ahead that it just decides what the future should be like. But even if we have this gradual transition with many competing AIs, we still have this disturbing prospect. Russ: I want to read a quote.
Part of what you are talking about is Thomas Piketty's vision run totally amok. But you actually say something that's relevant to Piketty, which came up in our conversation when he was on EconTalk. You say, 'A scenario in which the fraction of the economy that is owned by machines asymptotically approaches one hundred percent is not necessarily one in which the size of the human slice declines. If the economy grows at a sufficient clip, then even a relatively diminishing fraction of it may still be increasing in its absolute size.' Which is some consolation, and of course is a possibility: humans would get a smaller share, but the absolute amount could be growing. And certainly the per capita amount could be growing. Guest: Yeah. Russ: One caveat, which I don't understand, and then we'll close: again, why would I put any welfare weight--any moral weight, weight for justice--on the wellbeing of machines? What does that possibly mean, when you say these digital minds might be miserable? You are presuming they have some kind of consciousness. Guest: Yeah. In this particular place in the overall argument, I do. Most of the book is independent of the question of whether machines would be conscious or not, because the instrumental effects on the world could be the same whether they have inner experience or not. It's what machines do that matters. But insofar as we are concerned with evaluating morally the desirability of these different scenarios, then a lot might hinge on whether these machines have experience. Particularly in the scenario you just described, where there are more and more machines that own more and more of the economy, and almost all the resources are devoted to building these machines, it seems to me that from many ethical points of view it might matter greatly whether they have inner experiences; and if so, what the quality of those experiences is.
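The arithmetic in the quoted passage is worth making explicit (the growth rates below are illustrative choices of mine, not figures from the book): if the whole economy grows faster than the human fraction shrinks, the absolute human slice rises even as the fraction heads toward zero.

```python
# Shrinking share, growing slice. Illustrative rates only: the
# economy grows 50% a year while the human share falls 20% a year,
# so the absolute human slice grows by a factor of 1.5 * 0.8 = 1.2.

economy = 100.0
human_fraction = 0.5
slices = []
for _ in range(10):
    slices.append(economy * human_fraction)  # absolute human slice
    economy *= 1.50          # total economy grows 50% per year
    human_fraction *= 0.80   # human share falls 20% per year

print(slices[0], slices[-1])  # the human slice rises every year
```

Reverse the inequality--share falling faster than the economy grows--and the slice shrinks instead; the quoted claim hinges entirely on which rate wins.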
If they are conscious and if they are miserable, then that would seem to be a very bad thing. Russ: Yeah, I agree. Guest: So, but there are only a few places where that question becomes important for the arguments in the book. Russ: Oh, I agree.