(Last updated Feb. 11, 2015.)

What could an economics graduate student do to improve our strategic picture of superintelligence? What about a computer science professor? A policy analyst at RAND? A program director at IARPA?

In the last chapter of Superintelligence, Nick Bostrom writes:

We find ourselves in a thicket of strategic complexity and surrounded by a dense mist of uncertainty. Though many considerations have been discerned, their details and interrelationships remain unclear and iffy — and there might be other factors we have not thought of yet. How should we act in this predicament? … Against a backdrop of perplexity and uncertainty, [strategic] analysis stands out as being of particularly high expected value. Illumination of our strategic situation would help us target subsequent interventions more effectively. Strategic analysis is especially needful when we are radically uncertain not just about some detail of some peripheral matter but about the cardinal qualities of the central things. For many key parameters, we are radically uncertain even about their sign… The hunt for crucial considerations… will often require crisscrossing the boundaries between different academic disciplines and other fields of knowledge.

Bostrom does not, however, provide a list of specific research projects that could illuminate our strategic situation and thereby “help us target subsequent interventions more effectively.”

Below is my personal list of studies which could illuminate our strategic situation with regard to superintelligence. I’m hosting it on my personal site rather than MIRI’s blog to make it clear that this is not “MIRI’s official list of project ideas.” Other researchers at MIRI would, I’m sure, put together a different list.

I should also note that, alongside this “strategic” work, there is direct technical work on AGI safety to be done. In fact, that’s what MIRI focuses on, for reasons partially enumerated here.

I’ll keep adding to this list as time passes, but I’ll preserve project numbering so the ideas can be referred to easily and stably, e.g. “CS project 14” or “Psych project 4.” If listed projects are completed or no longer seem valuable, I’ll “cross them off” rather than deleting them from the list and thereby changing the project numbering.

Each project description below is merely a seed idea for a project; I assume published studies will differ substantially from my descriptions, depending on the judgments and affordances and creativity of each investigator.

If carried out, these studies could be published as papers, reports, dissertations, or books. Some of them could be very large in scope, others could be quite small, and most of them could be tweaked in various ways to be made more or less ambitious.

Most of the project ideas below are of broad interest, but also have implications for superintelligence strategy — implications which could be spelled out in the study itself, or not, depending on the limitations of the publication venue.

I’ve described each seed idea in a single paragraph, but each project can be described in substantially more detail if someone who would credibly carry out the study asks for more detail. Here are some examples of elaborated project idea descriptions (2-5 pages), including potential research methods, comparison studies, publishing venues, and expert consultants:

I couldn’t find a “natural” way to organize all these project ideas, so I chose what seemed like the least-terrible option and organized them by home field of the project’s most plausible publication venues. Many of these projects are highly interdisciplinary, but I’ve tried to guess at which publication venues were the most plausible candidates (if the study results were written up in a paper), and which field of inquiry those venues were perceived as “belonging to.” Still, this often involved fairly arbitrary guesswork, so e.g. if you’re a computer scientist then please also check the list of projects under the ‘economics’ heading, in case you find a project there that strikes your fancy and can indeed be published in a venue on the border of computer science and economics.

Finally: for brevity’s sake, I make substantial use of field-specific jargon. The list below might make more sense after you’ve read Superintelligence.

Okay, on to my list of quick-and-dirty research project seed ideas.

Computer science

Another survey of AI scientists’ estimates on AGI timelines, takeoff speed, and likely social outcomes, with more respondents and a higher response rate than the best current survey, which is probably Müller & Bostrom (2014).

Survey AI subfield experts on rates of progress within their subfields. See project guide here.

How large is the field of AI currently? How many quality-adjusted researcher years, funding, and available computing resources per year? How big was the AI field in 1960, 1970, 1980, 1990, 2000, 2010? Given current trends, how large will the field be in 2020, 2030, 2040? Initial steps taken here.

How well does an AI system’s transparency to human inspection scale, using different kinds of architectures and methods? See here.

Can computational complexity theory place any interesting bounds on AI progress or AI takeoff speeds? (Dewey has looked into this some.)

Summarize the current capabilities and limitations of methods for gaining “high assurance” in autonomous and semi-autonomous systems, e.g. in hybrid systems control, formal verification, program synthesis, simplex architectures. Explain which extant methods seem most likely to tractably scale well (for systems that are more complex, more autonomous, more general than those we have now), and what work is needed to extend the most promising methods to handle those challenges.

Continue Grace (2013) in measuring rates of algorithmic improvement. Filter not for ease of data collection but for other properties of algorithms, for example economic significance.

Construct a first-step “map of mind design space.” Are there principal components? Where do human minds sit in that space relative to apes, dolphins, current AIs, future AIs, etc.? See also Yampolskiy (2014).

What are the nearest neighbors of narrow-AI “takeoff” that have actually occurred? Can they teach us anything about the plausibility of various AGI takeoff scenarios?

Produce an initial feasibility analysis of Christiano’s proposal for addressing the value-loading problem (Superintelligence, ch. 12).

How could one monitor and track AGI development nationally or globally?

Could one construct a cryptographic box for an untrusted autonomous system?

Produce improved measures of (substrate-independent) general intelligence. Build on the ideas of Legg, Yudkowsky, Goertzel, Hernandez-Orallo & Dowe, etc.

Investigate steep temporal discounting as an incentives control method for an untrusted AGI.

…
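To make the temporal discounting idea in the list above a bit more concrete, here is a minimal Python sketch of ordinary exponential discounting. It is only an illustration under my own assumptions: the function names, the discount factors, and the reward sizes are hypothetical, and nothing here is a worked-out control method. It simply shows why an agent with a steep discount rate would assign almost no present value to payoffs that arrive far in the future.

```python
import math


def discounted_value(reward: float, delay_steps: int, gamma: float) -> float:
    """Present value of `reward` received after `delay_steps` time steps,
    under exponential discounting with per-step factor `gamma`."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must be in (0, 1]")
    return reward * gamma ** delay_steps


def effective_horizon(gamma: float, threshold: float = 0.01) -> int:
    """Smallest number of steps n at which gamma**n drops to `threshold` or below.

    Beyond this horizon, a reward keeps at most `threshold` of its face value.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be strictly between 0 and 1")
    return math.ceil(math.log(threshold) / math.log(gamma))


if __name__ == "__main__":
    # Hypothetical numbers: a large payoff 10 steps away versus a payoff of 1 now.
    for gamma in (0.99, 0.5):
        delayed = discounted_value(1000.0, 10, gamma)
        print(f"gamma={gamma}: delayed payoff is worth {delayed:.4f} now; "
              f"horizon is {effective_horizon(gamma)} steps")
```

With the mild discount factor the delayed payoff still dominates an immediate payoff of 1, while with the steep one it does not. Whether that kind of myopia can actually be relied on in an untrusted AGI is exactly the open question the project would need to investigate.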

Psychology, Neuroscience, and Biology

How strongly does IQ predict rationality, metacognition, and philosophical sophistication, especially in the far right tail of the IQ distribution? Relevant to the interaction of intelligence amplification and FAI chances. See the project guide here.

Is the first functional WBE likely to be (1) an emulation of low-level functionality that doesn’t require much understanding of human cognitive neuroscience at the computational level, as described in Sandberg & Bostrom (2008), or is it more likely to be (2) an emulation that makes heavy use of advanced human cognitive neuroscience, as described by Ken Hayworth, or is it likely to be (3) something else?

Can we get WBE without producing neuromorphic AGI slightly earlier or shortly afterward? See section 3.2 of Eckersley & Sandberg (2013).

List some feasible but non-realized cognitive talents for humans, and explore what could be achieved if they were given to some humans. (See Superintelligence, ch. 3.)

What can we learn about AI takeoff dynamics by studying primate brain evolution? See Yudkowsky (2013).

How powerful is evolution? In what ways does it have its hands tied that human programmers aimed at general intelligence don’t? How much more efficient can we expect human researchers to be at finding general intelligence algorithms, compared to evolution? (See Superintelligence, ch. 2.)

Investigate the feasibility of emulation modulation solutions, based on currently known cognitive neuroscience.

Can a person’s willingness to cooperate with future generations be increased? Conduct follow-ups to e.g. Hauser et al. (2014).

…

Economics

Can endogenous growth theory or unified growth theory give us any insight into AI takeoff dynamics? See Yudkowsky (2013).

…

History, forecasting, and general social science

Do another GJP/SciCast-style forecasting tournament, but with 5-year and 10-year time horizons for predictions.

Did most early AI scientists really think AI was right around the corner, or was it just a few people? The earliest survey available (Michie 1973) suggests it may have been just a few people. For those that thought AI was right around the corner, how much did they think about the safety and ethical challenges? If they thought and talked about it substantially, why was there so little published on the subject? If they really didn’t think much about it, what does that imply about how seriously AI scientists will treat the safety and ethical challenges of AI in the future? Some relevant sources here.

TG-style studies of predictions from (1) The Futurist and World Future Review, (2) Technological Forecasting and Social Change, (3) Foresight and International Journal of Forecasting, (4) Journal of Forecasting, (5) publications of the Hudson Institute, (6) publications of the Institute for the Future, (7) publications of the Club of Rome, (8) Journal of Future Studies, (9) Ray Kurzweil (more thorough than section 5.4 here), (10) Alvin Toffler, (11) John Naisbitt, (12) the State of the World reports by the Worldwatch Institute, and/or (13) other sources.

What kinds of long-term forecasts are most accurate, by whom, and under what conditions?

Conduct a broad survey of past and current civilizational competence. In what ways, and under what conditions, do human civilizations show competence vs. incompetence? Which kinds of problems do they handle well or poorly? Similar in scope and ambition to, say, Perrow’s Normal Accidents and Sagan’s The Limits of Safety. The aim is to get some insight into the likelihood of our civilization handling various aspects of the superintelligence challenge well or poorly. Some initial steps were taken here and here.

Conduct a Delphi study of likely AGI impacts. Participants could be AI scientists, researchers who work on high-assurance software systems, and AGI theorists.

Is macro-structural acceleration net good or net bad for FAI chances? See “Rates of change and cognitive enhancement” in chapter 14 of Superintelligence, and also Yudkowsky’s “Do Earths with slower economic growth have a better chance at FAI?”

Build an improved AGI forecasting model à la The Uncertain Future. Decompose the AGI forecasting problem further, and update the program based on the latest analysis; perhaps build it on the organization of ideas in Superintelligence.

How scalable is innovative project secrecy? Examine past cases: Manhattan Project, Bletchley Park, Bitcoin, Anonymous, Stuxnet, Skunk Works, Phantom Works, Google X.

What is the world’s distribution of computation, and what are the trends? Initial steps taken here. See the project guide here.

Which aspects of information technology hardware and software have exhibited exponential price-performance trends in recent decades? Some notes and sources available here.

How networked will the world be in 2020, 2030, 2040?

How extensive and capable will robots be in 2020, 2030, 2040?

Scenario analysis: What are some concrete AI paths to influence over world affairs? See project guide here.

Is Bostrom (2009)’s “technological completion conjecture” true? If not, what are some predictable kinds of exceptions?

Produce an initial feasibility analysis of Bostrom’s “Hail Mary” approach to the value-loading problem (Superintelligence, ch. 12).

Analyze our “epistemic deference” situation: which problems do humans need to solve before we produce superintelligence, and which problems can be left to a properly designed superintelligence? (See Superintelligence, ch. 13.)

What is the overall current level of “state risk” from existential threats? (See Superintelligence, ch. 14.)

What are the major existential-threat “step risks” ahead of us, besides those from superintelligence? (See Superintelligence, ch. 14.)

What are some additional “technology couplings,” in addition to those named in Superintelligence, ch. 14?

What are some plausible “second-guessing arguments” with regard to superintelligence? (See Superintelligence, ch. 14.)

In practice, to what degree do human values and preferences converge upon learning new facts? To what degree has this happened in history? (Nobody values the will of Zeus anymore, presumably because we all learned the truth of Zeus’ non-existence. But perhaps such examples don’t tell us much.) See also philosophical analyses of the issue, e.g. Sobel (1999).

Do we gain any insight by modeling an intelligence explosion not with two parameters (as in Superintelligence, ch. 4) but with four parameters: recalcitrance, algorithms, information, and computational resources? (Dewey has done some thinking on this.)

List and examine some types of problems better solved by a speed superintelligence than by a collective superintelligence, and vice versa. Also, what are the returns on “more brains applied to the problem” (collective intelligence) for various problems? (See Superintelligence, ch. 3.)

What are the optimization power gains from mere content? What have people figured out without original theoretical advances or new experiments, but just by reading lots of known facts and putting together the pieces in a way that nobody had before?

What will be some major milestones for various kinds of people “taking AI seriously” in various ways? How did public perception respond to previous AI milestones? How will the public react to self-driving taxis? Etc. See the debates on this thread.

Provide more examples of decisive advantages: the ones in Superintelligence, ch. 5 are all pre-internet.

Examine other strategically significant technology races. When do actors not maximize EV given a decisive strategic advantage of some kind?

Examine international collaboration on major innovative technology. How often does it happen? What blocks it from happening more? What are the necessary conditions? Examples: Concorde jet, LHC, International Space Station, etc.

Signpost the future. Superintelligence explores many different ways the future might play out with regard to superintelligence, but cannot help being somewhat agnostic about which particular path the future will take. Come up with clear diagnostic signals that policy makers can use to gauge whether things are developing toward or away from one set of scenarios or another. If X does or does not happen by 2030, what does that suggest about the path we’re on? If Y ends up taking value A or B, what does that imply?

Which kinds of technological innovations produce public panic or outrage, under which conditions?

Which kinds of multipolar scenarios would predictably resolve into a singleton, and how quickly? See Superintelligence, ch. 11.

What happens when governments ban or restrict certain kinds of technological development? What happens when a certain kind of technological development is banned or restricted in one country but not in other countries where technological development sees heavy investment?

What kinds of innovative technology projects do governments monitor, shut down, or nationalize? How likely are major governments to monitor, shut down, or nationalize serious AGI projects?

Explore uploading FAI researchers as a potential solution. (See Salamon & Shulman, “Whole Brain Emulation, as a platform for creating safe AGI.”)

What is the construct validity of non-anthropomorphic intelligence measures? In other words, are there convergently instrumental prediction+planning algorithms? E.g. can one tend to get agents that are good at predicting economies but not astronomical events? Or do self-modifying agents in a competitive environment tend to converge toward a specific stable attractor in general intelligence space?

Sure, “any level of intelligence could in principle be combined with more or less any final goal,” but what kinds of general intelligences are plausible? Should we expect some correlation between level of intelligence and final goals in de novo AI? How true is this in humans, and in WBEs?

How quickly would different kinds of agents become optimizery? How strong is the ‘optimizer’ stable attractor? Are there other stable attractors? Are tool-ish or Oracle-ish things stable attractors?

What does the bargaining-with-a-future-superintelligence calculus look like for guaranteeing Earth, or our galaxy, or some other slice of the observable universe for humans rather than for the AGI?

Do approximately all final goals make an optimizer want to control a spatial region of linearly increasing radius?

If a kludge AI stumbles its way into strong self-modification and becomes a maximizer, would its goal function end up being as “alien” as a paperclip maximizer?

Are multipolar scenarios safer at all? One intuition for thinking so might be that “inaction is safer because it leaves us with status quo”. This intuition seems wrong but may have a little something to it: e.g. maybe you can use multipolar scenarios to formalize inaction (maybe there’s a Schelling point for multiple AIs that looks more like inaction than the fixed point for a singleton which is just paperclip everything). Secondly, if you just want a sliver of the universe, maybe multipolar outcomes are safer because maybe there’s at least one superintelligence who will give us a sliver.

How likely is it that AGI will be a surprise to most policy-makers and industry leaders? How much advance warning are they likely to have? Some notes on this here.

Copied from the AI Impacts list: “Look at the work of ancient or enlightenment mathematicians and control for possible selection effects in this analysis of historical mathematical conjectures.” This is relevant to questions of AGI development, AGI surprise, and AI takeoff speed.

Copied from the AI Impacts list: “Obtain a clearer picture of the extent to which historical developments in neuroscience have played a meaningful role in historical progress in AI.”

How much do the goals of powerful AI agents determine outcomes in a multipolar scenario? By analogy, it’s not clear that the goals of animal agents determine ecological or animal population outcomes as much as other dynamics in the ecological system do.

Robin Hanson is writing a book which explores a multipolar WBE scenario, starting with the assumptions that WBEs can’t be qualitatively modified from their human sources much, and that there’s a competitive market for WBEs. One could do a similar analysis on the possible consequences of widely available AGI software, starting with the assumption that AGI software is like all other software we know in terms of reliability, design, development time, synergies between different modules, etc.

Enumerate the risks unique to a multipolar scenario, more thoroughly than Bostrom does in Superintelligence.

Run several principal-agent problem analyses, but vary the assumptions as per different AI/WBE scenarios. E.g. if people with capital build agents, then how much of the future is controlled by the goals of those with capital compared to the goals of the created agents?

In a multipolar outcome, which things last as a result of initial conditions? City locations, standards, etc. can last a while. But what else?

When nuclear weapons arrived, experts first treated the strategic situation as the same as before but with bigger bombs. But some people thought this was a different situation, and they managed to convince others to treat this differently and to build new strategic analysis tools for this new strategic situation. There was no precedent for this at the time. How did they do that, and what can we learn from them about developing new strategic analysis tools for AI scenarios?

…
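One of the ideas above refers to the two-quantity model of an intelligence explosion in Superintelligence, ch. 4, where the rate of change in intelligence equals optimization power divided by recalcitrance. The toy simulation below is a rough sketch of that relation, not a forecast. Everything specific in it is my own illustrative assumption: the function name `simulate_takeoff`, the simple Euler stepping, the starting level, and the two functional forms (constant optimization power versus optimization power proportional to the system’s own intelligence, both with constant recalcitrance). A four-parameter version of the kind the project idea asks about would replace these two callables with something richer.

```python
from typing import Callable, List


def simulate_takeoff(
    optimization_power: Callable[[float], float],
    recalcitrance: Callable[[float], float],
    initial_level: float = 1.0,
    dt: float = 0.01,
    steps: int = 1000,
) -> List[float]:
    """Euler-integrate dI/dt = optimization_power(I) / recalcitrance(I).

    Returns the intelligence level at every step, starting with `initial_level`.
    """
    levels = [initial_level]
    for _ in range(steps):
        current = levels[-1]
        rate = optimization_power(current) / recalcitrance(current)
        levels.append(current + rate * dt)
    return levels


if __name__ == "__main__":
    # Assumed scenario A: outside effort only, so optimization power is constant.
    linear = simulate_takeoff(lambda i: 1.0, lambda i: 1.0)
    # Assumed scenario B: the system contributes effort in proportion to its level.
    compounding = simulate_takeoff(lambda i: i, lambda i: 1.0)
    print(f"constant optimization power: final level {linear[-1]:.2f}")
    print(f"self-improving system:       final level {compounding[-1]:.2f}")
```

Under these assumptions the first scenario grows linearly and the second roughly exponentially, which is the qualitative contrast the two-quantity framing is meant to capture.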

Philosophy

What are the optimal solutions to normative uncertainty under various conditions? See this interview with Will MacAskill.

Do we need to solve the paradoxes of population ethics (Arrhenius 2011) before we have superintelligence? If so, what’s the best solution?

Address various problems relating to infinite ethics.

Copied from Beckstead’s list: “What do currently known approaches to decision-making under moral uncertainty imply about the case for the overwhelming importance of shaping the far future?” Start by reading my interviews with Beckstead and MacAskill.

…

Other

How much of humanity’s cosmic endowment can we plausibly make productive use of given AGI? One way to explore this question is via various follow-ups to Armstrong & Sandberg (2013). Sandberg lists several potential follow-up studies in this interview, for example (1) get more precise measurements of the distribution of large particles in interstellar and intergalactic space, and (2) analyze how well different long-term storable energy sources scale. See Beckstead (2014).

Clarify what it would take for there to be perfectly loyal parts to an AGI, coordinating over millions of miles or light-years, etc. Distributed computing challenges in a vast space.

…

Acknowledgements

My thanks to Katja Grace and Amanda House for their help in preparing the elaborated project guides, and to the many people who contributed research project ideas to this list, including Nick Beckstead, Nick Bostrom, Paul Christiano, Daniel Dewey, Benja Fallenstein, Robin Hanson, Katja Grace, Louie Helm, Anna Salamon, Anders Sandberg, Carl Shulman, Qiaochu Yuan, Eliezer Yudkowsky, and probably several people whose contributions I’m forgetting. Additional project suggestions are welcome. Also see List of Multipolar Research Projects at AI Impacts, with which my own list has some overlap.