The Open Philanthropy Project recommended a grant of $500,000 to the Machine Intelligence Research Institute (MIRI), an organization doing technical research intended to reduce potential risks from advanced artificial intelligence. We made this grant despite our strong reservations about MIRI’s research, in light of other considerations detailed below.

We found MIRI’s work especially difficult to evaluate, so we set up a fairly extensive review process for five of MIRI’s best (according to MIRI) papers/results produced in 2015-2016. These papers/results were all concerned with MIRI’s “Agent Foundations” research agenda, which has been the primary focus of MIRI’s research so far. This process included reviews and extensive discussion by several of our technical advisors, who assessed both the research topics’ relevance to reducing potential risks and the pace of progress that had been made on the topics; we also commissioned reviews from eight academics to help inform the latter assessment.

Based on that review process, it seems to us that (i) MIRI has made relatively limited progress on the Agent Foundations research agenda so far, and (ii) this research agenda has limited potential to decrease potential risks from advanced AI in comparison with other research directions that we would consider supporting. We view (ii) as particularly tentative, and some of our advisors thought that versions of MIRI’s research direction could have significant value if effectively pursued. In light of (i) and (ii), we elected not to recommend a grant of $1.5 million per year over the next two years, which would have closed much of MIRI’s funding gap and allowed it to hire 4-6 additional full-time researchers. This page does not contain the details of our evaluation of MIRI’s Agent Foundations research agenda, only high-level takeaways. We plan to write more about the details in the future and incorporate that content into this page.

Despite our strong reservations about the technical research we reviewed, we felt that recommending $500,000 was appropriate for multiple reasons, including the following:

We see our evaluation of MIRI’s research direction as uncertain, in light of the fact that MIRI was working on technical research around potential risks from advanced AI for many years while few others were, and it is difficult to find people who are clearly qualified to assess its work. If MIRI’s research is higher-potential than it currently seems to us, there could be great value in supporting MIRI, especially since it is likely to draw less funding from traditional sources than most other kinds of research we could support. We think this argument is especially important in light of the fact that we consider potential risks from advanced AI to be an outstanding cause, and that there are few people or organizations working on it full-time.

We believe funding MIRI may increase the supply of technical people interested in potential risks from advanced AI and the diversity of problems and approaches considered by such researchers.

We see a possibility that MIRI’s research could improve in the near future, particularly because some research staff are now pursuing a more machine learning-focused research agenda.

We believe that MIRI has had positive effects (independent of its technical research) in the past that would have been hard for us to predict, and has a good chance of doing so again in the future. For example, we believe MIRI was among the first to articulate the value alignment problem in great detail.

MIRI constitutes a relatively “shovel-ready” opportunity to support work on potential risks from advanced AI because it is specifically focused on that set of issues and has room for more funding.

There are a number of other considerations. In particular, senior staff members at MIRI spent a considerable amount of time participating in our review process, and we feel that a “participation grant” is warranted in this context. (This reasoning is only part of our thinking, and would not justify the full amount of the grant; however, note that we believe MIRI spent several times as much staff time on our process as nonprofits typically do when they receive participation grants from GiveWell.) Additionally, as we ramp up our involvement in the area of potential risks from advanced AI, we expect to ask for substantially more time from MIRI staff.

There is a strong chance we will renew this grant next year.

The judgments and decisions that use “we” language in this page primarily refer to the opinions of Nick Beckstead (Program Officer, Scientific Research), Daniel Dewey (Program Officer, Potential Risks from Advanced Artificial Intelligence), and Holden Karnofsky (Executive Director).

Background and process

This grant falls within our work on potential risks from advanced artificial intelligence (AI), one of our focus areas within global catastrophic risks. We wrote more about this cause on our blog.

The organization

The Machine Intelligence Research Institute (MIRI) is a nonprofit working on computer science and mathematics research intended to reduce potential risks from advanced AI.

MIRI was founded in 2000 as the Singularity Institute for Artificial Intelligence (SIAI), with the mission to “help humanity prepare for the moment when machine intelligence exceeds human intelligence.” Our understanding is that for several years SIAI was primarily focused on articulating and communicating problems of AI safety, by writing public content, influencing intellectuals, and co-hosting the Singularity Summit. In 2013, SIAI changed its name to MIRI and shifted its primary focus to conducting technical research, pursuing a highly theoretical “Agent Foundations” research agenda. In May 2016, MIRI announced that it would be pursuing a machine learning research agenda alongside the original agenda.

Open Philanthropy Project staff have been engaging in informal conversations with MIRI for a number of years. These conversations contributed to our decision to investigate potential risks from advanced AI and eventually make it one of our focus areas. For more details on these early conversations, please refer to our shallow investigation.

We consider MIRI to be a part of the “effective altruism” community. It is not a part of mainstream academia. Because MIRI’s research priorities are unusual and its work does not always fall within any specific academic subfield, there are relatively few people who we feel clearly have the right context to evaluate its technical research. For this reason, we and others in the effective altruism community have found it challenging to assess MIRI’s impact.

Our investigation process

Nick Beckstead, Program Officer for Scientific Research, was the primary investigator for this grant. Daniel Dewey, Program Officer for Potential Risks from Advanced Artificial Intelligence, also did a substantial amount of investigation for this grant, particularly in evaluating the quality of MIRI’s research agenda.

We attempted to assess MIRI’s research primarily through detailed reviews of individual technical papers. MIRI sent us five papers/results from the last 18 months that it considered particularly noteworthy.

Papers 1, 3, and 4 were completed works, Paper 2 was an unpublished work in progress, and Result 5 was an unpublished result that was presented in person. This selection was somewhat biased in favor of newer staff, at our request; we felt this would allow us to better assess whether a marginal new staff member would make valuable contributions. Additionally, older works may have included collaborations between MIRI and Paul Christiano, one of our technical advisors, and we wanted to minimize any confusion or conflicts of interest that such collaborations might have caused.

All of the papers/results fell under a category MIRI calls “highly reliable agent design”. Four of them were concerned with “logical uncertainty” — the challenge of assigning “reasonable” subjective probabilities to logical statements that are too computationally expensive to formally verify. One was concerned with “reflective reasoning” — the challenge of designing a computer system that can reason “reliably” about computations similar or identical to its own computations.
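To give a concrete flavor of what “logical uncertainty” means here: a resource-bounded reasoner may want to assign a sensible probability to a mathematical claim before it can afford to verify that claim. This sketch is not MIRI’s formalism; it is a loose, hypothetical illustration in which the claim is “n is prime,” the cheap prior comes from the prime number theorem, and an (assumed-available) probabilistic primality test later resolves the uncertainty. The function names are ours, not from any reviewed paper.

```python
import math
import random


def prior_prime_probability(n: int) -> float:
    """Cheap 'logical prior' that n is prime, before any real computation.

    The prime number theorem says primes near n have density ~1/ln(n);
    knowing only that n is odd doubles that density. This is a 'reasonable'
    subjective probability assigned without verifying the statement.
    """
    if n % 2 == 0:
        return 0.0
    return 2.0 / math.log(n)


def miller_rabin(n: int, rounds: int = 20) -> bool:
    """Probabilistic primality test: the (expensive) computation that
    collapses our logical uncertainty to ~0 or ~1."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True


n = 2**61 - 1  # a Mersenne prime; pretend we don't know that yet
p_before = prior_prime_probability(n)   # ~0.047: uncertain but "reasonable"
p_after = 1.0 if miller_rabin(n) else 0.0  # computation resolves the question
```

The point of the illustration is only the two-stage structure: a principled probability assigned under computational limits, later revised once the computation is actually run. MIRI’s work asks what norms such probability assignments should satisfy in general.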

Papers 1-4 were each reviewed in detail by two of four technical advisors (Paul Christiano, Jacob Steinhardt, Christopher Olah, and Dario Amodei). We also commissioned seven computer science professors and one graduate student with relevant expertise as external reviewers. Papers 2, 3, and 4 were reviewed by two external reviewers, while Paper 1 was reviewed by one external reviewer, as it was particularly difficult to find someone with the right background to evaluate it. Result 5 did not receive an external review because the result had not been written up and, at the time we were commissioning external reviews, MIRI asked us to keep the result confidential. However, the result was presented to Daniel Dewey and Paul Christiano, and they wrote reviews of the result for us. MIRI is now discussing the result publicly, though it has yet to be released as a finished paper.

We have made all external reviews of the published work (Papers 1, 3, and 4) public, although the reviewers were kept anonymous. Of the four technical advisors named above, three provided permission to publish anonymized versions of their reviews of the published work. A consolidated document containing all public reviews can be found here.

In addition to these technical reviews, Daniel Dewey independently spent approximately 100 hours attempting to understand MIRI’s research agenda, in particular its relevance to the goals of creating safer and more reliable advanced AI. He had many conversations with MIRI staff members as a part of this process.

Once all the reviews were conducted, Nick, Daniel, Holden, and our technical advisors held a day-long meeting to discuss their impressions of the quality and relevance of MIRI’s research.

In addition to this review of MIRI’s research, Nick Beckstead spoke with MIRI staff about MIRI’s management practices, staffing, and budget needs.

Our impression of MIRI’s Agent Foundations research

While we are not confident we fully understand MIRI’s research, we currently have the impression that (i) MIRI has made relatively limited progress on the Agent Foundations research agenda so far, and (ii) this research agenda has limited potential to decrease potential risks from advanced AI in comparison with other research directions that we would consider supporting. We view (ii) as particularly tentative, and some of our advisors thought that versions of MIRI’s research direction could have significant value if effectively pursued. This page does not summarize the details of our reasoning on points (i) and (ii) or the details of our reasoning on the two questions listed below. We plan to write more about that in the future and add it to this page.

Through technical reviews and the subsequent discussion, we attempted to answer two key questions about MIRI’s research:

How relevant is MIRI’s Agent Foundations research agenda?

Our technical advisors generally didn’t believe that solving the problems outlined in MIRI’s Agent Foundations research agenda would be crucial for reducing potential risks from advanced AI. Some felt that it could be beneficial to solve these problems, but that it would be difficult to make progress on them. There was a strong consensus that this work is especially unlikely to be useful in the case that transformative AI is developed within the next 20 years through deep-learning methods. We did not thoroughly review MIRI’s second, machine learning-focused research agenda, because at the time of our investigation very little work had been done on it.

How much progress has MIRI made on its Agent Foundations agenda?

Overall, both internal and external reviewers felt that the reviewed work was technically nontrivial, but unimpressive. (Paper 1, Fallenstein and Kumar 2015, was a notable exception. The reviewer of that paper suggested it was technically impressive work that a relatively limited number of computer scientists would be in a position to do. Paper 2 received very mixed reviews.) Note that MIRI predicted in advance that internal and external reviewers were relatively unlikely to be impressed by their work without substantially more back-and-forth than the reviewers had time for, and that this would likely be the case whether or not MIRI had made substantial progress. We did not see a compelling alternative way to address this question beyond the approaches we were already pursuing, such as speaking at length with MIRI staff members and other researchers concerned with AI safety. MIRI’s advance predictions about the review process can be found here, along with post-review comments by MIRI that were drafted after the completion of our grant decision process.

We asked our technical advisors to help us get a sense of the overall aggregate productivity represented by these papers. One way of summarizing our impression of this conversation is that the total reviewed output is comparable to the output that might be expected of an intelligent but unsupervised graduate student over the course of 1-3 years. Our technical advisors felt that the distinction between supervised and unsupervised work was particularly important in this context, and that a supervised graduate student would be substantially more productive over that time frame.

About the grant

Budget and room for more funding

MIRI operates on a budget of approximately $2 million per year. At the time of our investigation, it had between $2.4 and $2.6 million in reserve. In 2015, MIRI’s expenses were $1.65 million, while its income was slightly lower, at $1.6 million. Its projected expenses for 2016 were $1.8-2 million. MIRI expected to receive $1.6-2 million in revenue for 2016, excluding our support.

Nate Soares, the Executive Director of MIRI, said that if MIRI were able to operate on a budget of $3-4 million per year and had two years of reserves, he would not spend additional time on fundraising. A budget of that size would pay for 9 core researchers, 4-8 supporting researchers, and staff for operations, fundraising, and security.

Any additional money MIRI receives beyond that level of funding would be put into prizes for open technical questions in AI safety. MIRI has told us it would like to put $5 million into such prizes.

Case for the grant

If we had decided to pursue maximal growth for MIRI, we would have recommended a grant of approximately $1.5 million per year, and would likely have committed to two years of support. We decided against this option primarily because of our strong reservations about MIRI’s Agent Foundations research (see above). Despite this, we felt that recommending $500,000 was more appropriate than recommending no grant, for the following reasons:

Uncertainty about our technical assessment

We see our evaluation of MIRI’s research direction as uncertain, in light of the fact that MIRI was working on technical research around potential risks from advanced AI for many years while few others were, and it is difficult to find people who are clearly qualified to assess its work. We note that among our technical advisors, Paul Christiano has been thinking about relevant topics for the longest, and he tended to be the most optimistic about MIRI’s work. His relative optimism may come from having a deeper understanding of the value alignment problem and a better sense of how MIRI’s work may be able to address it.

If MIRI’s research is higher-potential than it currently seems to us, there could be great value in supporting MIRI, especially since it is likely to draw less funding from traditional sources than most other kinds of research we could support. We think this argument is especially important in light of the fact that we consider potential risks from advanced AI to be an outstanding cause, and that there are few people or organizations working on it full-time.

We expect that in the coming years, MIRI will have more opportunities to make the case for the value of its research, and general interest in relevant research will grow. We are unlikely to continue renewing support for MIRI 2-3 years from now if we do not see a stronger case that its research is valuable.

Increasing research supply and diversity

Overall, it seems to us that supporting MIRI will increase the total supply of technical researchers who are very thoughtful about safe implementation of AI, while generally not drawing those researchers away from institutions where (we would guess) safety researchers would be likely to have greater positive impact, such as top AI labs. (It remains to be seen what safety work will be produced at these labs. The previous comment is primarily based on our and our technical advisors’ intuitions that it seems most promising for safety research to be closely coupled with cutting-edge capabilities research.) We find it quite valuable to increase the total supply of researchers because we believe technical research on potential risks from advanced AI may be one of the most important and neglected causes we are aware of.

We believe MIRI’s approach and reasoning are highly unusual compared to those exhibited in other AI-safety-relevant work. Specifically, MIRI strikes us as assigning an unusually high probability to catastrophic accidents and as being pessimistic about the difficulty of implementing robust and general safety measures. We believe it is likely beneficial for some people in the field to be focused on understanding the ways standard approaches could go wrong, which may be something MIRI is especially well-suited to do. In general, it seems valuable to promote this kind of intellectual diversity in the field.

Potential for improvement

MIRI’s research program is still fairly new, since it only shifted to doing full-time technical research in 2013, and most of its research staff are somewhat recent hires. MIRI feels its most recent result (the unpublished Result 5) is more impressive than its older work, but we will have substantial uncertainty on this point until the work has been written up and we have had a chance to review it more thoroughly.

In May 2016, MIRI announced that it would be splitting its research program, with a significant fraction of its time spent on the design of safe systems descended from present-day approaches in machine learning. We are not sure what to expect on this front. MIRI may have less of a comparative advantage when doing work that overlaps more with standard machine learning research. However, there is a possibility that we would evaluate research following this new agenda more positively than research following its Agent Foundations agenda.

While we do not see a strong argument for the relevance and technical impressiveness of MIRI’s existing research, it is possible that such an argument will emerge over the next few years. If that happens, we would likely increase our support for MIRI.

Early articulation of the value alignment problem

We believe that MIRI played an important role in publicizing and sharpening the value alignment problem. This problem is described in the introduction to MIRI’s Agent Foundations technical agenda. We are aware of MIRI writing about this problem publicly and in-depth as early as 2001, at a time when we believe it received substantial attention from very few others. While MIRI was not the first to discuss potential risks from advanced artificial intelligence, we believe it was a relatively early and prominent promoter, and generally spoke at more length about specific issues such as the value alignment problem than more long-standing proponents.

There was a general consensus among our technical advisors and grant decision-makers that MIRI should receive some amount of support in recognition of these contributions.

Other considerations

MIRI has inspired, assisted, and/or incubated a number of other researchers and effective altruists who seem to us to be doing useful work. Additionally, two of our grantees, the Center for Applied Rationality and SPARC, received substantial initial support from MIRI. When considering these contributions, as well as MIRI’s early articulation of the value alignment problem, we think there is a case that MIRI will produce further positive impact in a way that is difficult to anticipate.

MIRI seems particularly well-aligned with our values, especially with respect to effective altruism.

MIRI constitutes a relatively “shovel-ready” opportunity to support work on potential risks from advanced AI because it is specifically focused on that set of issues and has room for more funding.

Senior staff members at MIRI spent a considerable amount of time participating in our review process, and we feel that a “participation grant” is warranted in this context. (This reasoning is only part of our thinking, and would not justify the full amount of the grant; however, note that we believe MIRI spent several times as much staff time on our process as nonprofits typically do when they receive participation grants from GiveWell.)

As we ramp up our involvement in the area of potential risks from advanced AI, we expect to ask for substantially more time from MIRI staff, both to get advice on pitfalls we might not otherwise consider, and to share and address some of our concerns about its activities (see next section).

Risks and reservations

Although we list multiple points in favor of MIRI above, MIRI’s core work is its technical research. We remain unconvinced of the value of this work despite putting a great deal of effort into trying to understand the case for it. This is a major reservation for us.

We are not confident that MIRI will add value pursuing its new, machine learning-focused research direction. It may have less of a comparative advantage when doing work that overlaps more with standard machine learning research.

We believe that MIRI often communicates in relatively idiosyncratic and undiplomatic ways, which can cause problems. We believe MIRI has communicated about the value-alignment problem in a way that has often caused mainstream AI researchers to be more dismissive of it, or reluctant to work on technical research to reduce potential risks from advanced AI for fear of being associated with MIRI and its ideas. We also feel there are instances of unprofessional behavior by MIRI staff that have posed similar risks. We see the potential for more negative impact along these lines in the future.

We are concerned that MIRI has only limited engagement with mainstream academia, and in particular very little hands-on experience with machine learning, with the exception of one recently-hired staff member (Jessica Taylor). We find this problematic because our intuition is that research into potential risks from advanced AI is likely to be most effective when closely coupled with cutting-edge machine learning research.

Size of the grant

We had difficulty deciding on a grant size. In light of the arguments both for and against MIRI’s contributions, we felt a case could be made for any figure between $0 and $1.5 million per year (the latter being enough that MIRI would no longer prioritize fundraising and would expand core staff as fast as possible, as discussed above). We ultimately settled on a figure that we feel will most accurately signal our attitude toward MIRI. We feel $500,000 per year is consistent with seeing substantial value in MIRI while not endorsing it to the point of meeting its full funding needs. This amount is similar to what we expect to recommend to higher-end (although not the highest-end) academic grantees in this space in the future. Note that this does not mean that we believe the value of MIRI’s research alone is equivalent to that of higher-end academics in the field; we think that MIRI’s other positive impact detailed above contributes substantially to its overall value.

Plans for follow-up

As of now, there is a strong chance that we will renew this grant next year. We believe that most of our important open questions and concerns are best assessed on a longer time frame, and we believe that recurring support will help MIRI plan for the future.

Two years from now, we are likely to do a more in-depth reassessment. In order to renew the grant at that point, we will likely need to see a stronger and easier-to-evaluate case for the relevance of the research we discuss above, and/or impressive results from the newer, machine learning-focused agenda, and/or new positive impact along some other dimension.

Sources