## Machine Intelligence Research Institute — General Support (2020)

Grant investigator: Committee for Effective Altruism Support

Open Philanthropy recommended a grant of $7,703,750 to the Machine Intelligence Research Institute (MIRI) for general support. MIRI plans to use these funds for ongoing research and activities related to reducing potential risks from advanced artificial intelligence, one of our focus areas. This follows our February 2019 support. While we see the basic pros and cons of this support similarly to what we’ve presented in past writeups on the matter, our ultimate grant figure was set by the aggregated judgments of our committee for effective altruism support, described in more detail here.

## Machine Intelligence Research Institute — General Support (2019)

Grant investigator: Committee for Effective Altruism Support

This page was reviewed but not written by members of the committee. MIRI staff also reviewed this page prior to publication.

The Open Philanthropy Project recommended a grant of $2,652,500 over two years to the Machine Intelligence Research Institute (MIRI) for general support. MIRI plans to use these funds for ongoing research and activities related to reducing potential risks from advanced artificial intelligence, one of our focus areas. Planned activities include alignment research, a summer fellows program, computer scientist workshops, and internship programs.

This grant supplements our three-year October 2017 support. While we see the basic pros and cons of this support similarly to what we’ve presented in past writeups on the matter, our ultimate grant figure was set by the aggregated judgments of our committee for effective altruism support, described in more detail here.

Update: In November 2019, we added funding to the original award amount. The “grant amount” above has been updated to reflect this.

## Machine Intelligence Research Institute — AI Safety Retraining Program

Grant investigator: Claire Zabel

This grant represents a renewal of and increase to our $500,000 grant recommendation to MIRI in 2016, which we made despite strong reservations about their research agenda, detailed here. In short, we saw value in MIRI’s work but decided not to recommend a larger grant at that time because we were unconvinced of the value of MIRI’s research approach to AI safety relative to other research directions, and also had difficulty evaluating the technical quality of their research output. Additionally, we felt a large grant might signal a stronger endorsement from us than was warranted at the time, particularly as we had not yet made many grants in this area.

Our decision to renew and increase MIRI’s funding sooner than expected was largely the result of the following:

• We received a very positive review of MIRI’s work on “logical induction” by a machine learning researcher who (i) is interested in AI safety, (ii) is rated as an outstanding researcher by at least one of our close advisors, and (iii) is generally regarded as outstanding by the ML community. As mentioned above, we previously had difficulty evaluating the technical quality of MIRI’s research, and we previously could find no one meeting criteria (i)–(iii) to a comparable extent who was comparably excited about MIRI’s technical research. While we would not generally offer a comparable grant to any lab on the basis of this consideration alone, we consider this a significant update in the context of the original case for the grant (especially MIRI’s thoughtfulness on this set of issues, value alignment with us, distinctive perspectives, and history of work in this area). While the balance of our technical advisors’ opinions and arguments still leaves us skeptical of the value of MIRI’s research, the case for the statement “MIRI’s research has a nontrivial chance of turning out to be extremely valuable (when taking into account how different it is from other research on AI safety)” appears much more robust than it did before we received this review.
• In the time since our initial grant to MIRI, we have recommended several more grants within this focus area, and are therefore less concerned that a larger grant will signal an outsized endorsement of MIRI’s approach.

We are now aiming to support about half of MIRI’s annual budget. MIRI expects to use these funds mostly toward salaries of MIRI researchers, research engineers, and support staff.

## Machine Intelligence Research Institute — General Support (2016)

We decided to write about this grant at some length, because we and others have found it challenging to assess MIRI’s work, and we believe there will be substantial interest – from donors and potential donors – in a frank assessment of it.

This page is a summary of the reasoning behind our decision to recommend the grant; it was reviewed but not written by the grant investigator(s). Machine Intelligence Research Institute staff reviewed this page prior to publication.

The Open Philanthropy Project recommended a grant of $500,000 to the Machine Intelligence Research Institute (MIRI), an organization doing technical research intended to reduce potential risks from advanced artificial intelligence. We made this grant despite our strong reservations about MIRI’s research, in light of other considerations detailed below.

We found MIRI’s work especially difficult to evaluate, so we set up a fairly extensive review process for five of MIRI’s best (according to MIRI) papers/results produced in 2015-2016. These papers/results were all concerned with MIRI’s “Agent Foundations” research agenda,1 which has been the primary focus of MIRI’s research so far. This process included reviews and extensive discussion by several of our technical advisors, who assessed both the research topics’ relevance to reducing potential risks and the pace of progress that had been made on the topics; we also commissioned reviews from eight academics to help inform the latter question.

Based on that review process, it seems to us that (i) MIRI has made relatively limited progress on the Agent Foundations research agenda so far, and (ii) this research agenda has little potential to decrease potential risks from advanced AI in comparison with other research directions that we would consider supporting. We view (ii) as particularly tentative, and some of our advisors thought that versions of MIRI’s research direction could have significant value if effectively pursued. In light of (i) and (ii), we elected not to recommend a grant of $1.5 million per year over the next two years, which would have closed much of MIRI’s funding gap and allowed it to hire 4-6 additional full-time researchers.

This page does not contain the details of our evaluation of MIRI’s Agent Foundations research agenda, only high-level takeaways. We plan to write more about the details in the future and incorporate that content into this page.

Despite our strong reservations about the technical research we reviewed, we felt that recommending $500,000 was appropriate for multiple reasons, including the following:

• We see our evaluation of MIRI’s research direction as uncertain, in light of the fact that MIRI was working on technical research around potential risks from advanced AI for many years while few others were, and it is difficult to find people who are clearly qualified to assess its work. If MIRI’s research is higher-potential than it currently seems to us, there could be great value in supporting MIRI, especially since it is likely to draw less funding from traditional sources than most other kinds of research we could support. We think this argument is especially important in light of the fact that we consider potential risks from advanced AI to be an outstanding cause, and that there are few people or organizations working on it full-time.
• We believe funding MIRI may increase the supply of technical people interested in potential risks from advanced AI and the diversity of problems and approaches considered by such researchers.
• We see a possibility that MIRI’s research could improve in the near future, particularly because some research staff are now pursuing a more machine learning-focused research agenda.2
• We believe that MIRI has had positive effects (independent of its technical research) in the past that would have been hard for us to predict, and has a good chance of doing so again in the future. For example, we believe MIRI was among the first to articulate the value alignment problem in great detail.
• MIRI constitutes a relatively “shovel-ready” opportunity to support work on potential risks from advanced AI because it is specifically focused on that set of issues and has room for more funding.
• There are a number of other considerations. In particular, senior staff members at MIRI spent a considerable amount of time participating in our review process, and we feel that a “participation grant” is warranted in this context. (This reasoning is only part of our thinking, and would not justify the full amount of the grant; however, note that we believe MIRI devoted several times as much staff time to our process as nonprofits typically do when they receive participation grants from GiveWell). Additionally, as we ramp up our involvement in the area of potential risks from advanced AI, we expect to ask for substantially more time from MIRI staff.

There is a strong chance we will renew this grant next year.

The judgments and decisions that use “we” language in this page primarily refer to the opinions of Nick Beckstead (Program Officer, Scientific Research), Daniel Dewey (Program Officer, Potential Risks from Advanced Artificial Intelligence), and Holden Karnofsky (Executive Director).

## 1. Background and process

This grant falls within our work on potential risks from advanced artificial intelligence (AI), one of our focus areas within global catastrophic risks. We wrote more about this cause on our blog.

#### 1.1 The organization

The Machine Intelligence Research Institute (MIRI) is a nonprofit working on computer science and mathematics research intended to reduce potential risks from advanced AI.

MIRI was founded in 2000 as the Singularity Institute for Artificial Intelligence (SIAI), with the mission to “help humanity prepare for the moment when machine intelligence exceeds human intelligence.” Our understanding is that for several years SIAI was primarily focused on articulating and communicating problems of AI safety, by writing public content, influencing intellectuals, and co-hosting the Singularity Summit.3 In 2013, SIAI changed its name to MIRI and shifted its primary focus to conducting technical research, pursuing a highly theoretical “Agent Foundations” research agenda.4 In May 2016, MIRI announced5 that it would be pursuing a machine learning research agenda6 alongside the original agenda.

Open Philanthropy Project staff have been engaging in informal conversations with MIRI for a number of years. These conversations contributed to our decision to investigate potential risks from advanced AI and eventually make it one of our focus areas. For more details on these early conversations, please refer to our shallow investigation.

We consider MIRI to be a part of the “effective altruism” community. It is not a part of mainstream academia. Because MIRI’s research priorities are unusual and its work does not always fall within any specific academic subfield, there are relatively few people who we feel clearly have the right context to evaluate its technical research. For this reason, we and others7 in the effective altruism community have found it challenging to assess MIRI’s impact.

#### 1.2 Our investigation process

Nick Beckstead, Program Officer for Scientific Research, was the primary investigator for this grant. Daniel Dewey, Program Officer for Potential Risks from Advanced Artificial Intelligence, also did a substantial amount of investigation for this grant, particularly in evaluating the quality of MIRI’s research agenda.

We attempted to assess MIRI’s research primarily through detailed reviews of individual technical papers. MIRI sent us five papers/results which it considered particularly noteworthy from the last 18 months:

1. Benya Fallenstein and Ramana Kumar. 2015. “Proof-Producing Reflection for HOL: With an Application to Model Polymorphism.” In Interactive Theorem Proving: 6th International Conference, ITP 2015, Nanjing, China, August 24-27, 2015, Proceedings. Springer.8
2. Vadim Kosoy. 2015. “Optimal Predictors: A Bayesian Notion of Approximation Algorithms.” Unpublished draft.
3. Scott Garrabrant, Benya Fallenstein, Abram Demski, and Nate Soares. 2016. “Inductive Coherence.” arXiv:1604.05288 [cs.AI]. Previously published as “Uniform Coherence.”9
4. Scott Garrabrant, Nate Soares, and Jessica Taylor. 2016. “Asymptotic Convergence in Online Learning with Unbounded Delays.” arXiv:1604.05280 [cs.LG].10
5. 2016. Unpublished result on “logical induction.”

Papers 1, 3, and 4 were completed works, Paper 2 was an unpublished work in progress, and Result 5 was an unpublished result that was presented in person. At our request, this selection was somewhat biased in favor of work by newer staff; we felt this would allow us to better assess whether a marginal new staff member would make valuable contributions. Additionally, older work might have included collaborations between MIRI and Paul Christiano, one of our technical advisors, and we wanted to minimize any confusion or conflicts of interest such collaborations could have caused.

All of the papers/results fell under a category MIRI calls “highly reliable agent design”. Four of them were concerned with “logical uncertainty” — the challenge of assigning “reasonable” subjective probabilities to logical statements that are too computationally expensive to formally verify. One was concerned with “reflective reasoning” — the challenge of designing a computer system that can reason “reliably” about computations similar or identical to its own computations.
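
To make the “logical uncertainty” idea concrete, here is a minimal sketch of ours (an illustration only, not MIRI’s construction and not the “logical induction” result): a bounded reasoner assigns a subjective probability to the claim “the 100th decimal digit of π is 7,” refines it with a cheap heuristic after partial computation, and settles it once the full computational cost is paid. The digit stream and the frequency-based update are illustrative choices of ours.

```python
from itertools import islice

def pi_digit_stream():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time,
    using Gibbons' unbounded spigot algorithm."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), 10 * (3 * q + r) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

CLAIM_INDEX, CLAIM_VALUE = 100, 7  # the claim: "decimal digit #100 of pi is 7"
decimals = list(islice(pi_digit_stream(), CLAIM_INDEX + 1))[1:]  # drop leading "3"

# No computation yet: by symmetry over the ten possible digits, P = 0.10.
print("0 digits computed:   P = 0.10")

# Partial computation: the first 50 decimals do not settle the claim, but the
# observed frequency of 7s gives a cheap heuristic update (illustrative only;
# a principled account of such updates is what this research area is after).
print(f"50 digits computed:  P = {decimals[:50].count(CLAIM_VALUE) / 50:.2f}")

# Full computation: the claim is now a settled logical fact, so the
# subjective probability collapses to 0 or 1.
print(f"100 digits computed: P = {float(decimals[CLAIM_INDEX - 1] == CLAIM_VALUE):.2f}")
```

The point of the toy example is only to show why the problem is nontrivial: the claim’s truth value is fully determined by logic, so ordinary probability theory (which assigns logically determined statements probability 0 or 1) offers no guidance on what the intermediate “reasonable” probabilities should be.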

Papers 1-4 were each reviewed in detail by two of four technical advisors (Paul Christiano, Jacob Steinhardt, Christopher Olah, and Dario Amodei). We also commissioned seven computer science professors and one graduate student with relevant expertise as external reviewers. Papers 2, 3, and 4 were reviewed by two external reviewers, while Paper 1 was reviewed by one external reviewer, as it was particularly difficult to find someone with the right background to evaluate it. Result 5 did not receive an external review because the result had not been written up and, at the time we were commissioning external reviews, MIRI asked us to keep the result confidential. However, the result was presented to Daniel Dewey and Paul Christiano, and they wrote reviews of the result for us. MIRI is now discussing the result publicly, though it has yet to be released as a finished paper.

We have made all external reviews of the published work (Papers 1, 3, and 4) public, although the reviewers remain anonymous. Of the four technical advisors named above, three provided permission to publish anonymized versions of their reviews of the published work. A consolidated document containing all public reviews can be found here.

In addition to these technical reviews, Daniel Dewey independently spent approximately 100 hours attempting to understand MIRI’s research agenda, in particular its relevance to the goals of creating safer and more reliable advanced AI. He had many conversations with MIRI staff members as a part of this process.

Once all the reviews were conducted, Nick, Daniel, Holden, and our technical advisors held a day-long meeting to discuss their impressions of the quality and relevance of MIRI’s research.

In addition to this review of MIRI’s research, Nick Beckstead spoke with MIRI staff about MIRI’s management practices, staffing, and budget needs.

## 2. Our impression of MIRI’s Agent Foundations research

While we are not confident we fully understand MIRI’s research, we currently have the impression that (i) MIRI has made relatively limited progress on the Agent Foundations research agenda so far, and (ii) this research agenda has limited potential to decrease potential risks from advanced AI in comparison with other research directions that we would consider supporting. We view (ii) as particularly tentative, and some of our advisors thought that versions of MIRI’s research direction could have significant value if effectively pursued. This page does not summarize the details of our reasoning on points (i) and (ii) or the details of our reasoning on the two questions listed below. We plan to write more about that in the future and add it to this page.

Through technical reviews and the subsequent discussion, we attempted to answer two key questions about MIRI’s research:

#### 2.1 How relevant is MIRI’s Agent Foundations research agenda?

Our technical advisors generally didn’t believe that solving the problems outlined in MIRI’s Agent Foundations research agenda11 would be crucial for reducing potential risks from advanced AI. Some felt that it could be beneficial to solve these problems, but that it would be difficult to make progress on them. There was a strong consensus that this work is especially unlikely to be useful if transformative AI is developed within the next 20 years through deep-learning methods. We did not thoroughly review MIRI’s second, machine learning-focused research agenda, because at the time of our investigation very little work had been done on it.

#### 2.2 How much progress has MIRI made on its Agent Foundations agenda?

We asked our technical advisors to help us get a sense for the overall aggregate productivity represented by these papers. One way of summarizing our impression of this conversation is that the total reviewed output is comparable to the output that might be expected of an intelligent but unsupervised graduate student over the course of 1-3 years. Our technical advisors felt that the distinction between supervised and unsupervised work was particularly important in this context, and that a supervised graduate student would be substantially more productive over that time frame.

#### 3.1 Budget and room for more funding

MIRI operates on a budget of approximately $2 million per year. At the time of our investigation, it had between $2.4 and $2.6 million in reserve. In 2015, MIRI’s expenses were $1.65 million, while its income was slightly lower, at $1.6 million. Its projected expenses for 2016 were $1.8-2 million. MIRI expected to receive $1.6-2 million in revenue for 2016, excluding our support.

Nate Soares, the Executive Director of MIRI, said that if MIRI were able to operate on a budget of $3-4 million per year and had two years of reserves, he would not spend additional time on fundraising. A budget of that size would pay for 9 core researchers, 4-8 supporting researchers, and staff for operations, fundraising, and security.
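
As a back-of-envelope plausibility check on these figures (our arithmetic, not Open Philanthropy’s; the operations headcount below is an assumption, since the writeup gives no number for it), the proposed budget divided by the implied headcount yields a fully loaded annual cost per person in a range one would expect for a Bay Area research nonprofit:

```python
# Back-of-envelope check: does a $3-4 million budget plausibly cover the
# stated staffing? (Our arithmetic; ops_staff is an assumed figure.)
budget_low, budget_high = 3_000_000, 4_000_000   # proposed annual budget ($)
core_researchers = 9                             # stated above
support_low, support_high = 4, 8                 # stated above
ops_staff = 4                                    # assumption: ops/fundraising/security

headcount_low = core_researchers + support_low + ops_staff    # 17 people
headcount_high = core_researchers + support_high + ops_staff  # 21 people

# Implied fully loaded (salary plus overhead) annual cost per person:
low = budget_low / headcount_high    # most staff, least money
high = budget_high / headcount_low   # fewest staff, most money
print(f"${low:,.0f} to ${high:,.0f} per person-year")
# -> about $143,000 to $235,000, a plausible fully loaded range for
#    Bay Area technical staff.
```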

## 4. Plans for follow-up

As of now, there is a strong chance that we will renew this grant next year. We believe that most of our important open questions and concerns are best assessed on a longer time frame, and we believe that recurring support will help MIRI plan for the future.

Two years from now, we are likely to do a more in-depth reassessment. In order to renew the grant at that point, we will likely need to see a stronger and easier-to-evaluate case for the relevance of the research we discuss above, and/or impressive results from the newer, machine learning-focused agenda, and/or new positive impact along some other dimension.
