Our AI governance grantmaking so far

When the Soviet Union began to fracture in 1991, the world was forced to reckon with the first collapse of a nuclear superpower in history.1 The USSR was home to more than 27,000 nuclear weapons, more than one million citizens working at nuclear facilities, and over 600 metric tons of nuclear fissile materials.2 It seemed inevitable that some of these weapons, experts, and materials would end up in terrorist cells or hostile states,3 especially given a series of recent failed attempts at non-proliferation cooperation between the US and the USSR.4

Seeing the threat, the Carnegie and MacArthur foundations funded a Prevention of Proliferation Task Force, which (among other outputs) produced the influential report “Soviet Nuclear Fission: Control of the Nuclear Arsenal in a Disintegrating Soviet Union” by Ash Carter and others.5 Shortly before the report’s publication, the authors presented their findings to Senators Sam Nunn (D-GA) and Richard Lugar (R-IN) at a meeting arranged by the president of the Carnegie foundation.6 In later remarks, Nunn described the report as having an “astounding effect” on him and other Senators.7

Later that year, Nunn and Lugar introduced legislation (co-drafted with Carter and others8) to create the Cooperative Threat Reduction Program, also known as the Nunn-Lugar Act.9 The bill provided hundreds of millions of dollars in funding and scientific expertise to help former Soviet states decommission their nuclear stockpiles. As of 2013,10 the Nunn-Lugar Act had achieved the dismantling or elimination of 7,616 nuclear warheads, 926 ICBMs, and 498 ICBM sites. In addition to removing weapons, the program also attempted to ensure that remaining nuclear materials in the former USSR were appropriately secured and accounted for.11 In 2012, President Obama said that Nunn-Lugar was one of America’s “smartest and most successful national security programs,” having previously called it “one of the most important investments we could have made to protect ourselves from catastrophe.”12 President-Elect Joe Biden, a U.S. Senator at the time of Nunn-Lugar’s passage, called it “the most cost-effective national security expenditure in American history.”13

The Nunn-Lugar program is an example of how technology governance can have a very large impact, specifically by reducing global catastrophic risks from technology. Stories like this help inspire and inform our own grantmaking related to mitigating potential catastrophic risks from another (albeit very different) class of high-stakes technologies, namely some advanced artificial intelligence (AI) capabilities that will be fielded in the coming decades, and in particular from what we call “transformative AI” (more below).14

We have previously described some of our grantmaking priorities related to technical work on “AI alignment” (e.g. here), but we haven’t yet said much about our grantmaking related to AI governance. In this post, I aim to clarify our priorities in AI governance, and summarize our AI governance grantmaking so far.

Our priorities within AI governance

By AI governance we mean local and global norms, policies, laws, processes, politics, and institutions (not just governments) that will affect social outcomes from the development and deployment of AI systems. We aim to support work related to both AI governance research (to improve our collective understanding of how to achieve beneficial and effective AI governance) and AI governance practice and influence (to improve the odds that good governance ideas are actually implemented by companies, governments, and other actors).

Within the large tent of “AI governance,” we focus on work that we think may increase the odds of eventual good outcomes from “transformative AI,” especially by reducing potential catastrophic risks from transformative AI15 — regardless of whether that work is itself motivated by transformative AI concerns (see next section). By transformative AI, I mean software that has at least as profound an impact on the world’s trajectory as the Industrial Revolution did.16 Importantly, this is a much larger scale of impact than others seem to mean when discussing “transformative technologies” or a “4th industrial revolution,” but it also doesn’t assume technological developments as radical as “artificial general intelligence” or “machine superintelligence” (see here). Nor does it assume any particular AI architecture or suite of capabilities; it remains an open empirical question which architectures and capabilities would have such extreme (positive or negative) impact on society. For example, even a small set of AI systems with narrow and limited capabilities could — in theory, in a worst-case scenario — have industrial-revolution-scale (negative) impact if they were used to automate key parts of nuclear command and control in the U.S. and Russia, and this was the primary cause of an unintended large-scale nuclear war.17 (But this is only one example scenario and, one hopes, a very unlikely one.18)

Unfortunately, it’s difficult to know which “intermediate goals” we could pursue that, if achieved, would clearly increase the odds of eventual good outcomes from transformative AI. Would tighter regulation of AI technologies in the U.S. and Europe meaningfully reduce catastrophic risks, or would it increase them by (e.g.) privileging AI development in states that typically have lower safety standards and a less cooperative approach to technological development? Would broadly accelerating AI development increase the odds of good outcomes from transformative AI, e.g. because faster economic growth leads to more positive-sum political dynamics, or would it increase catastrophic risk, e.g. because it would leave less time to develop, test, and deploy the technical and governance solutions needed to successfully manage transformative AI? For those examples and many others, we are not just uncertain about whether pursuing a particular intermediate goal would turn out to be tractable — we are also uncertain about whether achieving the intermediate goal would be good or bad for society, in the long run. Such “sign uncertainty” can dramatically reduce the expected value of pursuing some particular goal,19 often enough for us to not prioritize that goal.20

As such, our AI governance grantmaking tends to focus on…

  • …research that may be especially helpful for learning how AI technologies may develop over time, which AI capabilities could have industrial-revolution-scale impact, and which intermediate goals would, if achieved, have a positive impact on transformative AI outcomes, e.g. via our grants to GovAI.
  • …research and advocacy supporting intermediate goals that we’ve come to think will improve expected transformative AI outcomes,21 such as more work on methods for gaining high assurance in advanced AI systems and greater awareness of the difficulty of achieving such high assurance, e.g. via our funding for Lohn (2020) and Flournoy et al. (2020).
  • …broad field-building activities, for example to identify and empower highly capable individuals with a passion for increasing the odds that transformative AI will result in long-lasting broad benefit, e.g. via scholarships, our support for career advice related to AI policy careers, and grantees such as GovAI.22
  • …better-informed AI governance training and advice for governments, companies, and other actors, especially on issues of likely relevance to transformative AI outcomes such as great power technology competition, e.g. via our grants to CSET and the Wilson Center.

In a footnote, I list all the grants we’ve made so far that were, at least in part, motivated by their hoped-for impact on AI governance.23

Example work I’ve found helpful

Our sense is that relatively few people who work on AI governance share our focus on improving likely outcomes from transformative AI, for understandable reasons: such issues are speculative, beyond the planning horizon of most actors, may be intractable until a later time, may be impossible to forecast even in broad strokes, etc.

Nevertheless, there has been substantial AI governance work that I suspect has increased the odds of good outcomes from transformative AI,24 regardless of whether that work was itself motivated by transformative AI concerns, or has any connection to Open Philanthropy funding. I list some examples below, in no particular order:

In the future, we hope to fund more work along these lines. As demonstrated by the examples above, some of the work we fund will involve explicit analysis of very long-run, potentially transformative impacts of AI, but much of the work we fund will be focused on more immediate, tractable issues of AI governance, so long as we are persuaded that the work has a decent chance of improving the odds of eventual good outcomes from transformative AI (and regardless of whether a given grantee has any interest in transformative AI).

Of course, we might never fund anything in AI governance as impactful as the work that led to the Nunn-Lugar Act, but per our commitment to hits-based giving, we are willing to take that risk given the scale of impact we expect from transformative AI.

  • 1. My thanks to Nathan Calvin for his help researching and drafting these opening paragraphs about the Nunn-Lugar Act.
  • 2. On Soviet nuclear stockpile numbers, see Carter et al (1991) at pp. i & 29. On Soviet citizens working at nuclear facilities, see Parker (2016):
    [Siegfried] Hecker and the rest of the Americans were deeply concerned about the one million-plus Russians who worked in nuclear facilities. Many faced severe financial pressure in an imploding society and thus constituted a huge potential security risk.

  • 3. See e.g. contemporary commentary from Carter et al (1991), p. i:
    Soviet nuclear command and control is at root a social and political creation. However successful its designers have been in insulating it from all the problems they could foresee, it cannot be assumed capable of standing apart from turmoil throughout the society within which it is embedded. And if even one hundredth of one percent of the nuclear weapons in the Soviet Stockpile falls into the wrong hands, destruction greater than the world has seen since Hiroshima and Nagasaki could result.

    Another example contemporary quote is from then Secretary of Defense Dick Cheney:

    If the Soviets do an excellent job at retaining control over their stockpile of nuclear weapons – let’s assume they’ve got 25,000 to 30,000; that’s a ballpark figure – and they are 99 percent successful, that would mean you could still have as many as 250 that they were not able to control.

  • 4. See e.g. the 1986 breakdown of negotiations between Gorbachev and Reagan at Reykjavik over disagreements about Reagan’s proposed Strategic Defense Initiative, and the US’s 1980 withdrawal from the SALT II treaty after the Soviet invasion of Afghanistan.
  • 5. Kohler (2007).
  • 6. Jones (2019), p. 28.
  • 7. From Nunn’s remarks at a 1995 White House Forum, discussing his role in Soviet nuclear disarmament:
    Then, in early November, Ash Carter gave his report on nuclear weapons security in the USSR, which I understand was financed by Carnegie… That report had an astounding effect. Dick Lugar and I got together. I knew that Dick had tremendous influence on the Republican side, tremendous influence in the Senate, and in the country. We really formed a partnership. Ash Carter presented his report to us. We then brought in other senators, and within about three to four weeks we had built a consensus.

  • 8. Carter and Perry (2000), pp. 71-72:
    Carter briefed the senators on the Harvard study. It turned out that Senator Nunn and Senator Lugar and their staff members, Robert Bell, Ken Myers, and Richard Combs, were working on a similar scheme for joint action. After the meeting broke up, Carter, Bell, Myers, and Combs stayed behind to draft what became known as the Nunn-Lugar legislation.

  • 9. Jones (2019), especially pp. 27-33.
  • 10. The Cooperative Threat Reduction Program was revised several times over the years, and expanded to engage other types of weapons and other states besides Russia (Congressional Research Service 2015).
  • 11. Nunn-Lugar Scorecard (2013).
  • 12. The earlier quote is from The Audacity of Hope (2006), p. 311:
    The premise of what came to be known as the Nunn-Lugar program was simple: after the fall of the Soviet Union, the biggest threat to the United States — aside from an accidental launch — wasn’t a first strike ordered by Gorbachev or Yeltsin, but the migration of nuclear material or know-how into the hands of terrorists and rogue states, a possible result of Russia’s economic tailspin, corruption in the military, the impoverishment of Russian scientists, and security and control systems that had fallen into disrepair. Under Nunn-Lugar, America basically provided the resources to fix up these systems, and although the program caused some consternation to those accustomed to Cold War thinking, it has proved to be one of the most important investments we could have made to protect ourselves from catastrophe.

  • 13. Jones (2019), p. 32.
  • 14. To be clear, I’m not using the Nunn-Lugar Act as anything more than an example of technology governance having a large impact by reducing a global catastrophic risk from technology. For example, I’m not using the Nunn-Lugar example to suggest that future AI risks are similar to the risk of “loose nukes,” nor that aggressive AI arms control measures should be an urgent priority.
  • 15. We focus on “transformative AI” (a term we introduced in a 2016 blog post) because our Potential Risks from Advanced AI focus area is part of our longtermism-motivated portfolio. For more on our reasons for this focus, see Potential Risks from Advanced Artificial Intelligence: The Philanthropic Opportunity. For more on longtermism, see e.g. Greaves & MacAskill (forthcoming); Ord (2020); Bostrom (2013). Because of this longtermist motivation, we focus on a subset of transformative AI scenarios that seems especially important from a longtermist perspective, for example (but not limited to) scenarios involving “prepotent AI” (Critch & Krueger 2020). However, in this blog post and elsewhere I often focus the discussion on “transformative AI” because this term is (I hope) more concrete than alternatives such as “AI systems of likely longtermist importance,” and because it helps to point readers in the direction of issues we focus on (i.e. those with extremely large stakes for the future of human civilization). Our priorities in the space overlap with, but aren’t identical to, those articulated in Dafoe (2020).
  • 16. This phrasing is due to my colleague Ajeya Cotra, and is adapted from the definition of transformative AI introduced by Holden Karnofsky here.
  • 17. On risks from the interaction of AI and nuclear arsenals, including the automation of some parts of nuclear command and control, see e.g. Boulanin et al. (2020); Geist & Lohn (2018); Horowitz et al. (2019).
  • 18. This example might also be unrepresentative of the kind of possible catastrophic risk from transformative AI that we are likely to focus on, for example if it is difficult to affect with philanthropic dollars, or if even large-scale nuclear war is unlikely to have much longtermist significance (see Ord 2020, ch. 4).
  • 19. For example, if we estimate that a $1 million grant has a 60% chance of having ~no impact and a 40% chance of creating +100 units of some social benefit, the expected value of the grant is (0.6×0)+(0.4×100) = 40 benefit units, for a return on investment (ROI) of one benefit unit per $25,000 spent. If instead we estimate that a $1 million grant has a 40% chance of having ~no impact, a 20% chance of creating -100 benefit units (i.e. a large harm), and (as with the other grant) a 40% chance of creating +100 benefit units, then even though we think the grant is twice as likely to create a large benefit as a large harm, the expected value of the grant is only (0.4×0)+(0.2×(-100))+(0.4×100) = 20 benefit units, for an ROI of one benefit unit per $50,000. In other words, our “hits-based giving” approach can accommodate more failure of the “no impact” variety than it can of the “negative impact” variety. (And to be clear, I’m not suggesting anything different from normal cost-benefit analysis.)
  • 20. That is, sign uncertainty can reduce the expected value of pursuing some particular goal below our threshold for how much benefit we hope to create on average per dollar spent. For more on our traditional “100x bar” for benefit produced per dollar, see GiveWell’s Top Charities Are (Increasingly) Hard to Beat, but also note that we are still thinking through what threshold to use for our longtermism-motivated grantmaking, per our current approach to “worldview diversification”; see here. The potential impact of sign uncertainty on expected value is universal, but I highlight it here because I have encountered sign uncertainty more commonly in our work on AI governance than in some other Open Philanthropy focus areas, for example in our grantmaking to machine learning researchers and engineers for technical work on AI alignment (though there can be some sign uncertainty for those grants too). For more on sign uncertainty in the context of attempts to do good cost-effectively, see e.g. Kokotajlo & Oprea (2020).
  • 21. We don’t have a concrete list of such intermediate goals at this time, because our expectations about the likely flow-through effects from various possible intermediate goals to transformative AI outcomes are still in a great deal of flux as we learn more.
  • 22. And more broadly, much of our work aims to build other assets for the field besides individuals, for example institutions, professional networks, credentialing methods, etc.
  • 23. Each grant below is linked to a page with more information about the size of the grant and its rationale. While some of our grants are entirely aimed at supporting AI governance work, many of our grants support a variety of work by a given grantee. For example, a grant might support both AI governance and (technical) AI alignment work (e.g. our grant to OpenAI), or it might support work on a variety of global catastrophic risks (e.g. our grants to FHI), with only some (typically unknown) portion of it supporting work on AI governance specifically. In the table below, I provide rough estimates about what fraction of each grant effectively supported AI governance work vs. other kinds of work at the grantee, but these are just guesses and don’t currently play any official role in our budgeting. The cutoffs of 35% and 90% were chosen for grant classification convenience.

    Our annual spending in this area fluctuates greatly from year to year, depending on how much staff time we’re able to devote to the area and especially on which opportunities happen to be available and discovered in a given year.

    Grant                                         How much for AI governance?
    MIT, for Thompson (2020)                      >90%
    CSIS (2020)                                   >90%
    CISAC at Stanford University (2020)           >90%
    RHGM (2020)                                   >90%
    CNAS, for Scharre (2020 #2)                   >90%
    CNAS, for Scharre (2020 #1)                   >90%
    FHI at Oxford University, for GovAI (2020)    >90%
    FHI at Oxford University, renewal (2020)      <35%
    World Economic Forum (2020)                   >90%
    Lohn (2020)                                   >90%
    Wilson Center, expansion (2020)               >90%
    Oxford University (2020)                      <35%
    80,000 Hours, renewal (2020)                  <35%
    Wilson Center, renewal (2020)                 >90%
    WestExec (2020)                               >90%
    RAND, for Lohn (2020)                         >90%
    Scholarships (2020)                           >90%
    CSET at Georgetown University (2019)          >90%
    80,000 Hours, renewal (2019)                  <35%
    Wilson Center (2018)                          >90%
    FHI at Oxford University, for Dafoe (2018)    >90%
    FHI at Oxford University, renewal (2018)      <35%
    CNAS, for Danzig, renewal (2018)              >90%
    AI Impacts, renewal (2018)                    35%-90%
    Future of Life Institute, renewal (2018)      <35%
    80,000 Hours (2018)                           <35%
    Yale University, for Dafoe (2017)             >90%
    UCLA, for Parson & Re (2017)                  >90%
    CNAS, for Danzig (2017)                       >90%
    OpenAI (2017)                                 35%-90%
    FHI at Oxford University (2017)               <35%
    Future of Life Institute, renewal (2017)      <35%
    AI Impacts (2016)                             35%-90%
    Electronic Frontier Foundation (2016)         35%-90%
    George Mason University, for Hanson (2016)    >90%
    Future of Life Institute, renewal (2016)      <35%
    Future of Life Institute (2015)               <35%
  • 24. Presumably by only a very small amount in each case, and typically with some remaining sign uncertainty.
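
The expected-value arithmetic in footnote 19 can be reproduced with a short sketch. The function and variable names below are illustrative assumptions rather than anything Open Philanthropy uses, and the probabilities and benefit units are simply the hypothetical figures from that footnote, not estimates for any real grant.

```python
# Illustrative sketch of the footnote 19 arithmetic. All names are hypothetical,
# and the inputs are the footnote's made-up figures, not real grant estimates.

def expected_value(outcomes):
    """Return the probability-weighted sum of benefit units.

    `outcomes` is a list of (probability, benefit_units) pairs whose
    probabilities must sum to 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * benefit for p, benefit in outcomes)


def dollars_per_benefit_unit(grant_dollars, ev):
    """Return the cost of one expected benefit unit (lower is better)."""
    if ev <= 0:
        raise ValueError("expected value must be positive to compute a cost per unit")
    return grant_dollars / ev


GRANT_DOLLARS = 1_000_000

# Grant A: 60% chance of ~no impact, 40% chance of +100 benefit units.
grant_a = [(0.6, 0), (0.4, 100)]

# Grant B: same 40% chance of +100, but half of the remaining probability
# mass is a large harm (-100) instead of ~no impact.
grant_b = [(0.4, 0), (0.2, -100), (0.4, 100)]

ev_a = expected_value(grant_a)
ev_b = expected_value(grant_b)
cost_a = dollars_per_benefit_unit(GRANT_DOLLARS, ev_a)
cost_b = dollars_per_benefit_unit(GRANT_DOLLARS, ev_b)

print(f"Grant A: EV = {ev_a:.0f} units, ${cost_a:,.0f} per unit")
print(f"Grant B: EV = {ev_b:.0f} units, ${cost_b:,.0f} per unit")
```

Under these assumed inputs, moving 20 percentage points of probability from "no impact" to "large harm" halves the expected value and doubles the cost per expected benefit unit, which is the sense in which sign uncertainty is costlier than ordinary risk of failure.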
