Philanthropy - especially hits-based philanthropy - is driven by a large number of judgment calls. At the Open Philanthropy Project, we’ve explicitly designed our process to put major weight on the views of individual leaders and program officers in decisions about the strategies we pursue, causes we prioritize, and grants we ultimately make. As such, we think it’s helpful for individual staff members to discuss major ways in which our personal thinking has changed, not only about particular causes and grants, but also about our background worldviews.

I recently wrote up a relatively detailed discussion of how my personal thinking has changed about three interrelated topics: (1) the importance of potential risks from advanced artificial intelligence, particularly the value alignment problem; (2) the potential of many of the ideas and people associated with the effective altruism community; (3) the properties to look for when assessing an idea or intervention, and in particular how much weight to put on metrics and “feedback loops” compared to other properties. My views on these subjects have changed fairly dramatically over the past several years, contributing to a significant shift in how we approach them as an organization.

I’ve posted my full writeup as a personal Google doc. A summary follows.

Changing my mind about potential risks from advanced artificial intelligence

I first encountered the idea of potential risks from advanced artificial intelligence - and in particular, the value alignment problem - in 2007. There were aspects of this idea I found intriguing, and aspects I felt didn’t make sense. The most important question, in my mind, was “Why are there no (or few) people with relevant-seeming expertise who seem concerned about the value alignment problem?”

I initially guessed that relevant experts had strong reasons for being unconcerned, and were simply not bothering to engage with people who argued for the importance of the risks in question. I believed that the tool-agent distinction was a strong candidate for such a reason. But as I got to know the AI and machine learning communities better, saw how Superintelligence was received, heard reports from the Future of Life Institute’s safety conference in Puerto Rico, and updated on a variety of other fronts, I changed my view.

I now believe that there simply is no mainstream academic or other field (as of today) that can be considered to be “the locus of relevant expertise” regarding potential risks from advanced AI. These risks involve a combination of technical and social considerations that don’t pertain directly to any recognizable near-term problems in the world, and aren’t naturally relevant to any particular branch of computer science. This is a major update for me: I’ve been very surprised that an issue so potentially important has, to date, commanded so little attention - and that the attention it has received has been significantly (though not exclusively) due to people in the effective altruism community.

More detail on this topic

Changing my mind about the effective altruism (EA) community

I’ve had a longstanding interest in the effective altruism community. I identify as part of this community, and I share some core values with it (in particular, the goal of doing as much good as possible). However, for a long time, I placed very limited weight on the views of a particular subset [1] of the people I encountered through this community. This was largely because they seemed to have a tendency toward reaching very unusual conclusions based on seemingly simple logic unaccompanied by deep investigation. I had the impression that they tended to be far more willing than I was to “accept extraordinary claims without extraordinary evidence” in some sense, a topic I’ve written about several times (here, here and here).

A number of things have changed.

  • Potential risks from advanced AI, discussed above, is one topic I’ve changed my mind about: I previously saw this as a strange preoccupation of the EA community, and now see it as a major case where the community was early to highlight an important issue.
  • More generally, I’ve seen the outputs from a good amount of cause selection work at the Open Philanthropy Project. I now believe that the preponderance of the causes that I’ve seen the most excitement about in the effective altruism community are outstanding by our criteria of importance, neglectedness and tractability. These causes include farm animal welfare and biosecurity and pandemic preparedness in addition to potential risks from advanced artificial intelligence. They aren’t the only outstanding causes we’ve identified, but overall, I’ve increased my estimate of how well excitement from the effective altruism community predicts what I will find promising after more investigation.
  • I’ve seen EA-focused organizations make progress on galvanizing interest in effective altruism and growing the community. I’ve seen some effects of this directly, including more attention, donors, and strong employee candidates for GiveWell and the Open Philanthropy Project.
  • I’ve gotten to know some community members better generally, and my views on some general topics (below) have changed in ways that have somewhat reduced my skepticism of the kinds of ideas effective altruists pursue.

I now feel the EA community contains the closest thing the Open Philanthropy Project has to a natural “peer group” - a set of people who consistently share our basic goal (doing as much good as possible), and therefore have the potential to help with that goal in a wide variety of ways, including both collaboration and critique. I also value other sorts of collaboration and critique, including from people who question the entire premise of doing as much good as possible, and can bring insights and abilities that we lack. But people who share our basic premises have a unique sort of usefulness as both collaborators and critics, and I’ve come to feel that the effective altruism community is the most logical place to find such people.

This isn’t to say I support the effective altruism community unreservedly; I have concerns and objections regarding many ideas associated with it and some of the specific people and organizations within it. But I’ve become more positive compared to my early impressions.

More detail on this topic

Changing my mind about general properties of promising ideas and interventions

Of the topics discussed here, this one is the hardest to trace the evolution of my thinking on, and the hardest to summarize.

I used to think one should be pessimistic about any intervention or idea that doesn’t involve helpful “feedback loops” (trying something, seeing how it goes, making small adjustments, and trying again many times) or useful selective processes (where many people try different ideas and interventions, and the ones that are successful in some tangible way become more prominent, powerful, and imitated). I was highly skeptical of attempts to make predictions and improve the world based primarily on logic and reflection, when unaccompanied by strong feedback loops and selective processes.

I still think these things (feedback loops, selective processes) are very powerful and desirable; that we should be more careful about interventions that don’t involve them; that there is a strong case for preferring charities (such as GiveWell’s top charities) that are relatively stronger in terms of these properties; and that much of the effective altruism community, including the people I’ve been most impressed by, continues to underweight these considerations. However, I have moderated significantly in my view. I now see a reasonable degree of hope for having strong positive impact while lacking these things, particularly when using logical, empirical, and scientific reasoning.

Learning about the history of philanthropy - and learning more about history more broadly - has been a major factor in changing my mind. I’ve come across many cases where a philanthropist, or someone else, seems to have had remarkable prescience and/or impact primarily through reasoning and reflection. Even accounting for survivorship bias, my impression is that these cases are frequent and major enough that it is worth trying to emulate this sort of impact. This change in viewpoint has both influenced and been influenced by the two topics discussed above.

More detail on this topic

Conclusion

Over the last several years, I have become more positive on the cause of potential risks from advanced AI, on the effective altruism community, and on the general prospects for changing the world through relatively speculative, long-term projects grounded largely in intellectual reasoning (sometimes including reasoning that leads to “wacky” ideas) rather than direct feedback mechanisms. These changes in my thinking have been driven by a number of factors, including by each other.

One cross-cutting theme is that I’ve become more interested in arguments with the general profile of “simple, logical argument with no clear flaws; has surprising and unusual implications; produces reflexive dissent and discomfort in many people.” I previously was very suspicious of arguments like this, and expected them not to hold up on investigation. However, I now think that arguments of this form are generally worth paying serious attention to until and unless flaws are uncovered, because they often represent positive innovations.

The changes discussed here have caused me to shift from being a skeptic of supporting work on potential risks from advanced AI and effective altruism organizations to being an advocate, which in turn has been a major factor in the Open Philanthropy Project’s taking on work in these areas. As discussed at the top of this post, I believe that sort of relationship between personal views and institutional priorities is appropriate given the work we’re doing.

I’m not certain that I’ve been correct to change my mind in the ways described here, and I still have a good deal of sympathy for people whose current views are closer to my former ones. But hopefully I have given a sense of where the changes have come from.

More detail is available here:

Some Key Ways in Which I’ve Changed My Mind Over the Last Several Years
  [1] This section focuses on the parts of the effective altruist community that I did not initially encounter as people donating to, or spreading the word about, GiveWell and its top charities.

Comments

One impression I get reading the case reports generated for the history of philanthropy project, along with taking a thousand yard view of the updates that you and OPP have made over time, is intervention at the process level. I.e. generate robust pipelines that will create a stream of potentially impactful projects and wait for the hits. Tweak the pipeline itself to make hits more likely based on the successes. A danger to such an approach is calcification of the pipeline, especially as the people who built it move on to other roles and the new maintenance overseers who lack a full stack causal understanding of the pipeline do not adjust it out of risk aversion. Another danger is selection effects. If your pipeline is known to be flush with cash, it will eventually start generating people and teams that exactly match whatever proxy measures you are using because they heavily optimized for this. This is a particular instantiation of Goodhart’s law that we might point to as the reason that foundation efficacy tends to decrease over time as it loses contact with reality.

Perhaps pipeline permutations can be baked into the pipeline DNA itself.

Another relevant contact point with reality that is potentially prone to decay is expert selection. Too exacting an expert selection process can lead to correlated blindspots, a la Swiss cheese theory. I noticed myself worrying about this as I read case report methodology that resulted in narratives based on the reports of principal actors in said narratives.

BTW have you considered having Soskis canvass the nuclear history landscape?

“the reason” should be “a reason” though the others are more obvious/well known variants of institutional capture. This, to me, points to epistemic rigor being upstream of impact to a greater degree than generally thought. It seems most donors/foundations do not consider the incentive structure they are exuding. Specifically, that the environment is willing to devote millions of dollars worth of optimization power to capturing your millions of dollars, and few have a good intuition of just what the end result of that will look like and can thus be on guard for it. There is a fundamental asymmetry here that is hard to overcome. A foundation handing out million dollar grants is spending X labor/hours on the grant, which is much lower than the number of labor/hours people are willing to spend to capture it (including the creativity difference). This could be framed as some variant of the offense/defense problem.

Romeo, thanks for the thoughts! A couple of reactions:

A. I’m not sure I’d accept that “foundation efficacy tends to decrease over time.” And if/to the extent this is the case, I think a couple of other potential factors jump to mind: declining opportunities (the more total public spending, the harder it is for one philanthropist to stand out), regression to the mean, and general challenges of succession (each generation inheriting restrictions/conventions from the previous one, and related issues).

B. I agree that we have to be sensitive to the incentives we’re creating, though I’m not sure this problem is as core/dominant as implied by your comment.

To be more explicit, I think an outside view factor analysis of foundation efficacy would be helpful for avoiding obvious failures.

WRT foundation efficacy over time: determining whether this is worth checking seems worthwhile. This could mean generating napkin math on how often foundations move away from established expertise areas when conditions in the world affecting those areas change. A framing compatible with hits based giving is “how often do foundations pivot?” If infrequently, we should be suspicious of their decision making/evidence gathering capabilities (and thus should want to avoid copying them), since the initial areas of interest should not generally be the areas that turn out to be most valuable on investigation.

Another framing that fits with hits based giving: YCombinator didn’t arise from anything like the standard VC blueprint. All of this dovetails well with reducing openness if transparency prevents doing weirder stuff than normal foundations. An example: almost no one except meta-research organizations produces explicit models to be shared. This is likely the correct move as object level models are likely to produce no useful feedback/peer review but highly likely to produce controversy/friction.

WRT how dominant a consideration: I guess I’m making it seem like I’m asserting causal models. Working on fixing this, but in the meantime epistemic status for these is *Potentially interesting lines of inquiry. Subject to large updates on shallow investigation, which is exactly why we should like to check*

Romeo, I don’t think I follow what you’re saying. I think changes in focus areas are relatively rare for foundations - partly because they build up expertise/connections/capacity, partly because there is often an interest in sticking with the benefactor’s legacy, and partly because conventional wisdom (which differs from our view) seems to be against being analytical and impact-oriented in choosing focus areas. But I’m not sure what confirming this would tell us with respect to other issues discussed here.

If OpenPhil’s methods become highly correlated with the reference class of grant giving foundations, I also expect its failure modes to become highly correlated with them.

Thanks for this, Holden. I appreciate you sharing your thoughts, and the links.
