Note: this post discusses a number of technical and philosophical questions that might influence our overall grantmaking strategy. It is primarily aimed at researchers, and may be obscure to most of our audience.
We are dedicated to learning how to give as well as possible. Thus far, we’ve studied the history of philanthropy, adopted an overall approach we call “strategic cause selection,” chosen three criteria and used them to select some initial focus areas, embraced hits-based giving, and learned many notable lessons about effective giving. These and other judgment calls are subject to revision, but overall we feel reasonably happy about these big-picture choices and “lessons learned.”
However, we also feel that we have many other things left to learn about how to give as well as possible — not just about the details relevant to current and potential focus areas, but also about how we should think about certain “fundamental questions” that could greatly affect our overall approach to giving and our choice of focus areas.
This post briefly explains some of these fundamental questions, especially those which seem to us like they are relatively neglected, and also potentially tractable to scientists and scholars with relevant expertise and interest.1 At the end of each section, I list some example questions we’d like to see examined in some depth.
If readers know of helpful existing literature on these questions, or of ways that we could support further work, we’d appreciate their letting us know via the comment section or by contacting us directly.
Cross-cause cost-effectiveness comparisons
We now recommend grants in focus areas as diverse as criminal justice reform, farm animal welfare, and potential risks from advanced artificial intelligence. Given limited resources, how should we compare the expected value of a dollar spent in one of these areas vs. another?
GiveWell also faces such questions about its top charities, even when comparing between charities within a single cause (global health). For example, the Against Malaria Foundation primarily saves lives, while other top charities primarily improve individuals’ health, increase their income, or do a mix of both. How should one compare the value of saving lives, improving health, and increasing income?2
GiveWell’s current solution for comparing income increases to health improvements is to ask its staff members to make ethical judgments comparing the value of an increase in one’s income to the value of an extra year of healthy life.3 The median staff judgment is then used (along with other variables) to calculate final cost-effectiveness estimates.
Comparing health improvements or income increases to lives saved is arguably even more philosophically contentious. To compare improved health to lives saved, GiveWell asks its staff members to make ethical judgments comparing the value of an extra year of healthy life to the death of a young child.4 The median staff judgment is then used to calculate final cost-effectiveness estimates. Meanwhile, to compare increased income to lives saved, GiveWell first converts income increases into equivalent health improvements (using the median answer to the previous prompt), and then converts health improvements to lives saved (using the median answer to the prompt about an extra year of healthy life vs. the death of a young child).
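As a rough illustration of the two-step conversion just described, here is a minimal Python sketch. The staff answers are made-up placeholders rather than GiveWell's actual data, and the function name is hypothetical. Only the structure comes from the text: take the median staff judgment for each prompt, convert income gains to healthy life-years, then convert healthy life-years to young-child deaths averted.

```python
from statistics import median

# Hypothetical staff answers (placeholders, NOT GiveWell's actual data).
# Prompt A: "1 DALY is equivalent to increasing ln(income) by one unit
# for how many years?"
years_of_ln_income_per_daly = [3, 3, 4, 5, 3]
# Prompt B: "DALYs per death of a young child averted."
dalys_per_child_death = [3, 10, 30, 36, 40]


def income_gain_in_deaths_averted(ln_income_unit_years, answers_a, answers_b):
    """Express an income gain as equivalent young-child deaths averted.

    ln_income_unit_years: total benefit measured as "one-unit increases in
    ln(income), sustained for one year" (e.g. 90 such unit-years).
    """
    # Step 1: income -> health, using the median answer to prompt A.
    dalys = ln_income_unit_years / median(answers_a)
    # Step 2: health -> lives, using the median answer to prompt B.
    return dalys / median(answers_b)


equivalent_deaths = income_gain_in_deaths_averted(
    90, years_of_ln_income_per_daly, dalys_per_child_death
)
print(equivalent_deaths)  # 90 / 3 = 30 DALYs; 30 / 30 = 1.0
```

Because both steps rely on a median, the result is insensitive to how extreme the outlying staff judgments are, but it can shift sharply if a single answer near the middle of the distribution changes.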
Now for an Open Philanthropy Project example. To compare the cost-effectiveness of a grant to the Center for Global Development (CGD) to the cost-effectiveness of marginal dollars spent on GiveWell top charity GiveDirectly,5 we modeled the expected impact of CGD’s work in terms of additional dollars sent to the global poor (as a result of CGD’s work),6 since GiveDirectly also transfers dollars to the global poor.
However, most cross-cause comparisons we’d like to make are less straightforward than this. In some strict sense, such diverse grantmaking opportunities may be incommensurable. But it remains the case that we have limited resources, and we see excellent giving opportunities across a wide range of causes. So, we want a solution for cross-cause cost-effectiveness comparisons that, while not perfect, is “good enough” to usefully guide our giving.
We suspect there is room to improve GiveWell’s current method for comparing health improvements, income increases, and lives saved. But there is perhaps even more room for improvements to our understanding of how to make comparisons across the Open Philanthropy Project’s diverse focus areas. Currently, our way of comparing opportunities across focus areas is very intuitive and heuristic; it could likely benefit from improved rigor and from stronger theoretical foundations.
Here are some example questions about cross-cause cost-effectiveness comparisons we’d like to see examined in more depth (either via original research, critical analysis of the often significant existing literature, or a mix):
- How should we think about the “disability paradox” (Schroeder 2016)?
- How should we think about age-weighting?
- How should we capture and reflect a range of ethical judgment calls when making cross-cause cost-effectiveness comparisons?
- One component of the cost-effectiveness analyses for some grants (e.g. in support of a campaign to close Rikers Island) includes the value of saving taxpayer dollars. Using a logarithmic model of income and subjective well-being (see e.g. Sacks et al. 2013), one might start with the assumption that dollars given to Americans are 100 times less valuable than dollars given to those with 1/100th of the income. However, there are many complications to this model one might examine, for example the fact that a small portion of the U.S. federal budget is spent on science, which benefits not just Americans but everyone.
- How convincing are the usual arguments for the logarithmic model of income and subjective well-being? Is there a better-justified and equally tractable model available?
- How should we compare the death or suffering of humans to the death or suffering of various non-human species? (This overlaps with questions about moral patienthood; see below.)
- In general, what are some plausible strategies for comparing the cost-effectiveness of small, hard-to-measure reductions in certain kinds of catastrophic risk to the cost-effectiveness of other kinds of grants? What are their pros and cons?
- We expect to make many grants funding scientific research. How should we think about the expected value of different kinds of scientific research? How have people estimated the long-term human benefit of past investments in scientific research? What kinds of new estimates could be conducted?
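To make the logarithmic-model assumption from the list above concrete, here is a minimal Python sketch. It assumes well-being is exactly ln(income), which is a simplification, and the two income figures are hypothetical values chosen only to give a 100:1 ratio. Under that assumption the marginal value of a dollar is 1/income, which is where the "100 times less valuable" starting point comes from.

```python
import math


def marginal_value_ratio(income_poor, income_rich):
    """Ratio of marginal utilities under u(c) = ln(c), where u'(c) = 1/c."""
    return (1 / income_poor) / (1 / income_rich)


def utility_gain(income, transfer):
    """Change in ln(income) from receiving `transfer` dollars."""
    return math.log1p(transfer / income)


# Hypothetical annual incomes with a 100:1 ratio (illustrative only).
rich_income = 50_000.0
poor_income = 500.0

ratio_at_margin = marginal_value_ratio(poor_income, rich_income)
ratio_for_one_dollar = utility_gain(poor_income, 1) / utility_gain(rich_income, 1)

print(ratio_at_margin)       # approximately 100
print(ratio_for_one_dollar)  # slightly below 100, since the gain is not infinitesimal
```

The second ratio falls further below 100 as the transfer grows relative to the recipient's income, which is one reason the "100 times" figure is only a starting assumption.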
Making decisions under different kinds of uncertainty
Consider several different kinds of uncertainty:
1. I’m 50% confident that a coin (which I know to be a fair coin) will land on heads.
2. Based on his track record of showing up at parties like this, I’m 50% confident my friend Matt will show up at the party on Friday. I could also call Matt and ask him some questions to update my level of confidence.
3. Based on many different kinds of evidence, plus my gut intuitions, I’m ≥10% confident that “transformative AI” will be created within 20 years. I don’t know what else I could do to significantly further narrow my confidence intervals on this question, though of course things beyond my control (e.g. a sudden acceleration or stall in the field’s progress as a whole) could cause me to update my level of confidence.
4. I’m not clear what I mean by “consciousness,” and I barely know what kind of evidence I should think is relevant to the question of whether a certain animal is “conscious” (and therefore likely worthy of moral consideration), but I notice that if I was asked to bet whether it will turn out to be the case that chimpanzees are “conscious” in roughly the way I’m intuitively thinking about consciousness today, and I somehow knew the bet would be definitively resolved, I would take bets consistent with my believing there’s a greater than 85% chance that chimpanzees are “conscious.”
5. Pascal’s Mugging: Suppose someone tells me that if I give them five dollars, they will use their magic powers (or their contacts with an advanced alien civilization) to benefit trillions of beings throughout the observable universe. Even if I assume that this claim is 99.99999% likely to be wrong, the expected value is still high. Can I really justify being 99.99999% confident the claim is wrong? It wouldn’t be the first time someone was grossly wrong about the basic structure and affordances of reality.
The theory of decision-making under uncertainty of type (1) — sometimes called “risk” — is well-understood and widely considered to be solved by expected value maximization. But is expected value maximization the right way to think about decision-making in the other cases? We’ve wrestled with this question before,7 but we don’t feel that we yet have a fully satisfactory answer.
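To see why naive expected value maximization looks troubling in the Pascal’s Mugging case, here is a minimal Python sketch of the arithmetic. The $5 cost and the 99.99999% figure come from the example above. Reading “trillions of beings” as exactly one trillion, counting each being benefited as one unit of value, and the benchmark of one being per dollar are all simplifying assumptions made only for illustration.

```python
from fractions import Fraction

# Figures from the example: a $5 payment, 99.99999% chance the claim is false.
COST_DOLLARS = 5
P_CLAIM_FALSE = Fraction("0.9999999")

# Assumption: "trillions of beings" is read as exactly one trillion,
# and each being benefited counts as one unit of value.
BEINGS_IF_TRUE = 10**12


def expected_beings_benefited(p_false, beings_if_true):
    """Naive expected value: probability the claim is true times the payoff."""
    return (1 - p_false) * beings_if_true


def break_even_probability(benchmark_per_dollar, cost, beings_if_true):
    """Probability of truth at which the offer merely ties a benchmark option.

    benchmark_per_dollar is a hypothetical alternative use of the money,
    measured in beings benefited per dollar.
    """
    return Fraction(benchmark_per_dollar) * cost / beings_if_true


ev = expected_beings_benefited(P_CLAIM_FALSE, BEINGS_IF_TRUE)
ev_per_dollar = ev / COST_DOLLARS
p_break_even = break_even_probability(1, COST_DOLLARS, BEINGS_IF_TRUE)

print(ev)             # 100000
print(ev_per_dollar)  # 20000
print(p_break_even)   # 1/200000000000
```

Under these assumptions, rejecting the offer on expected value grounds alone would require being far more than 99.99999% confident that the claim is false, which is exactly the difficulty the example raises.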
Our question here is methodological rather than simply philosophical. Philosophically we are fairly comfortable with a broadly Bayesian framework, in which all relevant uncertainty can be captured by the right set of conditional probabilities. We worry, though, that the practical methodology of simply introspecting one’s subjective probabilities breaks down (and fails to capture all the relevant information in our heads) in situations such as (3) through (5) above.
We plan to write more about this in the future. For now, here are some example questions about decision-making under different kinds of uncertainty that we’d like to see examined in more depth:8
- What are some key dimensions along which uncertainty can vary? For example: expected stability of the estimate over time, ability of the estimator to narrow their confidence interval through their own action, whether the estimator has strong expectations about their own level of probability calibration on questions of the presently relevant reference class, degree of model uncertainty, and so on.
- How should we methodologically handle these different kinds of uncertainty, if at all?
- Is there a single set of principles that gives “sensible” answers in all the cases above, including Pascal’s Mugging?
- Is there a reasonable methodology for making decisions given uncertainty about which decision theory is correct?
- How should we think about “cluelessness” (especially about the long-run consequences of options), and what can be done about it?9
Worldview diversification
As explained in an earlier blog post, our giving takes a “worldview diversification” approach, which means we put significant resources behind each worldview that we find highly plausible. For example, our work on farm animal welfare is premised on a worldview according to which at least some non-human animals are worthy of substantial moral consideration.
However, as described in that post, our current approach to worldview diversification is largely intuitive and fairly “rough,” and we welcome ideas for how to practice worldview diversification in a more principled, systematic way.
Here are some example questions about worldview diversification we’d like to see examined in more depth:
- How reasonable is our case for worldview diversification? What are the most important considerations that post does not consider?
- Is there a formal or semi-formal framework that would improve our (currently highly intuitive) method for implementing worldview diversification?10
- What is the most reasonable approach to dealing with moral uncertainty (MacAskill 2014)?
Philanthropic coordination theory
GiveWell has written previously about the “giver’s dilemma”:
Imagine that two donors, Alice and Bob, are both considering supporting a charity whose room for more funding is $X, and each is willing to give the full $X to close that gap. If Alice finds out about Bob’s plans, her incentive is to give nothing to the charity, since she knows Bob will fill its funding gap. Conversely, if Bob finds out about Alice’s funding plans, his incentive is to give nothing to the charity and perhaps support another instead. This creates a problematic situation in which neither Alice nor Bob has the incentive to be honest with the other about his/her giving plans and preferences — and each has the incentive to try to wait out the other’s decision.
We’ve also discussed three types of approaches to the giver’s dilemma: “funging” approaches, “matching” approaches, and “splitting” approaches (for explanations, see here).
The best solution for philanthropic coordination can vary depending on several factors, for example whether we are trying to coordinate with a small number of other donors (e.g. one or two other foundations who work in one of our focus areas) or a large number of other donors (e.g. all those who give to GiveWell top charities each year). We have some tentative ideas about how to deal with these coordination issues in some cases, but we are not very confident our current ideas are best.
According to economist S. Nageeb Ali (see here), little to no academic research has addressed these issues of philanthropic coordination, and we agree with him that studying these issues further could be productive.
Here are some example questions about philanthropic coordination theory we’d like to see examined in more depth:
- If we are considering funding a project that we suspect another similarly-sized funder would also be willing to fund, what are some possible approaches we can take to negotiating with them, and what are the pros and cons of each approach? How does the optimal behavior change with the degree to which the other funder (a) is of similar / different size compared to us; (b) has similar / different values to us; and (c) is easier / harder for us to communicate with?
- How do these principles extend to times when we are dealing with a diffuse set of donors that we lack key information about (such as the total size of the group, value alignment, etc.)?
- We typically avoid situations in which we provide >50% of an organization’s funding, so as to avoid creating a situation in which an organization’s total funding is “fragile” as a result of being overly dependent on us. To avoid such situations, one approach we’ve sometimes taken is to fill the organization’s funding gap up to the point where we are matching all their other donors combined. But we have several concerns about this strategy. For example, does this strategy create highly problematic incentives for other donors? Does it lead to a situation in which some of the organization’s donors should “wait us out” to make the organization’s funding gap appear larger than it otherwise would, while others should “front-run us” to make our room for matching other donors seem larger?
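The matching rule described in the last question above can be sketched in a few lines of Python. This is one possible reading of the rule, not a statement of our actual grantmaking procedure. The function name, the budget figure, and the donor totals are hypothetical: we fill the remaining gap, but never give more than all other donors combined, so our share never exceeds 50%.

```python
def our_grant(budget, others_total):
    """Fill the funding gap, capped at the total given by all other donors."""
    gap = max(budget - others_total, 0)
    return min(gap, others_total)


BUDGET = 1_000_000  # hypothetical organizational budget

# Hypothetical totals given by all other donors combined.
for others in (100_000, 400_000, 600_000, 900_000):
    grant = our_grant(BUDGET, others)
    print(others, grant, others + grant)
```

In this sketch, when other donors give less than half the budget, each extra dollar they give unlocks another dollar from us, which is one way to read the incentive to “front-run us.” When they give more than half, each extra dollar they give reduces our grant by a dollar, which is one way to read the incentive to “wait us out.”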
Moral patienthood and moral weight
One of the key questions we ask when choosing focus areas or grants is: “Per dollar spent, how many individuals could our funding benefit, and how much might it benefit them?” However, this raises a further question: Which types of beings merit moral concern? Or, to phrase the question as some philosophers do, “Which things are ‘moral patients’,11 and what are the dimensions along which moral patients can be benefited?”12 (See also our blog post on radical empathy.)
To illustrate: our work on farm animal welfare is premised on the view that at least some animals are moral patients. But which animals are moral patients, and how should we weigh the death or suffering of one kind of animal against that of another?
I am currently preparing a report summarizing some of my early thinking on moral patienthood, but my initial findings do not come close to “settling” the issue to our satisfaction, and my report does not examine the further issue of “moral weight.” We think additional work on these questions could be valuable. Some conversations from this investigation have already been published, and provide some sense of how I am thinking about the relevant issues. My report on moral patienthood will also include a list of questions I’d like to see examined in further depth.
- 1. There are many other fundamental questions that in principle could impact our work, but which are not elaborated here due to our current intuitions about (1) how much “room” there is for them to affect our decision-making in a practical and substantial way, given our current understanding of the issue and how confident we are in that understanding, (2) how tractable they seem given the methods we can think of for investigating them, and/or (3) how neglected they seem to be. However, in most cases we have not investigated these questions deeply, and our intuitions about them could easily be wrong. Fundamental questions we considered, but decided not to elaborate at this time, due to our intuitions about (1)-(3) for each question, include: how to better model the long-term indirect effects of interventions, additional debate about theories in normative ethics (e.g. consequentialism vs. deontological ethics vs. virtue ethics vs. contractualism), the likely value of the long-term future, population ethics, infinite ethics, and anthropic reasoning.
- 2. See also S. Andrew Schroeder’s review of related issues in “Value Choices in Summary Measures of Population Health” (2016), GiveWell’s notes from a conversation with S. Andrew Schroeder, and GiveWell’s blog post “AMF and Population Ethics.”
- 3. Specifically, cell F10 of the worksheets each staff member fills out has the following prompt: “1 DALY [disability-adjusted life year] is equivalent to increasing ln(income) by one unit for how many years?”
In the 2016 cost-effectiveness analysis spreadsheet provided here, you can see (on the “Summary” spreadsheet, row AD) that staff members’ answers to the prompt above ranged from 3 to 5, with a median of 3.
The use of ln(income) is also debatable, but fits with some common interpretations of the available data concerning the relationship between income and subjective well-being: see e.g. Sacks et al. (2013).
- 4. Specifically, cell L19 of the worksheets that each staff member fills out has the following prompt: “DALYs per death of a young child averted.”
In the 2016 cost effectiveness analysis spreadsheet provided here, you can see (on the “Summary” spreadsheet, row AU) that staff members’ answers to this prompt were as low as 3 (implying that a young child’s life is worth relatively little, perhaps because a young-enough child is thought to have not yet achieved “full moral status,” as some philosophers argue), and as high as 40 (perhaps using something like a “years of life lost” calculation which incorporates discounting and age-weighting; see the “YLL per death” worksheet).
- 5. If we think a strong case can be made that a grant will do substantially more expected good per dollar than an additional grant to GiveDirectly would, then we are excited to make that grant. If a grant seems as though it would do about as much expected good per dollar as GiveDirectly, or less than that, then we are typically not excited to make that grant, because GiveDirectly already provides a reasonably high-confidence and cost-effective way to do good, and in the long run likely has large amounts of room for more funding. (Simply giving cash to poor people is a very scalable intervention.)
- 6. On the page for our 2016 grant to CGD, we wrote:
Based on our case studies, we think it is likely that at least one or two of CGD’s initiatives have produced a few billion dollars of value for people in low-income countries… Given that CGD has spent on the order of $150 million during its 15-year history, it seems reasonable to us to estimate that CGD has produced at least 10 times as much value for the global poor as it has spent, though we consider this more of a rough lower bound than an accurate estimate.
- 7. See Why we can’t take expected value estimates literally (even when they’re unbiased), Modeling Extreme Model Uncertainty, Sequence thinking vs. cluster thinking, and the final footnote of What Do We Know about AI Timelines?.
- 8. See also the Global Priorities Project’s work on “problems of unknown difficulty.”
- 9. In particular, I’m thinking of cases of what Greaves (2016) calls “complex cluelessness.” Greaves distinguishes cases of “simple” vs. “complex” cluelessness like so:
In ‘simple problem’ cases, the unforeseeable effects under consideration were ones that, while they could result from (say) some particular act A1, they could equally easily, and in precisely analogous ways, result from any of the relevant alternative acts. It is this precise analogy between the possibility that (say) choosing A1 over A2 would lead to effect E1 rather than [effect] E2, and the ‘opposite’ possibility that choosing A1 over A2 would lead to E2 rather than E1, that renders plausible the indifference reasoning that is so intuitive in those cases [and which arguably undergirds orthodox subjective Bayesianism -LM]. In contrast, in ‘complex problem’ cases (I stipulated), one has more specific reasons for suspecting particular, systematic correlations between acts and ‘indirect’ effects, but too many such reasons: non-isomorphic reasons that point in different directions, and for which there is no canonical weighing-up operation. In those cases, no form of indifference principle is at all plausible, and the threat of cluelessness is more genuine.
Greaves illustrates the phenomenon of complex cluelessness in the context of Effective Altruism-motivated decision-making like this:
…some examples should convey the idea [of what Greaves calls “complex cluelessness”]… In [the context of Effective Altruism], the agent is considering devoting a significant portion of her resources, in terms of time and money, with the express purpose of causing as much good as possible for a fixed amount of input resource. Since the actions in question here involve at least moderate and optional sacrifice on the part of the agent, and since in addition the whole point of the actions under consideration would be to maximize good, any cluelessness about which actions have that property feels particularly galling…
Here is just one such example. Effective Altruists place a lot of weight on the recommendations of independent charity evaluators, whose aim is to rank charities, as far as possible, in terms of overall cost-effectiveness: ‘amount of good done per dollar donated’. One charity that consistently comes out top in these rankings, at the time of writing, is the Against Malaria Foundation (AMF), a charity that distributes free insecticide-treated bed nets in malarial regions. To justify this verdict, the charity evaluators clearly need (inter alia) estimates of the consequences of distributing bed nets, per extra net distributed (and hence per dollar donated). Equally clearly, however, these charity evaluators, just like everyone else, cannot possibly include estimates of all the consequences of distributing bed nets, from now until the end of time. In practice, their calculations are restricted to what are intuitively the ‘direct’ (‘foreseeable’?) consequences of bed net distribution: estimates of the number and severity of cases of malaria that are averted by bed net distribution, for which there is reasonably robust empirical data. In fact, the standard calculation focuses exclusively at the effectiveness of bed net distribution in averting deaths from malaria of children under the age of five, and (using standard techniques for evaluating death aversions) concludes that those benefits alone suffice for ranking AMF’s cost-effectiveness above that of most other charities…
Averting the death of a child, however, has knock-on effects that have not been included in this calculation. What the calculation counts is the estimated value to the child of getting to live for an additional (say) sixty years. But the intervention in question also has systematic effects on others, which latter (1) have not been counted, (2) in aggregate may well be far larger than the effect on the child himself of prolonging the child’s life, and (3) are of unknown net valence. The most obvious such effects proceed via considerations of population size. In the first instance, averting a child death directly increases the size of the population, for the following (say) sixty years, by one. Secondly, averting child deaths has longer-run effects on population size, both because the children in question will (statistically) themselves go on to have children, and because a reduction in the child mortality rate has systematic, although difficult to estimate, effects on the near-future fertility rate. Assuming for the sake of argument that the net effect of averting child deaths is to increase population size, the arguments concerning whether this is a positive, neutral or negative thing are complex. But, callous as it may sound, the hypothesis that (overpopulation is a sufficiently real and serious problem that) the knock-on effects of averting child deaths are negative and larger in magnitude than the direct (positive) effects cannot be entirely discounted. Nor (on the other hand) can we be confident that this hypothesis is true. And, in contrast to the ‘simple problem of cluelessness’, this is not for the bare reason that it is possible both that the hypothesis in question is true and that it is false; rather, it is because there are complex and reasonable arguments on both sides, and it is radically unclear how these arguments should in the end be weighed against one another.
Greaves concedes that “orthodox subjective Bayesianism” may be the right response even in such cases of complex cluelessness (p. 327), but she suggests that it may be more appropriate in such cases for an (ideal) agent to have imprecise credences rather than sharp ones:
Intuitively, the idea here is that when the evidence fails conclusively to recommend any particular credence function above certain others, agents are rationally required to remain neutral between the credence functions in question…
Greaves goes on to outline some of the choices to be made if one adopts an imprecise credences model; see Bradley (2014) for much more. Or perhaps some other model is most appropriate for cases of complex cluelessness. Or perhaps orthodox subjective Bayesianism is as good a solution as we’re going to get (but see Talbott 2008).
- 10. In our post on worldview diversification, we wrote:
We haven’t worked out much detail regarding the “how” of worldview diversification. In theory, one might be able to work out a formal approach that accounts for both the direct benefits of each potential grant and the myriad benefits of worldview diversification we’ve listed here, plus some other considerations (such as the fact that we implicitly commit to continue working in a cause for several years when we make a full-time hire for that cause). One might also incorporate considerations like “I’m not sure whether worldviews A and B are commensurate or not; there’s an X% chance they are, in which case we should allocate one way, and a Y% chance they aren’t, in which case we should allocate another way.” But while we’ve discussed these sorts of issues, we haven’t yet come up with a detailed framework along these lines.
- 11. Related terms include “moral status,” “moral standing,” “moral considerability,” “personhood,” “moral subject,” and “member of the moral community.” Sometimes these terms are used more-or-less interchangeably, and sometimes they are not.
For reviews of these related terms and concepts, see Newson (2007), Hancock (2002), Jaworska & Tannenbaum (2013), Morris (2011), ch. 3 of Beauchamp & Childress (2012), Kagan (2016), and the entry on “Moral Status” by James W. Walters in pp. 1855-1864 of Post (2004).
- 12. Most commonly-discussed candidate dimensions of moral concern are captured by theories of well-being. Crisp (2013) organizes philosophical theories of well-being into three categories: hedonistic theories according to which well-being is the presence of pleasure and the absence of pain, desire theories according to which well-being is getting what one wants, and objective list theories according to which well-being is the presence or absence of certain objective characteristics (potentially including hedonistic and desire-related characteristics). Fletcher (2015) offers a different categorization, including chapters on hedonistic theories, perfectionistic theories, desire-fulfillment theories, objective list theories, hybrid theories, subject-sensitive theories, and eudaimonistic theories. In the social sciences, human objective well-being is often measured using variables such as education, health status, personal security, income, and political freedom (see e.g. OECD 2014), while human subjective well-being is typically conceived of in terms of life satisfaction, hedonic affect, and eudaimonia (psychological “flourishing”): see e.g. OECD (2013). For another overview of several approaches to well-being, see the chapters in Part II of Adler & Fleurbaey (2016). For a “network theory of well-being” that integrates both subjective and objective factors, see Bishop (2015).
Many candidate dimensions of moral concern captured by these theories of well-being (e.g. pain and pleasure) are widely thought to vary in their moral importance depending on other parameters such as “intensity” and (subjective or objective) duration.