## Does putting kids in school now put money in their pockets later? Revisiting a natural experiment in Indonesia

Open Philanthropy’s Global Health and Wellbeing team continues to investigate potential areas for grantmaking. One of those is education in poorer countries. These countries have massively expanded schooling in the last half century, but many of their students still lack minimal numeracy and literacy.

To support the team’s assessment of the scope for doing good through education, I reviewed prominent research on the effect of schooling on how much children earn after they grow up. Here, I will describe my reanalysis of a study published by Esther Duflo in 2001. It finds that a big primary schooling expansion in Indonesia in the 1970s caused boys to go to school more — by 0.25–0.40 years on average over their childhoods — and boosted their wages as young adults, by 6.8–10.6% per extra year of schooling.

I reproduced the original findings, introduced some technical changes, ran fresh tests, and thought hard about what is generating the patterns in the data. I wound up skeptical that the paper made its case. I think building primary schools probably led more kids to finish primary school (which is not a given in poor regions of a poor country). I’m less sure that it lifted pay in adulthood.

Key points behind this conclusion:

• The study’s “margins of error” — the indications of uncertainty — are too narrow. The reasons are several and technical. I hold this view mostly because, in the 21 years since the study was published, economists including Duflo have improved collective understanding of how to estimate uncertainty in these kinds of studies.
• The reported impact on wages does not clearly persist through life, at least according to a method I constructed to look for a statistical fingerprint of the school-building campaign.
• Under the study’s methods, normal patterns in Indonesian pay scales and the allocation of school funding can generate the appearance of an impact even if there was none.
• Switching to a modern method which filters out that mirage also erases the statistical results of the study.

My full report is here. Data and code (to the extent shareable) are here.

## Background

The Indonesia study started out as the first chapter of Esther Duflo’s Ph.D. thesis in 1999. It appeared in final form in the prestigious American Economic Review in 2001, which marked Duflo as a rising star. Within economics, the paper was emblematic of an ascendant emphasis on exploiting natural experiments in order to identify cause and effect (think Freakonomics).

Here, the natural experiment was a sudden campaign to build tens of thousands of three-room schoolhouses across Indonesia. The country’s dictator, Suharto, launched the big push with a Presidential Instruction (Instruksi Presiden, or Inpres) in late 1973, soon after the first global oil shock sent revenue pouring into the nation’s treasury. I suspect that Suharto wanted not only to improve the lot of the poor, but also to consolidate the control of his government — which had come to power through a bloody coup in 1967 — over the ethnically fractious population of the far-flung and colonially constructed nation.

I live near the Library of Congress, so I biked over there to peruse a copy of that 1973 presidential instruction. It reminded me of James Scott’s Seeing Like a State, which is about how public bureaucracies impose homogenizing paradigms on the polities they strive to control. After the legal text come neat tables decreeing how many schools are to be built in each regency. (Regencies are the second-level administrative unit in Indonesia, below provinces.) After the tables come pages of architectural plans, like the one at the top of this post.

The instruction even specifies the design of the easels, chairs, and desks. Here’s a desk:

Sure enough, if you search Google images for “Inpres Sekolah Dasar” (Inpres primary school), you’ll find those schools and those desks (source):

The Inpres campaign doubled the stock of primary schools in the country in just six years. Economists call that a “schooling shock”.

## Methods and results

The Duflo study looks for reverberations of this educational earthquake in data from a household survey that the Indonesian government fielded in 1995. By 1995, the first kids who went to the new schools had grown up and started working. The study examines whether boys with more opportunity to attend a new school, by virtue of how young they were and where they lived, actually went to school more and then earned more.[2]

To perform this calculation, the study takes difference-in-differences. It looks not at whether men from regencies that got more schools earned more—a difference—but whether the pay differential between young and old men in 1995 was narrower for natives of regencies that got more schools, which is a difference in differences.[3]
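In stylized form (my notation, a simplification rather than the study's exact specification), the comparison boils down to a regression like:

$$\log w_{icj} = \alpha_c + \beta_j + \gamma\,(\mathrm{Young}_c \times \mathrm{Schools}_j) + \varepsilon_{icj}$$

where $i$ indexes men, $c$ birth cohorts, and $j$ regencies of birth; $\mathrm{Young}_c$ equals 1 for cohorts young enough to have attended an Inpres school; $\mathrm{Schools}_j$ is planned Inpres schools per 1,000 children; and $\gamma$ is the difference-in-differences estimate of interest.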

Why look at that? Regencies ranged along a spectrum in how many new schools they got per child. To understand the study’s theory of measurement, I like to split the spectrum into regencies that got fewer schools and those that got more. If the Inpres schools did increase future pay, here’s how the world would look in this framing. In reading this table, bear in mind that it is normal for older workers to earn more than younger ones, as I’ll document later. So if something bumps up the pay of younger workers, it narrows the old-young pay gap:

| Regencies getting fewer schools per child in 1970s | Regencies getting more schools per child in 1970s |
| --- | --- |
| Older natives too old to have gone to new schools | Older natives too old to have gone to new schools |
| Fewer young natives could have gone to the schools | More young natives could have gone to the schools |
| Fewer young natives get pay boost in adulthood | More young natives get pay boost in adulthood |
| Larger old-young pay gap among natives in 1995 | Smaller old-young pay gap among natives in 1995 |

The bottom line (as it were): natives of places that got more schools in the 1970s would exhibit a smaller old-young pay gap in 1995. That is the correlation that the Duflo study looks for…

…and finds. The study (Table 4, panel A) calculates that each additional planned Inpres school, per 1,000 children in a regency, increased boys’ future wage earnings by about 1.5%. The 1.5% number pertains to employees, meaning people who work for other people. (The government surveyors in 1995 didn’t ask self-employed people, including farmers, how much they earned, so they fall out of this analysis.)

In the same way, the study calculates that, during childhood, those future workers spent a fifth of a year more in school for each Inpres school built per 1,000 children. I think of that finding as an extra year in school for every fifth boy.

If an extra fifth of a year of schooling bumped wages by an average of 1.5%, then a full year would have increased them by about 5 × 1.5% = 7.5%.

## Association and causation

That 7.5% payoff rate for a year in the classroom is known as the “return to schooling”. Economists have estimated it thousands of times using data from various times and places. Yet among all those estimates, Duflo’s stands out. It comes from the developing world, which is where most people live. It comes from a big schooling expansion, which adds realism if you’re interested in national-level education policy. And the use of difference-in-differences gives the study a certain rigor, for it rules out some potential critiques. Few other studies can check all those boxes (though some can — see this from Kenya or this from India).

To expand on that last strength: If the Duflo study had only computed differences, then, for example, a simple finding that men from regencies that got more schools earned more, if presented as evidence of impact, could be easily challenged. Maybe everything just costs more — and everyone earns more — in the megalopolis of Jakarta; and maybe Jakarta, as the capital, got more schools per capita. Then we would not need to believe that Inpres schools made a difference in order to explain why men from regencies that got more schools earned more. On the other hand, if urban inflation raised everyone’s wages within Jakarta the same amount, then the old-young pay gap would be the same in Jakarta and beyond. It would not be misleadingly associated with the number of schools each district got. And that, as I said, is what the Duflo study actually checks.

Notice that “if urban inflation…” in the previous paragraph. Despite the rigor of difference-in-differences, you still need to assume something nontrivial about the world in order to fully buy the study’s findings.

Fortunately, the Duflo analysis contains a potentially more compelling basis for proving impact. It has to do with timing.[4] Think of the opportunity to go to one of the new Inpres schools as a medicine. The dose of educational opportunity depended on kids’ ages. Approximately speaking, those 12 and up in 1974 were too old to get any of this schoolhouse-shaped drug, for they had aged out before any new schools got built. Kids who were 11 in 1974 could get a one-year dose before aging out, at least if they lived near one of the new schools. Kids who were 10 could get a two-year dose. And so on. Because every year more neighborhoods and villages got new schools, well into the 1980s, the average schooling opportunity continued rising for younger and younger kids.

So the graph of Inpres schooling opportunity looks like this:

If we found a similar bend around age 12 in other data, such as on earnings in 1995, that would look like the fingerprint of Inpres carrying through, from cause to effect. And that is exactly what the Duflo study suggests happened, if with statistical noise. This graph is from the study:

Each dot in this graph is a measurement of the association framed in that table I showed above, between the old-young gap in schooling or pay and how many schools a regency was to receive per child. In that framing, we expect no association for the oldest men in the study, for all were too old to have gone to the new schools. But it should start to emerge—the dots should start to rise—as we scan to men who were 12 or younger in 1974. Duflo wrote:

These coefficients fluctuate around 0 until age 12 and start increasing after age 12. As expected, the program had no effect on the education of cohorts not exposed to it, and it had a positive effect on the education of younger cohorts.

Looking at that graph I wondered: do the trends really bend around age 12? Or should they be seen as straight? Because of the noise, neither characterization completely nails it; the question is whether one model clearly out-fits the other. If the overall trends were straight and long-term, perhaps they had little to do with Inpres. Just as in my reanalyses of Hoyt Bleakley’s studies of hookworm and malaria eradication, I set out to probe this question with a mathematical test.

## Starting the reanalysis

I started my quest with a request for the study’s data and computer code. Ironically, Duflo is now the editor of the journal that published her paper. That puts her in charge of enforcing the data and code-sharing policy that applied to her study. Sure enough, she promptly sent me files for reproducing most of the results.[5]

Once I had anchored myself in exact reproduction, I made changes to the code. Most owe to the passage of time: methods in empirical economics have improved since 2001, and Indonesian men of the generation in the Duflo study have continued tracing their way through life (and through government survey data).

While my biggest question going in was about timing, I stumbled on another first-order issue: an alternative explanation for the numerical findings.

I’ll explain a few technical concerns first, as non-technically as I can, then move to that alternative explanation and the search for bends in trends.

### Data corrections

Some numbers in the Duflo study come from government documents published in the 1970s—presidential instructions and reports on Indonesia’s 1971 census. At the Library of Congress, I scanned pages in these books and double-checked the numbers Duflo sent me. In my experience, it is normal for such a check to expose errors, and normal for them not to affect conclusions much—as happened here. For about a tenth of regencies, my figures for planned schools per 1,000 children differ from Duflo’s. (See my GitHub repo.)

### Clustering

It’s a truism that the larger your sample, the more precise your statistics. The margin of error is tighter if you poll 1,000 people than if you poll 10. But margins of error must themselves be estimated, and determining the effective sample size for this purpose is often a head-scratcher. Should we view a study of the impact of state air pollution rules on asthma rates as being about 50 states or, say, 50 million people? The answer can radically affect how precise we take the results to be. One rule of thumb: the effective sample size is the number of treatment units. There are 50 states, with 50 air pollution laws, so 50 is your number, not 50 million.

In a striking turnabout, soon after finalizing the Indonesia study, Duflo coauthored a paper raising doubts about the methods she had just used: “How Much Should We Trust Differences-in-Differences Estimates?” This new paper was not purely destructive, for it demonstrated the value of a particular mathematical correction, called clustering, which allows one to crunch data on millions of individuals while computing margins of error as if the sample is much smaller. Under the influence of that paper, in returning to the Indonesia study, I cluster standard errors by regency. This widens confidence ranges by a factor of two or three.
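To make the mechanics concrete, here is a minimal sketch of clustering by regency. The data and variable names are invented stand-ins, not the actual survey files or the study's exact specification:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the 1995 survey: men indexed by regency of birth and cohort.
rng = np.random.default_rng(0)
n = 5_000
df = pd.DataFrame({
    "regency": rng.integers(0, 280, n),   # roughly the number of regencies
    "young": rng.integers(0, 2, n),       # 1 = young enough to have attended an Inpres school
})
df["schools"] = df["regency"] % 5         # fake planned Inpres schools per 1,000 children
df["log_wage"] = 6 + 0.01 * df["young"] * df["schools"] + rng.normal(0, 1, n)

model = smf.ols("log_wage ~ young:schools + C(young) + C(regency)", data=df)
naive = model.fit()   # treats every man as an independent observation
clustered = model.fit(cov_type="cluster", cov_kwds={"groups": df["regency"]})
print(naive.bse["young:schools"], clustered.bse["young:schools"])  # clustered SE is typically larger
```

The point estimate is unchanged by clustering; only the standard error, and hence the confidence interval, moves.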

### Overrepresentation of wealthy families

Governments run many surveys (to track how much people work, how healthy they are, how much they pay for housing, etc.). Some surveys are censuses, which ideally entail knocking on everyone’s door, and even reaching the people who don’t have doors. But finding all those people, asking them lots of questions, and collating the answers all costs money. This is why most surveys, like polls, take samples.

As soon as one gives up on surveying everyone, the question arises: what is the best way to allocate surveying resources to get the most accurate statistical picture? Often, it is not to take a plain random sample, as when pollsters dial random phone numbers. It can be better to split the sample into strata — urban and rural, rich and poor. If some strata are known from censuses to be more homogeneous, then governments can get more precision for the money by sampling those strata less and others more. In a history of one of Indonesia’s national surveys, Parjung Surbakti explains it well:

The fact that an orange taken from a truckload of oranges all coming from the same orchard is sweet, gives adequate evidence to conclude that all the oranges in the truck are sweet. In this example, a very small sample size can provide an accurate conclusion about a large population when the population is homogeneous. It would be a different story if the oranges came from a number of orchards and consisted of different varieties. Then a sample of size 10 might not give as accurate a conclusion as that of the previous example. However, if the truckload of oranges can be sorted by varieties, i.e., the population is stratified, then sampling once again may be made more efficient.

It seems that in Indonesia, poorer people are thought to be more like oranges from the same orchard. For the government surveyors disproportionately visit wealthy households, where wealth is indicated by possessions such as toilets and diplomas.

The Indonesia survey data used in the Duflo study are accompanied by weights to document the oversampling of some groups. They indicate, say, that each surveyed household with a toilet stands for 100 others while each without stands for 200. However, the Duflo study mostly does not incorporate these weights.[6] As a result, wealthier people are overrepresented.

Whether such weights should in general be factored in is a confusing question, so much so that three respected economists wrote “What Are We Weighting For?” to dispel their colleagues’ befuddlement. Here, my concern is that the data are being tilted on the basis of the outcomes of interest. People with more education and higher incomes were more likely to get a knock on the door from a surveyor in 1995 and thus to appear in the Duflo analysis. Imagine a study of the impact of smoking in which people who live are oversampled at the expense of people who die. That would make smoking look safer than it is.

That is why I prefer to incorporate the weights in the Indonesia analysis. For technical reasons, this not only shifts the impact estimates, but further widens margins of error.[7]
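Continuing the toy example from the clustering sketch above, one simple way to fold in survey weights is weighted least squares (the weight column here is invented for illustration):

```python
# Fake person weights standing in for the survey design: oversampled, wealthier
# respondents stand for fewer people (weight 100) than poorer ones (weight 200).
df["survey_weight"] = np.where(df["log_wage"] > df["log_wage"].median(), 100, 200)

weighted = smf.wls(
    "log_wage ~ young:schools + C(young) + C(regency)",
    data=df, weights=df["survey_weight"],
).fit(cov_type="cluster", cov_kwds={"groups": df["regency"]})
print(weighted.bse["young:schools"])  # typically wider than the unweighted version
```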

### Instability of ratios

I quoted the estimate that the Inpres campaign raised wages by 7.5% per year of extra schooling. That is a ratio: a 1.5% wage boost divided by 0.2 years (a fifth of a year) of extra schooling. Because the numbers going into the ratio are themselves averages from samples of Indonesians, each comes with its own margin of error. The true value of the schooling increase in the full population might be 0.3 years or 0.1 years — or 0.0. And if, as far as the math goes, there’s a nontrivial chance that Inpres led to zero additional years of schooling, then there’s a nontrivial chance that the ratio of wage increase to schooling increase is infinite.

The point is not that I think the return to schooling could be infinity (or negative infinity), but that ratios emerging from this sort of analysis can range wildly. Standard methods for computing margins of error can underestimate this uncertainty.
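A toy simulation conveys the problem. The standard errors below are made up for illustration and are not the study's; the point is only that dividing by a noisy, possibly-near-zero denominator produces a wildly dispersed ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
draws = 100_000
wage_effect = rng.normal(0.015, 0.007, draws)   # wage boost per planned school per 1,000 children
school_effect = rng.normal(0.20, 0.12, draws)   # extra years of schooling per planned school
implied_return = wage_effect / school_effect     # return per extra year of schooling

# The middle of the distribution sits near 0.075 (7.5%), but the tails are extreme.
print(np.percentile(implied_return, [2.5, 50, 97.5]))
```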

Since Duflo wrote about Indonesia, economists have made a lot of progress in recognizing and working around this devil in the details, which is called “weak identification”. In my reanalysis, I marshal a modern method called the wild-bootstrapped Anderson-Rubin test, which happens to be performed by a cool program I wrote. Like the clustering and weighting corrections, the new method widens the uncertainty bands around the estimated return to schooling.

### Bottom line after incorporating the technical comments

After I fix data errors, cluster, and compensate for the oversampling of wealthy households, it is surprisingly unclear whether Inpres caused boys to spend any more time in school. And because dividing by a number that is hard to distinguish from zero produces unstable results, the impact on wages per extra year in school is even less clear. Where Duflo brackets that 7.5% schooling return rate with a 95% confidence range of 1–15%, I widen to a huge span, –44% to +164%.[8]

To be fair, that wide range can mislead. My 70% confidence range is 0–23%. I conclude that incorporating the technical comments into the core Duflo analysis leaves it weakly favoring the view that Inpres-stimulated schooling raised wages.

## The alternative explanation: wage scale dilation

As I wrangled with those technicalities and worked to answer my original question about trend bending, I discovered another reason to doubt the Duflo study’s results. And once I did, I realized that Clément de Chaisemartin and Xavier D’Haultfœuille had already pointed to the heart of the issue. It turns out that some more mundane patterns in the data, when fed into the difference-in-differences machine, can produce the same statistical results.

Here are some universal truths, or at least as close as you get to that in economics:

1. Those who go to school more earn more, on average.
2. The earnings gap between the more- and less-schooled rises with age.

As an example of the second point, in the 1995 Indonesia data, the average employed, college-educated 21-year-old man earned 744 rupiah per hour, only 18% more than the 633 rupiah earned by a contemporary who dropped out of school before fourth grade. But at age 61, in the same data, the hourly pay for the primary school dropout was basically unchanged, 642 rupiah, while pay for the college graduate was more than six times higher than at age 21, at 4,852 rupiah.

This graph shows more fully how the wage scale widened with age among employed Indonesian men in 1995:

I imagine the people behind the bottom curve as farm workers whose pay had little to do with age. And I imagine that the top curve traces how the elite ascend the ranks of big corporations and government.

The widening of the wage scale feeds into the Duflo study in a mind-bending way. Suppose (correctly) that poorer regencies — the ones that produced more day laborers and fewer doctors and lawyers — received more Inpres schooling funding per child. Then we would see:

| Regencies getting fewer schools per child in the 1970s | Regencies getting more schools per child in the 1970s |
| --- | --- |
| Natives better off on average | Natives poorer on average |
| More kids grow up to be CEOs | More kids grow up to be day laborers |
| Average pay rises a lot during career | Average pay rises little during career |
| Larger old-young pay gap among natives in 1995 | Smaller old-young pay gap among natives in 1995 |

This table starts and ends in the same places as the earlier table: getting more schools means a smaller old-young pay gap. But it goes by a different route. Nowhere does the new scenario assume or require that the Inpres school-building campaign had any effect.[9] Thus, the study’s methods could lead to the conclusion that Inpres schools raised wages even if they did not.

You might push back against my skepticism: I’m undercutting an argument that schooling increases earnings by invoking the universal truth, confirmed in the graph above, that pay and education go hand in hand — which itself seems like powerful evidence that education increases earnings!

To which I reply: The Duflo study strives not merely to prove that education raises wages, but to measure the impact more sharply. It invokes a natural experiment to remove sources of statistical bias, such as my urban inflation hypothetical. It matters whether the natural experiment is working as intended.

## Fresh findings

To probe whether wage scale dilation is generating the Duflo study’s results — and to return to the search for bent trends — I pursued three strategies:

• As foreshadowed, I tested mathematically whether the trend really bends in that Duflo graph I showed earlier. The wage scale dilation theory amplifies the importance of this check: I have no reason to think that wage scale dilation suddenly kicks in at a particular age, so a clear bend in the trend of emergence of those differences-in-differences would favor the Duflo explanation as laid out in that first table above.
• I deployed a newer statistical method called changes-in-changes, which should be immune to wage scale dilation.
• I followed up later on the same generation of Indonesian men, in data from 2005, 2010, and 2013–14 (a selection dictated by whether the surveys asked the needed questions and whether the answers are publicly available). One reason was to see whether the reported link between schooling and earnings was consistent over men’s careers, or a one-off in 1995.

To convey the results, I’ll show you some graphs. All are constructed like that Duflo graph I showed before. But I’ve incorporated the technical fixes, such as correcting data errors, and added some visual elements.

First comes my update of the “education” contour in the Duflo graph. Again, a precise statement of the meaning of the dots — blue in mine, black in the Duflo graph — is a mouthful. Each shows how much the old-young gap in total years spent in school, among men of a particular age, was associated with how intense the Inpres program was in their home regency.[10] Around the dots I added gray bands to depict 95% confidence intervals.[11] They remind us that because of noise in the data, each dot could have landed a bit lower or higher than it does. And I fit a line to the data, in red, while allowing it to kink at age 12:

The schooling trend hardly bends. From the standpoint of this search for the fingerprint, it’s not clear that building Inpres schools contributed to rising school attendance. A statistical test for whether the slopes of the two red segments differ returns a p-value of 0.60, which I printed in the upper left. That high probability means that a bend this small could easily have happened by chance, because of statistical noise, if the true line did not bend at all.[12]
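For readers who want the gist of the bend test, here is a simplified sketch. It fits a line to the plotted coefficients with an extra slope term that switches on below age 12, then asks whether that term is needed. My actual test is run more carefully, with the corrections described above, and the numbers here are invented:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
dots = pd.DataFrame({"age74": np.arange(2, 25)})       # age in 1974, the x-axis of the graph
# Fake difference-in-differences coefficients, one per age, with a small built-in bend.
dots["coef"] = 0.002 * np.maximum(0, 12 - dots["age74"]) + rng.normal(0, 0.01, len(dots))

dots["extra_slope"] = np.maximum(0, 12 - dots["age74"])  # nonzero only for cohorts young enough to be exposed
fit = smf.ols("coef ~ age74 + extra_slope", data=dots).fit()
print(fit.pvalues["extra_slope"])   # a low p-value is evidence that the trend bends at age 12
```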

Now, the Inpres program built primary schools. So did it at least get more kids to finish primary school? Quite possibly. In the next graph, the vertical axis pertains to the share of workers in 1995 who had finished primary school rather than the total years they spent in school. Now the trend more clearly accelerates around age 12. The p-value for the bend is reassuringly low, at 0.01:

It’s a strange pair of findings: boys finished primary school more but didn’t go to school more? I think the first finding is closer to the truth. The surveyors in 1995 didn’t actually ask people how many years they went to school, but rather the highest grade they attended. Probably, when the schools were first built, some kids who were officially too old to attend them went anyway, rather than going to junior high schools farther away.[13] Even if they spent exactly the same number of years in school, the study would have coded them as having spent fewer, since on paper they only got as far as sixth grade rather than seventh or eighth.[14]

If Inpres at least got more boys through primary school, did that suffice to raise their pay in adulthood? The next graph gets at that possibility by switching the vertical axis to wages. Again, the trend bends up, with a reassuringly low p-value:

If we assume that the deflection in the primary schooling trend caused the deflection in the wage trend, then we can divide the second by the first to gauge the rate of impact. Unfortunately, the first (the increase in primary school completion) still does not differ from zero with enough certainty to stabilize the ratio. In my paper (Table 7, panel B, column 2) I calculate that finishing primary school changed wages in adulthood by somewhere between –12% and +∞, as a 95% confidence range.[15]

Another source of doubt: when I checked on the same generation of men later in life, the upward bend in the wage trend didn’t persist as strongly. In 2005 (when the men aged 2–24 in 1974 had reached ages 33–55) and 2010 (ages 38–60), the line bends slightly downward.[16] In 2013–14 (ages 41–64), it bends more significantly upward. Rather than showing you all of those here, I’ll average them in a single graph:[17]

The red line bends upward, with a p-value of 0.23. I’m not one to mechanically dismiss a finding as “insignificant” when the p-value exceeds 0.05. At face value, p = 0.23 means there’s less than a 1-in-4 chance of a bend this big in the data if the true pattern is no bend at all. On the other hand, I could have put the finding to an even more rigorous test by including more of the control variables used in the Duflo study.[18]

Separately, I ran the changes-in-changes method I mentioned, the one that should be immune to wage scale dilation. This approach finds no wage boost from Inpres.

## Conclusion

To recap:

• A representative result from the Duflo study is that Inpres-stimulated schooling increased future wages of boys by 7.5% per year of school, with a 95% confidence range of 1–15%.
• Technical adjustments widen that range hugely. The main reason is that it is surprisingly unclear whether the Inpres school construction led boys to go to school more. Dividing any wage increase by a number that cannot be confidently distinguished from zero makes for instability.
• It is more plausible that the program caused more boys to finish primary school.
• There’s another way to explain why the study finds that Inpres increased adult earnings. It is rooted in two facts: over their careers, more-educated people see their pay rise more (wage scale dilation); and poorer regencies got more schools per child.
• The changes-in-changes method, which is in effect designed to rule out wage scale dilation as an explanation, finds no wage boost from Inpres.
• On the other hand, an apparent fingerprint of Inpres, the trend bend, holds up fairly well in the wage data of 1995 despite my technical tweaks. And wage scale dilation would not be expected to cause such a bend.
• The fingerprint persists weakly later in life.

The Duflo study reads its findings as important because they show that an unusually large government-administered intervention was effective in increasing both education and wages in Indonesia.

I am confident that, in retrospect, that reading is overconfident. But I wouldn’t swing to the opposite extreme of no confidence. It seems more likely than not that building all those schools (and hiring all those teachers) got more kids into school. And the big push may have left light fingerprints in the wage numbers decades later. Meanwhile, it is conceivable that the conservatism of the changes-in-changes method, which makes it less prone to generating false positives, also makes it more prone to generating false negatives.

Still, the rate of return to Inpres-stimulated schooling — wage gains per additional unit of schooling — is quite unclear.

One’s judgment about whether basic education in developing countries is a good thing should not hinge solely on the answers emerging from this study, nor even on the questions it asks. It could be that Inpres schools indeed made a large difference in Indonesia, but that the “natural experiment” was just not strong enough for the signal to shine through the noise. Or — more likely — the problem is that, as Lant Pritchett puts it, schooling ain’t learning. Maybe the Inpres schooling campaign was better at getting kids behind desks than knowledge into their heads. If billions of kids are passing through school and not learning much, there is huge room for improvement.

Moreover, higher pay is not the only reason to send kids to school. As I write, Duflo and coauthors are using randomly allocated scholarships in Ghana — an artificial rather than natural experiment — to research a wide array of potential consequences of secondary schooling, for girls as well as boys. Do the girls go on to have fewer unwanted pregnancies? Do fewer of their children die in the first year of life?

I am struck by how often the findings from studies of “natural experiments” fray under stress. An appreciation for that fact may explain why, soon after completing her dissertation, Esther Duflo became a champion of running actual experiments, such as the scholarship experiment in Ghana. Discarding some of what she learned in school would eventually pay high returns. It made Duflo the second woman to receive a Nobel prize in economics. And it drove her profession to produce more credible research.

## Footnotes

1. See Table 1, panel B, of the paper.
2. The study restricts to men because they more uniformly engage in paid employment or self-employment across their careers, which enhances comparability across age groups. Separately, Duflo studied effects on girls.
3. More precisely, wages are taken in logarithms, so the “pay gap” is a ratio.
4. Duflo (2004, p. 350) sees an additional virtue in this natural experiment: “Identification is made possible because the allocation rule for schools is known (more schools were built in places with low initial enrollment rates).” But an allocation rule is no less endogenous for being known. And Duflo (2001, Table 2) shows that the non-enrollment rate was a secondary correlate of allocation. It matters even less after data corrections (figure 1 of my write-up).
5. Duflo’s license to the 1995 survey data did not permit her to share it. But through the gratefully appreciated assistance of Daniel Feenberg, I indirectly accessed the copy licensed by the NBER. Separately, IPUMS International hosts a large subset.
6. This is not documented in the text but is made plain in the code.
7. For intuition, imagine concentrating all weight on a few observations. This effectively slashes sample size.
8. Here, I express these results as percentages rather than log points, i.e. as exp(x) – 1 where x is a primary statistical result.
9. Formally, I am suggesting a violation of the parallel trends assumption required for causal interpretation of difference-in-differences results.
10. Each dot shows, for men who were a particular age in 1974, how much their total years spent in school increased for each additional Inpres school per 1,000 children in one’s native district, relative to the benchmark group, here taken to be those aged 2 in 1974. The sample is restricted to wage earners.
11. Figure 1 of the Duflo study also shows confidence intervals, but for the schooling contour only.
12. The same test applied to the uncorrected original returns a p-value of 0.09. See figure 6 of my write-up.
13. The primary school gross enrollment ratio, which is the ratio of the number of kids attending primary school to the number of official primary school age, temporarily surpassed 100% in the 1980s. Suharti, “Trends in education in Indonesia,” Figure 2.5.
14. In fact, the Duflo paper (page 804) finds a slight fall in secondary school attainment.
15. The statistical method cannot rule out with 95% confidence that the Inpres schooling campaign had zero effect on the rate of primary school completion, and thus that the impact on wages per unit of gain in primary schooling was infinite.
16. For reasons of data availability, wages are defined differently in different years. In 1995 and 2010, they are the log hourly wage for wage workers. In 2005, they are log hourly wages as imputed from a model calibrated to 1995 data. In 2013–14, they are log typical monthly pay from all sources, including self-employment. The 2010 data have the disadvantage of being coded only by regency of workplace, not regency of birth.
17. The regressions behind this plot pool the data from all post-1995 follow-ups. The dependent variable is the one defined within each survey sample. Year dummies are added as controls.
18. One of these, Inpres water and sewer spending, could plausibly generate a trend break.

## 1. Preamble

This document is a shallow investigation, as described here. As we noted in the civil conflict investigation we shared earlier this year, we have not shared many shallow investigations in the last few years but are moving towards sharing more of our early stage work.

This is a shallow on telecommunications infrastructure in low- and middle-income countries (LMICs). I spent about three weeks on it, during which I read major papers in the field and spoke to about ten experts. This document has been read and discussed by Open Philanthropy’s Global Health and Wellbeing cause prioritization team.

We’re continuing to look at the robustness of the social scientific evidence on telecommunications. Many of the papers are quite new and remain unpublished, which limits our ability to review the data behind them in detail. However, we think that the initial evidence is quite positive, and we think that someone looking to launch an impactful organization might want to examine this space more closely. We think there may be both nonprofit and for-profit ideas worth pursuing (the latter could still have large social returns, and provide benefits from an earn-to-give perspective). We see Wave as an interesting model for the latter.

We welcome comments either posted to the Effective Altruism forum, emailed to me at [email protected], or shared via this Google Form.

#### 1.1 Major sources of uncertainty

It seems worth flagging two major uncertainties that this shallow does not resolve:

1. I am quite unsure what the expansion path for telecommunications technologies looks like. Technologies often spread much faster than expected; it is possible that philanthropic investment is less needed/marginally useful here than I think it is.
2. The literature on telecommunications access and income is still quite limited; there are really only two randomized controlled trials (RCTs), both of which have relatively small sample sizes. In addition, there’s only one RCT on cellphone coverage and the other studies in this literature use difference-in-differences with two-way fixed effects, which may yield misleading results. We’d like to have more evidence and are currently considering funding more research in this area.

## 2. Summary

#### 2.1 Why telecoms?

For the purposes of this document, “telecoms” includes internet access, fixed-line telecommunications infrastructure, and cellular phone access.

We did some initial work on infrastructure in general. Investment in telecoms looked unusually promising for catalyzing income growth. Two RCTs find that access to a form of telecommunications increased incomes by 20% in the first year. Therefore, we decided to look into the topic further.

This shallow seeks to answer the following questions:

• What is the current status of information and communication technology (ICT) infrastructure? Where is ICT infrastructure worst?
• What are the best tactics to improve ICT infrastructure?
• Should Open Phil fund any of those tactics?

#### 2.2 Who is already working on it?

Mostly, the market is solving access to at least some telephony and internet – in 20 years, mobile phone usage has increased approximately 85 percentage points. In sub-Saharan Africa, where mobile phone usage has lagged much of the rest of the world, access to 3G networks increased 55 percentage points in the six years through 2020.[1: In 2014, Steve Song gave a talk noting that 20% of Africans had access to 3G. By 2020, GSMA said that 75% of Africans have access to 3G.] This increase has happened without particular effort from philanthropists, and with significant barriers to usage. However, neither access to nor usage of telecoms is universal, growth is beginning to slow, and there are still large gaps to fill. This is not uncommon for new technologies; initial adoption is generally much faster than rollout to the last 10-15% of users.

ICT is relatively light on philanthropic investment, likely because it is an active area of commercial investment. I’m somewhat uncertain on total funding, as several entities either don’t report their spending or don’t clearly separate it from other spending, but I am fairly certain that it is not a central focus area for any major foundations.

There are four major areas where there is at least some ICT funding; I have not managed to come up with a good total investment figure for any of them.

1. The Gates Foundation seems to be the only large foundation that is very active in this area. Much of their funding is targeted specifically towards women and/or financial services; they are less interested in ICT as a category than ICT as a catalyst for other things. I’ve had difficulty figuring out what their total ICT spend is, because ICT-centric projects are not necessarily categorized as being such. They funded the original Phillip Roessler RCTs on cellphone provision (targeting women in particular).
2. There are a handful of ICT-specific foundations – e.g. the Internet Society Foundation (~$50M a year), the Web Foundation (~$4M a year), the Association for Progressive Communications (~$4M a year), ISIF Asia (~$2M a year). This might be $100M per year if you add up all the small foundations (though that’s potentially high/optimistic).
3. Commercial entities also do some amount of philanthropic work – e.g. Google.org’s technology and innovation arm, the Ericsson for Good initiative, Meta’s Connectivity, and Data for Good.

While their overall footprint is relatively large — Google.org gives about $100M per year, and Meta is obviously a large company — their ICT-specific spending appears to be a relatively small percentage of that. If I were to guess, I’d say the companies spend perhaps $50M together on more or less philanthropic ICT work — but this is a highly uncertain estimate.

4. Commercial non-philanthropic spending is potentially very large, but the overlap with our interests (work that makes cell phones/internet access available to new users) is very unclear. However, there’s clearly a lot of commercial capital overall. Orange Telecom is currently building (pg. 41) 2000 towers in the DRC to (ostensibly) cover 10M additional people in rural areas. Likewise, Google just made first landing of the Equiano cable, their third private international cable, which connects Portugal to West and South Africa (with 20x the capacity of previous cables to the region), which probably cost around $400M. Safaricom is spending $500M expanding to Ethiopia, and Africell is spending $200M on an Angola expansion. Capital spending per year on telecoms in sub-Saharan Africa is ~$10B.

I would guess that total philanthropic spending on telecoms access is probably less than $150M a year, but I am (again) highly uncertain. Adding in relevant commercial spending could vastly increase that number.

#### 2.3 What could a new philanthropist do?

Since most interests in the space are commercial entities, they are by definition interested in commercially viable products, and somewhat less interested in the social returns to ICT infrastructure. This means that rural and/or very poor communities are less likely to receive ICT infrastructure.

We could target grants towards these communities – where there might be a large percentage increase in income, but from an extremely low base. These communities might not be commercially viable for telecoms, but investing in them could be above our cost-effectiveness bar. There is pretty good evidence that there are large economic gains to be had in going from no connectivity to some connectivity. (This is discussed further in the importance section.)

The completely unserved populations are now quite small – perhaps 3%-9% of the world’s population.[2: It’s possible that this number is much larger — see my notes on the global coverage gap.] These ~250-750 million people are (somewhat by definition) difficult to reach, but we could invest in some of the companies working on reaching them. One example is Africa Mobile Networks, which is building “network-as-a-service” for mobile operators (pg. 40).

It may be more promising to focus not only on strictly increasing coverage, but also on decreasing the cost to use mobile devices in areas that already have (at least some) coverage. Most of the ways to drive down cost (competition, deregulation, additional infrastructure) will also likely do something to expand coverage or improve the quality of coverage where it exists now.

These tactics are discussed in the tractability section below.

## 3. Importance

The limited existing literature suggests that improving communications technology is a surprisingly good way to improve incomes. This appears to be true both when expanding coverage and expanding access, and for both adding cell coverage and internet coverage.[3: Indeed, I am sometimes a little sloppy in distinguishing between the two.] The results are remarkably consistent and large.

When expanding coverage, the economics literature has found:

• In the Philippines, an RCT showed that gaining access to a community network increased incomes 17%, with those who could actually place calls from home increasing their incomes 28% in fourteen months. This would be a 7000x return in our units given their per-tower costs.
• In Senegal and Tanzania, a quasi-experimental study found that access to 3G increased incomes 14% and 7% in a year. This would be equivalent to a 3500x-7000x return if building a tower costs roughly the same as it does in the US.
• A similar quasi-experimental study in Nigeria found that access to mobile broadband increased incomes 5.8% in a year, 7.8% in two years, and 9.2% in three years. Based on these numbers, a tower would meet our cost-effectiveness bar if it covered ~2200 people.
• In sub-Saharan Africa, a quasi-experimental study exploiting the rollout of subsea cables showed that gaining access to improved internet raised employment by 7% and GDP/capita by 2%.

When expanding access:

• An RCT in Tanzania showed providing people with phones boosted incomes 7%, and providing people with smartphones boosted incomes 20% in a year. Note that this was true even though 88% of households already had access to a feature phone before the RCT.

Another way of looking at impacts is to look at studies of products enabled by mobile expansions, such as mobile money. Suri and Jack find that MPESA lifted 2% of Kenyan households out of extreme poverty – I think a naive back-of-the-envelope calculation (BOTEC) based on that analysis would imply that MPESA increased average consumption by 3.4%.[4: They say that 2% of Kenyan HHs exit extreme poverty based on MPESA overall. The coefficient on “leave extreme poverty” w.r.t. “agent density” is -0.007, so 0.02/0.007 ~ 2.9x multiplier on that coefficient. The coefficient on “log(per capita consumption)” is 0.012, so 0.012*2.9 ~ 0.034.] Having this large of an effect from mobile money alone is another credible signal of large total impacts.
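Restating the footnote's arithmetic as a rough calculation (the coefficients are as cited there; this is an illustration of the BOTEC, not a reanalysis of the underlying paper):

```python
hh_exiting_extreme_poverty = 0.02     # Suri & Jack: share of Kenyan households lifted out of extreme poverty
coef_exit_per_agent_density = 0.007   # their coefficient (in absolute value) for exiting poverty per unit of agent density
coef_log_consumption = 0.012          # their coefficient for log per-capita consumption per unit of agent density

scale = hh_exiting_extreme_poverty / coef_exit_per_agent_density  # ~2.9x multiplier on the coefficients
avg_consumption_gain = coef_log_consumption * scale               # ~0.034, i.e. ~3.4%
print(round(scale, 1), round(avg_consumption_gain, 3))
```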

I think these results are positive and consistent enough to suggest that telecoms access gives at least a 5% income boost, and more likely a >10% income boost. That is an extremely “big if true” result.

For instance, a very rough estimate suggests if this is true, perhaps 9% of Africa’s growth in the last two decades has been from increased cell coverage, and an additional 2%-8% came from growth in internet usage. Thus, expanded access to telecommunications alone could explain ~⅙ of the growth Africa has experienced from 2000 to 2020.

A similar macro BOTEC suggests that increasing usage of mobile broadband in sub-Saharan Africa to South Asian levels (28% usage → 34% usage – which is only a modest change compared to high-income country levels of usage) would be worth $171–$342 billion to Open Philanthropy,[5: Depending on whether you expect a 5% income increase or a 10% income increase.] using our framework for valuing income increases.

## 4. Neglectedness

Mobile communications is not the most neglected of topics when viewed in terms of total spending, as global telecoms spending is somewhere around 1.6 trillion dollars a year. But this is obviously not spent evenly around the globe; US telecom providers spend about $226/person/year on capex, while African ones spend about $13/person/year.

I found it more helpful to separately consider the number of people who have no coverage (the coverage gap) and the number of people who do not use mobile devices (the usage gap).

#### 4.1 Coverage Gap

The coverage gap is closing fairly rapidly over time. GSMA claims that 93% of the world’s population currently has access to 3G or better mobile data; the International Telecommunications Union claims it is 94%, with an additional 3% having access to 2G data. This would leave about 600 million people without access to coverage.[6: Note that for the above I am considering mobile internet only. In the near future, satellites in low orbit will make it possible to access broadband in almost all parts of the world. However, satellite internet is quite pricey. Starlink terminals are currently $500 loss-leaders …] This is down from 750 million just two years before; if coverage continued to increase at that rate, it would take just eight years to reach global coverage.

However, this data is likely to be optimistic; it is based on self-reported data by telecoms, who are likely to have incentives to overstate how good their networks are. As I noted above, I am uncertain about this data, and it is possible that the number of people with no cell phone coverage is substantially greater than 600 million. I do think it’s likely that >90% of the world has access to cell service, but I’m not willing to swear to any exact percentage. I’d estimate the true coverage gap is somewhere between 3% (the number claimed by GSMA) and 9% (3x what GSMA claims). This would mean ~240M-710M people lack coverage globally.

#### 4.2 Usage Gap

If defined like the coverage gap, the usage gap would be the percentage of people who have access to any kind of cell phone coverage, but don’t use it. But for reasons I frankly don’t understand, this is almost never what people actually mean by the usage gap. Indeed, finding that number is fairly difficult; since SIM swapping and sharing phones is common in low and middle income countries, good estimates of the percentage of the population with access to a cell phone are thin on the ground. My guess is that most but not all residents of L&MICs have access to a feature phone, as surveys in several sub-Saharan Africa countries suggest that about 80% of people have access to a phone and the number of subscriptions is over 100 per 100 people. This is likely less true the poorer the country you consider; the number of subscriptions per 100 people in LICs is nearly 50% lower than it is in MICs.

Instead, the “usage gap” generally means the percentage of people who have theoretical access to mobile internet but do not use it. This is now about 6x the percentage that don’t have any access, or about 43% of the global population. Usage is lower among the groups you would expect – rural residents (37% less likely to use mobile broadband),[7: Rural coverage can be 40x the cost of urban coverage.] lower-income people, women (20% less likely to use mobile broadband), speakers of minority languages, etc. Like the coverage gap, the usage gap is shrinking over time – but the experts we spoke to were less than convinced that usage will ever be universal.

The most common reason for people not to use telecom services is the cost of both the device and the coverage. Per the GSMA, an internet-enabled device costs an average of 44% of monthly income in a low or middle income country. Philanthropic investment is unlikely to change device costs, but we might be able to reduce tariffs (often 7-10% of the purchase price) through policy work. Even if you have a phone, though, the cost of using data can be prohibitive in developing countries.
In sub-Saharan Africa, for instance, 2 GB of data can cost 10-20% of monthly income, and it is fairly obvious why internet penetration in Nigeria is 2x that of Togo (see image below). Prices are dropping over time — GSMA reports that the price of a marginal GB of data has dropped 40% in four years – but I think we could speed this process along. It’s not clear that telecoms companies have the incentives to lower their prices as much as possible – many are monopolies or duopolies – but we could advocate for changes. As physical infrastructure improves, regulatory costs decrease, or competition increases, the marginal GB becomes cheaper, and the marginal consumer becomes more likely to be able to afford to use mobile internet.

## 5. Tractability

#### Possible “Grant Ideas”

All of these are somewhat more nebulous than I would like; I am not sure what organization you would literally write a check to in order to make these things happen. In many cases, the ideal organization would be a company doing this work to which we could potentially provide low-cost capital, which has different costs than a typical grant. I split these into two sections — possible grants that would address the coverage gap, and possible grants that would address the usage gap.

#### 5.1 Coverage Gap

#### 5.1.1 White-Label Tower Companies

In the US, many cell phone towers are owned by companies that are not the telecoms that run services through them. This is not true elsewhere, where towers are largely built, owned and maintained by the telecoms. This is less than efficient – if a new carrier wants to expand into an area, they have to build a new set of towers, since they cannot use their competitors’ existing towers. The Tribune Express cites (without a source) that “[in Pakistan] a dismal reality is that 40% of the 34,000 towers are just 300 metres apart,[8: Cell phone towers should be 1-2 km apart.] which results in wastage of $1 billion worth of capital due to the existence of parallel towers at close distances.” In many places, this can make coverage expansion unviable. The population density might be enough to support one company and one infrastructure buildout, but certainly not three or four.

This might also allow telecom companies to lower prices. If each company must complete a full build-out, they must also pay for said full build-out. If companies are sharing towers, the cost per company is lower, and the amount that the company must recoup from their customers is also lower. Thus, this intervention might be able to address both the coverage gap (by making expansion into new areas more economical) and the usage gap (by lowering the cost of service in all areas).

There is at least one group working on this in Pakistan. It appears to be an incumbent telecom company, so it’s not clear if they need funding. African Mobile Networks and BRCK are working on related issues in Africa (though BRCK is building a wifi rather than mobile solution).

#### Right-of-Way Fees

It is cost-prohibitive in some countries to put down more land-based fiber. In some municipalities in Nigeria, there are policies that require companies to pay extremely high fees to put down cable. These towns are often just skipped during buildout.

If we are interested in subsidizing capital for land-based fiber, I think we would have to fund advocacy for reducing these fees as well; it’s not that helpful to lower interest rates if it is still extremely costly to build cable.[11: This effort could be compared to our work on land use reform, which often involves funding advocates for reduced housing regulation.]

#### Competition

Almost all telecoms markets have only a few entrants, and some remain monopolies. As long as there is no competition in the market, there are few incentives for telecom companies to improve or increase their coverage.

One of the most obvious examples of this is EthioTelecom. EthioTelecom is the only telecommunications provider in Ethiopia, and while it claims to cover 88% of the country with at least 2G service (and 66% with 3G), the service is unusably bad. South Africa’s average internet speed is about 10,000x faster than Ethiopia’s. Unsurprisingly, Ethiopia has low internet penetration and essentially no digital economy.

By comparison, when Cambodia introduced competition into the mobile market, the cost per gigabyte of data dropped from $4.56 in 2013 to $0.13 in 2019 (pg. 166 of the World Development Report) and mobile usage spiked to 6.9 gigabytes per capita per month.

However, introducing competition without improving infrastructure has its own challenges. When pricing pressure is substantial, the infrastructure can’t keep up. This has been most notable in India, but has also happened elsewhere in southeast Asia. Carriers are not making enough money to upgrade their infrastructure, and so service quality suffers.

#### Spectrum Costs

Spectrum licensing fees can be 20-30% of buildout costs. When you buy spectrum in most countries, you have exclusive use of that part of the spectrum.

This allows incumbent carriers to essentially block new entrants (by licensing the part of the spectrum they would use) — but does not require the incumbent to use it. A licensing regime that does not allow you to license spectrum unless you plan to use it could be helpful, and allow more entrants into the market.

This might be more effective than pushing for competition more generally.

#### Tariffs

Handset costs are a significant barrier for many people in low and middle income countries (LMICs). The phone supply chain is relatively efficient, so we are unlikely to be able to reduce the cost of the actual device, but we could reduce the effective cost to consumers by reducing tariffs.

Tariffs of 7-10% are common on imported electronics in LMICs. This is pretty substantial, and makes electronics more expensive to purchase in poor countries than in rich ones. We could advocate for tariffs on these devices to be reduced, so that more people could afford handsets.
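A minimal sketch of that arithmetic, with a hypothetical handset price and buyer income (the 7-10% tariff range is from the text above; everything else is an assumption for illustration):

```python
# Hypothetical effect of an import tariff on an entry-level handset.
handset_price = 40.0   # assumed pre-tariff device cost, USD
monthly_income = 60.0  # assumed monthly income of a low-income buyer, USD

for tariff in (0.07, 0.10):  # the 7-10% tariff range cited above
    landed_price = handset_price * (1 + tariff)
    print(f"{tariff:.0%} tariff: price ${landed_price:.2f}, "
          f"{landed_price / monthly_income:.0%} of one month's income")
# Removing a 10% tariff saves $4 on this device: small in absolute terms, but
# meaningful for buyers already at the margin of affordability.
```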

## 1. Editorial note

This document is a “shallow” investigation, as described here. Over the last few years, we’ve moved towards trying to do more shallows quickly for internal audiences.  We wanted to experiment with sharing more again to see how much work it takes and whether it generates informative feedback or leads, so this is the first shallow we’ve published in a number of years.

This is a shallow on reducing civil conflict.  It took me about two weeks to write.  During this time, I read major papers in the field and spoke to about five experts, but did not fully critique their assumptions.  I have tried to flag my major sources of uncertainty in the document.

This document has been read and discussed by the cause prioritization team.  At this point, we do not plan to proceed to a medium investigation, but that could change if we substantially update our estimates of either the financial costs of civil war or the tractability of the problem.

We welcome comments either posted to the EA Forum, emailed to me at [email protected], or shared to this Google Form (can be submitted anonymously).

## 2. Major sources of uncertainty

My major sources of uncertainty after writing this are the following:

• What is the best way to model the economic impact of civil war?  I tended to make fairly conservative (in the sense of total size of the problem) assumptions – e.g. economies that experience civil war recover fully within ten years – but I am not sure how reasonable those assumptions are. We’re currently hiring some academics to investigate these questions further.
• Am I fully capturing the health impacts of civil war on civilians?  There is relatively little data about the social or economic consequences of being displaced within the Global South, and I am not sure my estimates of off-battlefield DALYs are particularly good.
• Fundamentally, I am quite uncertain how much of a difference micro-scale interventions like cognitive behavioral therapy will make on macro-scale events like war.

## 3. Summary

#### 3.1 What is the problem?

Living through a civil war is very bad for your health.  The direct cost of civil conflict is about 10M DALYs per year, but the indirect cost probably quadruples that figure.[1]The GBD estimate includes only direct DALYs, so war injuries/war deaths but not deaths caused from disrupted access to healthcare, etc.  (This is a fairly conservative estimate; one paper argues that there are 25x as many indirect deaths as there are direct deaths.)  Infant mortality and malnutrition rates are twice as high in states either experiencing or recovering from civil conflict as in stable ones.  Life expectancy is about a decade lower than would be expected in a peaceful state.

Per my BOTEC, civil wars cause the loss of approx. ⅔ as many DALYs per year as malaria and NTDs (combined). A civil war costs about $14,000 OP value per country resident per year, or 0.14 life years. (Malaria, by contrast, has a DALY burden of 0.038 life years / resident of sub-Saharan Africa – so civil wars are not as significant a burden on life as malaria, but not insubstantial either, at least in countries where they are ongoing.)

But war’s impacts are not limited to health. War impoverishes the populations that experience it, such that poverty is increasingly concentrated in and around states experiencing civil conflict. By 2030, about ½ of the global extreme poor will live in such states. This is not simply because poverty is declining elsewhere; an increasing percentage and an increasing number of the global poor live in countries either at war or at high risk of descending into war.

Most of these wars are not wars of global importance; no major power has any particular investment in settling them. (Indeed, for the purposes of this shallow, I am choosing to focus on civil wars where great powers did not intervene – thus excluding Iraq, Afghanistan, Ukraine, Yemen, and Israel/Palestine, as reducing conflict there is likely to involve different processes than reducing conflict in more local wars.)

War is bad for growth; unlike other emerging markets, states at war generally have zero or negative growth. The residents of fragile states are particularly unable to bear a recession or depression, as civil wars essentially exclusively happen in low income countries.

#### 3.2 Who is already working on it?

30% of overseas development aid goes to fragile states, totaling about $76 billion in 2018. However, most of that is targeted toward poverty reduction and alleviation rather than reducing conflict. The UN estimates that about $6.8B was spent on peacebuilding projects in 2013, with another $8B spent on peacekeeping operations.

The US government is and has been the largest single contributor to peacebuilding budgets.  Particularly during the Iraq and Afghanistan Wars, the US spent copiously on stabilizing these two countries, to dubious effect.  In 2013, spending on peacebuilding in these two countries alone was over $100 billion.  For the purposes of this shallow, I ignore those conflicts and attempts at peacebuilding, and focus only on conflicts where a great power is not involved.

In 2020, the US spent $4.1B a year on peacebuilding in countries where it was not actively involved. Nonetheless, even $4.1B makes the US the largest single contributor to peacebuilding efforts. The majority of this is in the form of UN peacekeeping missions ($1.45B) and general international organization funding ($1.5B), with the Department of State funding an additional $650M a year in peacebuilding and democracy promotion programs.

Private spending is likely considerably smaller than government spending. According to one source, 2020 grants were only $150M, $50M of which comes from USAID. This is likely somewhat of an underestimate, as the 990 for a single NGO – Search for Common Ground – reveals yearly funding of about $25M, but I would not expect private spending to be more than $300M total.

#### 3.3 What could a new philanthropist do?

Open Philanthropy might be interested in funding a) more research on particular interventions (see potential grants), or b) lobbying for more effective interventions rather than less effective ones (e.g. CBT for ex-combatants instead of community-driven development programs). Open Philanthropy could also fund lobbying for more engagement with international peace processes (for instance, lobbying for funding UN peacekeeping to the level requested by the UN, rather than the smaller amount currently provided by the US government).

## 4. Importance

Global poverty reduction will likely stall out without improvements in states with civil conflict (or that are recently post-conflict). While extreme poverty in east Asia and south Asia will approach zero by 2030, poverty in fragile states (mainly but not exclusively in Africa) will increase over the same period. The economies in fragile states barely grew during the period 1970-2008, and the living standards of the people living in those states are unlikely to improve unless governance improves. Stabilizing existing fragile states and preventing other states from becoming fragile is thus of potentially very high importance – indeed, eradicating poverty likely requires it.

Estimates of how badly a civil war affects a host economy vary widely: Imai and Weinstein 2007 estimate that a civil war costs an economy 1.25% of growth a year, Collier 1999 estimates a 2.2% decrease due to war, the IGC roughly estimates 4-5%, and UNDP puts the reduction in GDP at 12.3%/year for civil wars after 1990. (Though an extreme example, GDP/capita in Syria is 11% of what it was before the war.)

I’ve chosen to use a 5% annual hit to GDP/capita in the BOTEC. (I assume LICs would otherwise grow at 3%, so in the BOTEC, countries at war shrink at 2%.) With this input, the financial cost substantially outweighs the cost in health; I find that a 5% hit to growth for four years and full recovery by year 10 will cost individuals about $1000 in income, for a total financial cost of about $21B in an average LIC.[2]Paul Collier finds the financial cost of the average civil war is $64B 2008 USD in The Bottom Billion, though he doesn’t specify in any detail how he reaches this figure. A rough estimate of the financial cost of all civil wars that occurred in 2017 was $290B.[3]This is greater than the cost of the number of civil wars that started in 2017 * the cost of a civil war, because more civil wars start than resolve, so the number of civil wars is increasing over time. Note that the full-recovery assumption is relatively optimistic – it implies that these countries grow at >7% for the six years following the end of the civil war.
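Here is a minimal sketch of that income BOTEC. The growth rates and recovery timeline come from the text above; the baseline GDP/capita and population are my own illustrative assumptions, so the output should only be read as landing in the same ballpark as the ~$1000 per person and ~$21B figures, not as reproducing the report's exact inputs:

```python
# Rough BOTEC: per-capita income lost to a civil war, vs. a no-war counterfactual.
BASELINE_INCOME = 1_000      # assumed pre-war GDP/capita of an "average LIC", USD (my assumption)
POPULATION = 20_000_000      # assumed population of an average LIC (my assumption)
PEACE_GROWTH = 0.03          # counterfactual growth for a LIC at peace (from the text)
WAR_GROWTH = -0.02           # growth during war: 3% minus the 5% hit (from the text)
WAR_YEARS = 4
RECOVERY_YEAR = 10           # full recovery to the counterfactual path by year 10 (from the text)

# Post-war growth rate needed to rejoin the counterfactual path by year 10.
recovery_growth = ((1 + PEACE_GROWTH) ** RECOVERY_YEAR /
                   (1 + WAR_GROWTH) ** WAR_YEARS) ** (1 / (RECOVERY_YEAR - WAR_YEARS)) - 1

loss = 0.0
counterfactual = actual = BASELINE_INCOME
for year in range(1, RECOVERY_YEAR + 1):
    counterfactual *= 1 + PEACE_GROWTH
    actual *= 1 + (WAR_GROWTH if year <= WAR_YEARS else recovery_growth)
    loss += counterfactual - actual

print(f"post-war growth needed to catch up: {recovery_growth:.1%}")   # ~6.5%/year under these inputs
print(f"income lost per person over 10 years: ${loss:,.0f}")          # ~$1,070 here, near the ~$1000 above
print(f"total loss for a {POPULATION/1e6:.0f}M-person country: ${loss * POPULATION / 1e9:.0f}B")  # ~$21B
```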

Civil wars also cost lives.  IHME estimates that 10.1 million DALYs are lost per year directly due to “conflict and terrorism”.  This is about 0.25% of all DALYs, but that estimate does not include any deaths or disability incurred through displacement[4]On average,  ~100 people are displaced per battle death. or disrupted access to healthcare.  I am highly uncertain how many DALYs this adds (see mortality cost), but it seems that the indirect deaths due to conflict substantially outnumber direct deaths on the battlefield.  (I take total DALYs to be 4x the number of direct DALYs, but I’d be willing to believe anything from 2-20x.)
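As a small illustration of how much that multiplier matters, a sketch of the arithmetic (the 10.1M direct DALYs and the 2-20x range are from the text; the rest is just multiplication):

```python
# Total conflict DALYs under different indirect-to-direct multipliers.
DIRECT_DALYS = 10.1e6  # IHME estimate for "conflict and terrorism", per the text

for multiplier in (2, 4, 20):  # the 2-20x range above; 4x is the central guess used in the BOTEC
    total = DIRECT_DALYS * multiplier
    print(f"{multiplier:>2}x multiplier: {total / 1e6:5.1f}M DALYs/year "
          f"({1 - 1 / multiplier:.0%} of them indirect)")
```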

Between deaths and lost income, preventing an average civil war would have OP value of around $1.2 trillion,[5]The true value obviously varies substantially by the size of the country and the severity of the civil war; even small countries can have particularly deadly civil wars (Liberia), while India is a large country that has had several small civil conflicts. of which 20% is from DALYs and 80% is from lost income.

Unfortunately, it seems relatively few civil wars are currently averted.  After a relative lull in interstate and intrastate violence in the 2000s, the number of civil wars and the number of people killed in civil wars have both increased in the last decade.  There are also now more refugees than at any previous point in history.  While data on poverty among refugee populations is scant, it is likely that they are experiencing considerably more poverty than either the native population of their host country or than they would have experienced if not forcibly displaced.

#### 4.1 Sources of Uncertainty

Most of the above figures are fairly conservative; I could be substantially underestimating both the mortality cost and the financial cost of civil war.

Mortality Cost

In terms of mortality cost, I’ve estimated that about 25% of deaths from civil war occur on the battlefield.  Papers range from suggesting that there are as many deaths off the battlefield as on the battlefield, to suggesting that only 4% of war deaths occur on the battlefield.  Sources attempting to determine the percentage of battlefield deaths by war vary by nearly as much.

I am not sure I have strong thoughts on which end of this range is more likely to be right; I would be willing to believe that the total number of DALYs caused by war is >> 4x the number of reported battlefield DALYs.

Further complicating estimates of the total number of DALYs from civil war, one expert mentioned that battlefield death counts are often a severe underestimate, by >3x.  I am willing to believe I’m correct on DALYs to within an order of magnitude, but that’s about it.

Financial Cost

There is substantial heterogeneity in how badly civil wars affect the economy of the country experiencing one; in some cases, rebel groups stay in the hinterland and life in the major cities is able to keep ticking along much as usual, while in others, the economy completely collapses.  Given this, it’s difficult to figure out what the counterfactual is for the economy of countries experiencing civil war.

The best single paper is probably Costalli, Moretti and Pischedda 2017, which does attempt to estimate a counterfactual with a synthetic control rather than just doing a cross-country regression.  It finds a total effect of -17.5%, which is relatively similar to the BOTEC simulation I use (-5%/year for four years).
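A quick consistency check on that comparison, using only the growth rates already in the BOTEC above:

```python
# Cumulative GDP/capita gap vs. the no-war counterfactual after four years of war,
# using the BOTEC growth rates from section 4 (3% counterfactual growth, -2% during war).
peace, war, years = 0.03, -0.02, 4
gap = 1 - ((1 + war) / (1 + peace)) ** years
print(f"{gap:.1%}")  # prints 18.0%, close to the paper's -17.5% synthetic-control estimate
```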

It is also unclear how much a civil war in one country affects its neighbors.  I do not model spillover effects at all; Paul Collier claims that modeling spillovers doubles the cost of civil wars.  I could fairly easily be convinced of a substantial increase here too.

## 5. Neglectedness

Per DALY, civil wars do not seem particularly neglected.

International spending on averting and settling civil wars seems to be in the range of $10-15B (not including what governments spend trying to pacify restive areas).[6]Estimating government military and policing spending to reduce the chance of civil war seems difficult, so I’m not doing it.  OECD countries spend about $5B on peacebuilding, the UN spends about $6.5B on peacekeeping, and presumably non-OECD countries spend at least some amount (though likely not a lot).

This means that the world spends about $330-$495 per DALY from civil wars, about 1.5-2.5x what is spent per DALY on HIV/AIDS.

However, the financial cost of civil war substantially outweighs the DALY cost. About $15B is spent preventing civil wars, but civil wars cost $290B in 2017. If one adjusts for the greater consumption value for LIC residents – civil wars almost exclusively happen in very poor countries – OP’s valuation of the loss of income is $14T. This gives a current OP-valued cost/current non-OP spending ratio of 966, versus 497 for HIV/AIDS — which means we could view conflict prevention as relatively underfunded.

## 6. Tractability

I am reasonably convinced this area is both important and neglected relative to its importance. However, its tractability seems questionable. Below, I review some possible interventions to prevent civil conflict.

#### 6.1 Tractability and Speed

Several experts have mentioned that speed matters a lot in reducing the severity of conflict. Once things have started to spiral, there is a very limited time in which one can bring down the temperature of the situation. It is helpful if a particular intervention already has people on the ground, or can get them there very quickly. Peacekeepers are often useful for this, as they are on site and neutral.[8]Once they are deployed; the process of getting peacekeepers deployed is notoriously long.

For example, an expert told us a story about a town in a country that had recently ended a civil war, where a girl had been murdered and her body dumped outside a mosque. Tensions escalated quickly, the local imam was beaten up, and there were threats of further attacks on the Muslim community the next day. The local peacekeeping mission had Muslim and Christian leaders make a broadcast on local radio together, and this kept the situation from escalating further. Even a limited delay likely would have made such an intervention impossible; de-escalation would have been much more difficult after further violence.

This suggests that we should focus particularly on organizations that can react quickly.

#### 6.2 Possible Interventions

#### 6.2.1 Predicting Civil War

The first step to preventing civil war is to determine where a civil war is likely to be in the future. Nearly every party interested in knowing about future conflicts has a project designed to provide information about where future conflicts will occur – the US, the EU, ECOWAS, the OSCE, etc. The Integrated Crisis Early Warning System (ICEWS) claims to have a forecast accuracy of 80% (though I am skeptical that this means an 80% accuracy for predicting exactly when conflict escalation is likely to happen).

At the macro level, it is relatively easy to tell which countries are at risk of future conflict. Countries that have had prior civil conflict are quite likely to have civil conflict in the future, and poor countries are also more likely to have insurgencies and civil conflict. The riskiest countries therefore are those that are both poor and have current conflicts: the DRC, Yemen, Nigeria, Sudan/South Sudan, Somalia.

However, it is difficult to predict exactly when a recurrence will occur, no matter how fine-grained the data. Bazzi et al find that predicting conflict over space is fairly straightforward, but adding time-varying information provides little additional information. Indeed, the within-location-but-time-varying model has an R² of only 0.01. I think this is likely to be a problem not easily solved with additional research or funding, but I don’t know how well superforecasters do on this problem.
My inclination is that if the Pentagon has an incentive to solve a problem and has failed to do so, the sticking point is probably not resource constraints, but I may be wrong here. If it is possible to predict which situations will escalate, that could be enormously valuable.

#### 6.2.2 UN Peacekeeping

Despite the perception of peacekeepers as largely useless, the empirical evidence suggests peacekeepers really do keep the peace; indeed, of interventions to reduce the risk of return to conflict in a fragile state, peacekeeping is probably the most promising of the bunch. The experts we have spoken to have universally agreed that peacekeeping works. Peacekeepers may also help support the rule of law and a functioning justice system in the difficult reconstruction period.

Reducing The Likelihood Of A Return To War

Even when parties to a civil war come to the negotiating table, talks often break down, and even signed agreements are often ignored in favor of rearming and returning to war. With a third party willing to enforce a signed agreement, warring parties are much more likely to make a bargain (5% vs. 50%) and the settlement is much more likely to hold (0.4% vs. 20%) (per Walter 2002). Peacekeeping is the most common tool used to enforce such an agreement, or at least to give all parties substantial warning if one party violates the agreement.

A cross-sectional logit by Page Fortna finds that peacekeepers reduced the likelihood of further war by 75% through the mid-2000s. However, I am not sure that I believe that past peacekeeping successes necessarily imply future peacekeeping successes. Per Krasner and Eikenberry, “[peacekeepers] work best when there are a limited number of national parties, when there are no hostile neighbors, when there are no national spoilers, and when there is [still] a functioning state”. It is not clear to me that this describes that many modern civil wars. Many modern civil wars happen in places where the state is very weak to non-existent – Somalia, CAR – and there are many national parties.

Furthermore, it is often difficult to get enough peacekeepers. In a place like Liberia (population 5M), a force of 15,000 will suffice; should you wish to deploy a similar density of peacekeepers to the DRC (population 89M) or Ethiopia (population 115M), you would likely need more peacekeepers than currently exist in the world.

I took a 30% haircut on Fortna’s number. Even with this haircut, peacekeeping has decent returns of up to ~152x; BOTEC here. If we have some leverage here, such that the cost of one lobbyist returns >7x in funding for peacekeeping, this seems possibly worth keeping on the radar and/or considering as part of a larger advocacy portfolio.

Rule of Law Support

UN peacekeepers also often have a mandate to support rule of law reform. While it is not their primary mission, UN missions have “assumed responsibility for drafting laws and lobbying for their passage, revis[ed] constitutions, train[ed] judges, prosecutors, and police officers, [built] courthouses, police stations and prisons, monitor[ed] extrajudicial punishments, arbitrary arrests and indefinite detentions, assist[ed] with criminal investigations and prosecutions, improv[ed] coordination both between and among state and non-state authorities and more generally elevating the role of the state as a purveyor of security and justice” (Blair 2021, 17).
By providing an interim rule of law during the transition from war to peace, peacekeepers give governments and citizens a demonstration of how the rule of law should work. In his 2021 book, Robert Blair finds that this is remarkably effective. Spending time with a UN peacekeeping mission increased individuals’ trust in formal institutions by 30 percentage points. This seems like a valuable contribution by peacekeeping. It is unclear if this contributes to the increased likelihood of peace holding, but it does seem useful.

What Could Funding Do?

As noted above, it is relatively unlikely that additional funding for peacekeeping would increase the number of missions. If the Security Council has not yet agreed to a mission, there is likely a reason for that; that reason will not vanish with more funds. However, simply increasing the funding for existing missions seems to be useful; Collier and Hoeffler find that doubling mission funding reduces conflict by 25%.

In particular, additional funding could be used for one of two purposes: increasing the number of peacekeepers or increasing the quality of peacekeepers. The first seems to be effective in reducing conflict; the larger the mission, the more combatants are deterred from returning to conflict (Hultman, Kathman and Shannon 2014, Kreps 2010, Kathman and Wood 2016). The second is also useful, as better-trained (more expensive) peacekeepers are more effective at keeping the peace. Current peacekeepers are poorly trained and poorly paid,[9]The UN reimburses at $1400/month/peacekeeper. This is cost-effective for Egypt (2812 peacekeepers), where the average soldier costs $672/month, but means that even South Africa is effectively subsidizing the UN when it contributes peacekeepers (average South African expenditure/soldier/month = … often from armies not known for their efficacy. Given existing variation in peacekeeper troop quality, it may be possible to use additional funding to rely more on the higher quality peacekeeping troops,[10]This might also serve to reduce the number of bad PR incidents involving peacekeeping troops. Well-trained soldiers are hopefully less likely to be involved in sexual abuse of minors. and less on cheap troops from countries with histories of major human rights violations from their armed forces. Alternatively, it might be possible to spend some additional funding on better training peacekeeping troops.

Training for peacekeeping is currently two weeks long due to the “limited time available for professionals” and the need “to pay staff while enrolled in long-term training programmes.” This does not seem like enough time to train people to operate in a foreign country, under high stress conditions, especially if they are going to serve as an example of the rule of law.[11]By contrast, the US Peace Corps has 10-12 weeks of training, and explicitly does not operate in conflict zones.[12]Most peacekeepers come from countries that aren’t well known for their strong rule of law; the top contributors are Bangladesh, Nepal, India, Rwanda, Ethiopia, and Pakistan.

#### 6.2.3 Disarmament, Demobilization, and Reintegration (DDR) Programs

There are relatively few evaluations of programs to reintegrate ex-combatants.
This is unfortunate, as they are nearly universal and would seem to be a key to maintaining peace – it is not uncommon for former combatants to join other conflicts or start their own, new conflicts.[13]This can be within the same country – for instance, Ahmad Shah Massoud clearly never stopped rebelling – or in another country – for instance, Paul Kagame, who would lead the RPF in the Rwandan Civil War, began his career as a soldier in Museveni’s army in the Ugandan Bush War. In order to encourage fighters to truly put their weapons down, DDR programs often have multiple goals – both reintegrating combatants into the economy (increasing the opportunity cost of rejoining the war) and convincing fighters to support the peace process (so that they would not want to rejoin anyway).

The very limited literature (a JPAL/IPA review includes only two papers) shows limited support for the former and no support for the latter. The program Gilligan, Mvukiyehe, and Samii 2012 evaluates – in Burundi – increased employment, but the program evaluated in Humphreys and Weinstein 2007 – in Sierra Leone – did not. Neither GMS 2012 nor Humphreys and Weinstein 2007 find any impact of the DDR programs they evaluate on breaking social linkages to rebel groups or increasing support for peace. A friend running a survey on ex-FARC rebels was also quite skeptical of the efficacy of the DDR program in Colombia.

The empirical results here don’t accord terribly well with my priors; I’d expected that DDR programs would be well-defined and effective. Nonetheless, it doesn’t seem like there’s much to fund here; DDR programs are often run by either the government (as in Colombia) or by a UN mission if there is one (El Salvador, Cambodia, Mozambique, Angola, Liberia, Sierra Leone, Guatemala, Tajikistan). The lack of evaluations and reports suggests that there has been relatively little NGO input in the process.

#### 6.2.4 Community-Driven Development

Community-driven development has been a very popular form of intervention since the 1990s. In a typical CDD program, an NGO provides funding for development projects provided that the local community plans how that development money should be spent. There is usually an area of development that the funding is earmarked for (e.g. health, education), but the community can choose what to implement within that broad category.

In theory, this encourages the local community to take ownership of the project, allows them to choose what they need, and the resulting project will provide services in contexts where the government is generally absent. In a setting where the economy is growing and the government is providing a useful service, locals are less likely to choose to support rebel groups, and conflict will be reduced.

In practice, this does not appear to be the case. Across a large sample set (not necessarily in conflict areas), Rao and Mansuri 2013 finds no evidence that CDD projects improve social cohesion. JPAL has evaluated a handful of CDD projects in post-conflict societies, and also found little evidence of change in governance. There is even some evidence that CDD projects become rebel targets, and are likely to increase rather than decrease conflict (Crost, Felter, and Johnson 2014).

Due to their dubious efficacy, CDD programs are somewhat falling out of favor as development interventions, and given the evidence, I see no reason Open Philanthropy should try to change that.
When we advocate for more global aid, this should likely not be one of the kinds of aid we advocate for.

#### 6.2.5 Cognitive Behavioral Therapy

Most pre-civil war countries are not at peace; rather, they have some background level of conflict that does not rise to the level of a civil war. Perhaps a murder goes unsolved and a community is blamed (see tractability and speed); perhaps some tourists are kidnapped, perhaps two allied political parties fall out, perhaps police kill a few civilians. Some percentage of these low-level conflicts – events where a few people are killed, but not dozens – are likely to escalate to further violence. (In the above linked cases, they did, but not all tourist kidnappings or cases of extrajudicial killings lead to rebellions.) The experts we spoke to suggested that which conflicts escalate and which do not is almost a stochastic process. While it’s difficult to predict which particular violent event will spiral into wider conflict, probably one of them will eventually. If there are fewer low-level violent events, the number of opportunities for things to spiral out of control is also reduced.

Ex-combatants are often a source of this low-level violence, as they previously used violence to resolve disputes. Small studies show that giving ex-combatants cognitive behavioral therapy can make them less likely to use violence in future (Blattman, Jamison, and Sheridan 2019; Heller, Shah, Guryan, Ludwig, Mullainathan, and Pollack 2016). In one study, even therapy with non-licensed professionals with minimal training (Dinarte and Egana-delSol 2019) was successful at reducing violent activity and improved participants’ level of emotional regulation. CBT may have contributed to Liberian men choosing not to join the Ivorian civil war, but it’s equally likely job training raised their opportunity cost such that being a mercenary was less attractive.

CBT seems to be most effective when combined with other forms of intervention – that is, CBT and cash are more effective than either cash or CBT alone (though notably CBT plus cash also costs more) – and the effects fade with time, so it is possible ongoing interventions would be required. It is worth noting CBT seems less effective for the “hard cases”; Dinarte and Egana-delSol 2019 finds that the positive results are driven by people who were committing less violence to start with. This may limit the efficacy of CBT if the majority of violence is committed by a small number of individuals.[14]I don’t know if post-war violence is evenly distributed across the population, or mostly comes from a small number of individuals.

It is also not clear how this would scale. CBT tends to be quite resource (and people) intensive, and requires direct contact with every individual. (The Blattman study involved 999 men; Dinarte and Egana-delSol 1000 children.) While it is promising that such programs may not require trained professionals as facilitators, there is only one study that shows that.

I would thus put this firmly in the bucket of “promising, but needs more research to determine how implementation would work at scale”. If OP were interested in this, we would want to invest in more research. I would be particularly interested to see more research on training facilitators for therapy; there is a fairly limited supply of trained psychologists to conduct interventions, and I really only think this has legs if it’s fairly easy to train people to teach CBT.
#### 6.2.6 Cash Transfers and/or Job Training

Per Paul Collier, civil wars are not a problem of grievances, but of greed. Combatants choose to participate in rebellion because E[participating] > E[not participating]. For combatants to choose to lay down their arms, and continue to not participate in the war, E[not participating] must be greater than E[participating]. This is challenging because ex-combatants often have limited non-martial skills and little ability to support themselves in a post-civil war economy.

Cash transfers and job training make the non-martial world more appealing and increase the opportunity cost of participating in the war. In theory, this will also serve to win hearts and minds – given better opportunities, youths will be less interested in the war and more interested in their economic prospects. They will hopefully also be more supportive of the government that provided said economic prospects. The theory here is not dissimilar to that of most DDR programs.

As DDR programs have also found, encouraging young people to work is considerably easier than changing their minds about politics. Most – but not all – employment programs do seem to increase wages and employment outside of illicit activities. Programs in Liberia, Uganda, the Ivory Coast and the USA have all been relatively successful (though one in Afghanistan was not).

The record on conflict prevention and social cohesion is more mixed. There is at least one success – a job training program in Liberia kept men from joining the nearby Ivorian civil war, though there was no impact on social cohesion within Liberia. In Afghanistan, cash transfers increased support for the Taliban and job training increased support for the government; in Uganda, there was no impact on social cohesion or protest. A review article shows “few sustained positive impacts of programs on stability, even in the face of economic gains”.

As a development intervention, job training appears to show some promise. As a way to reintegrate fighters, job training seems to have little impact.

#### 6.2.7 Alternative Dispute Resolution (ADR)

Cognitive behavioral therapy is probably the best studied way to reduce the number of low-level violent events, but it is not the only way. Since the desired outcome is a behavior change rather than a mindset change, it is plausible that changing social norms would suffice to reduce the use of violence to resolve disputes.

In Liberia, Hartman, Blair and Blattman find reasonably large effects from a training program on resolving disputes peacefully. Over three years, the number of violent disputes declined 9%, at a cost of about $14,000 per village. How cost effective this is depends heavily on the assumptions you use; if a national program of ADR reduced the chance of a civil war as brutal as the Liberian Civil War by 4.5%, it would be near a 1000x return, but if it reduced the chance of a less severe civil war by 4.5%, the return would be nearer to 100x[15]The Liberian Civil War involved considerably more deaths than would have been expected for a country of Liberia’s size. (BOTEC here).

Mercy Corps finds a giant effect from an ADR program in Nigeria – a 43% (!) reduction in reported violence.  Even if the true effect is considerably smaller than that (likely), the return would still probably be >1000x.

However, all the caveats about cognitive behavioral therapy also apply here.  Alternative dispute resolution is hard to scale, relatively expensive (possibly cheaper than CBT at scale, but it’s not clear), and it’s unclear how many civil wars reducing low-level violence would really avert.  Still, given there’s at least one paper that suggests the return could be >1000x, this is where I would spend research money first.

#### 6.2.8 Contact Interventions and Mass Media

Violence is often fueled by a fear or hatred of the “other”.  This is often, but not exclusively, a racial or ethnic other; it can be a class/occupation other, such as in farmer/herder conflicts in the Sahel.

There is good evidence that the more one interacts with the othered group, the less likely one is to hold prejudiced views.  A number of programs therefore have tried to promote intergroup cooperation through promoting direct interaction (for instance, through mixed soccer teams or mixed vocational training) or by broadcasting mass media showcasing cooperation.

Direct Contact

In general, direct contact interventions reduce reported prejudice, though this is not a universal finding. This is probably one of the more reliable and replicated findings in the conflict literature – that increased contact with an outgroup leads to reduced negative feeling towards that group.

There is somewhat less good evidence that this affects one’s behavior toward the othered group.  It’s possible this is simply because the interventions are generally quite small in scale, and are swamped by the political context in which people are living.  However, it’s also not clear that a change in reported prejudice will always map to a change in discriminatory behavior.

Interestingly, there is a recent exception to the relative lack of impact on behavior.  Mercy Corps recently did an impact evaluation on a peacebuilding program in the Middle Belt in Nigeria[16]Not generally considered a civil war, but has often exceeded 1000 deaths per year. and found positive effects on economic behavior.  As the conflict worsened elsewhere, and herders were less inclined to come to the market to do business with farmers in control communities, interactions in the treated communities were stable.

This seems like the most promising intervention studied, because it shows a clear theory of change for blunting the economic impact of civil war.  Per my BOTEC, most of the cost of civil war comes from a loss of income rather than a loss of life, and the severity of the economic contraction varies widely across civil conflicts.

I do not know how to map a decrease in contact to a decline in GDP, but a return of >1000x seems within the range of possibility. This intervention ($60/person in Nigeria) prevented a 13%-19% decline in pastoralists selling products at the market. If this could shift a village from a path of a 7% economic contraction to a 2% economic contraction during the war,[17]A 5% shift is essentially made up here, but it seems plausible enough – pastoralists selling at the market isn’t the only economic activity in a village, but it’s probably a significant part of the economy. this would avert losses of approx. $1000 ($50K OP value) in income/person, for a return of 833x.

Mass Media

Rather than relying on direct contact, some interventions use mass media to show cooperation between groups. Multiple experts mentioned one NGO – Search for Common Ground – as being particularly good at this type of intervention (they currently reach 51M people).

The evidence for this is somewhat less strong than for direct contact. In Nigeria, a broadcast from religious leaders led to higher acceptance of ex-combatants, and one intervention in Rwanda made ethnicity less salient. However, a different mass media intervention in Rwanda did not change attitudes towards outgroups. A talk show in the DRC designed to reduce prejudice actually increased it.

Mass media is substantially cheaper than a direct contact intervention, though, and avoids the difficulty of scaling. I would be interested to see what results come from a Search for Common Ground mass media intervention in Nigeria (funded 2020, PIs Dube and Robinson).

#### 6.2.9 Investigative Journalism

Two of the experts we spoke to expressed skepticism of the effectiveness of journalism for journalism’s sake. However, both seemed to think that journalism can provide information for targeted sanctions. Both government and rebel leaders often become very wealthy, and most of that money is not kept in-country. International sanctions have the potential to put pressure on individuals to come to the table by freezing access to their money.

Country-wide sanctions do seem to end conflicts somewhat sooner (Lektzian and Regan 2016, Escriba-Folch 2010), but these results are not strong. Lektzian and Regan find sanctions alone are not terribly effective – military intervention is also required – and Escriba-Folch finds that total economic sanctions for the country are required, which can have major humanitarian implications. The theory for country-wide sanctions thus seems not great – this requires journalism to get sufficient traction to change policy, and then the policy may or may not actually work.

There is some anecdotal evidence that individual sanctions can be more effective than country-wide ones, though I can find no systematic work on this. One person we spoke with mentioned that further conflict was averted in Kenya in 2007 by threatening the leaders’ children’s visa statuses in the west.

I am still very uncertain how this would actually work, though. Funding journalism to reduce conflict requires quite a lot of things to happen after the production of a story. Policy-makers must read the story, decide to act, and apply (hopefully individual-level) sanctions, and if all that happens, it is possible that rebels will be incentivized to make peace.

I also think too much international attention risks drawing a major power directly into the conflict. This requires a sweet spot where the major powers care enough to apply sanctions but not enough to get involved directly.
If the civil war is too important, a major power may decide to back one side and the conflict will become much more complicated. For instance, the Syrian Civil War is probably the most documented conflict ever, the country remains sanctioned, and yet the war continues, because of a major power intervention.

#### 6.2.10 Mediation and Diplomacy

Multiple people mentioned that quality mediation can make a difference in getting a peace deal. The difficulty is figuring out what “quality” mediation is; it’s not clear to me how we would evaluate the quality of high-level mediation. The academic literature on mediation has varied conclusions – possibly it works, possibly it doesn’t.

However, mediation is definitely cheap; even a small probability of success would make mediation cost effective. For instance, the Center for Humanitarian Dialogue is one of the larger such organizations. Their annual budget is about $42M, and they are active in 23 conflict zones, for an average cost of $2M per country. If a marginal Center for Humanitarian Dialogue mediation-year has a 0.52% chance of ending a war one year sooner, it would reach an OP return of 1000x.

#### 6.3 Sources of Uncertainty

#### 6.3.1 Individual-Level Measures

My most significant source of uncertainty is how much we can rely on individual measures to tell us about a country-wide event. Much of the post-2010 literature on conflict has focused on micro-level interventions with strong causal identification. In general, the treatment is applied to individuals or communities; outcomes are measured at the individual level. Unfortunately, civil war is not an individual action, and it is not clear how changing individual actions maps to 1) meso-scale rebel group organization, or 2) macro-scale war. Attitudes exist within a context of political institutions; do we believe that changing attitudes without changing institutions will change the likelihood of civil war?

I think the answer to that depends on one’s theory of civil war escalation. Most of the experts we spoke to model civil war escalation as a stochastic process, where some events end up escalating but it is not clear ex ante which ones will do so. In this case, reducing the number of conflict events should reduce the number of events that spiral out of control (see alternative dispute resolution).

#### 6.3.2 The Changing Nature of Civil War

This shallow discusses prevention of civil wars where there is not extensive international intervention. For much of the 1990s and 2000s, this has described most civil wars (see below). However, there has been a recent uptick in proxy wars, or wars where at least one side is substantially backed by a great power. The interventions described above are unlikely to be as successful in Syria or Ukraine, or even Yemen, as they are in Nigeria or Liberia.

I do not have a good sense of how likely future civil wars are to have major international intervention. I am concerned that the universe of civil wars in which such interventions may work is shrinking. On the other hand, civil wars don’t start as internationalized conflicts; if events are de-escalated before major powers are interested, perhaps the interventions themselves could prevent civil wars from becoming internationalized.

Footnotes

↑1 The GBD estimate includes only direct DALYs, so war injuries/war deaths but not deaths caused from disrupted access to healthcare, etc.
↑2 Paul Collier finds the financial cost of the average civil war is $64B 2008 USD in The Bottom Billion, though he doesn’t specify in any detail how he reaches this figure.

↑3 This is greater than the cost of the number of civil wars that started in 2017 * the cost of a civil war, because more civil wars start than resolve, so the number of civil wars is increasing over time.

↑4 On average, ~100 people are displaced per battle death.

↑5 The true value obviously varies substantially by the size of the country and the severity of the civil war; even small countries can have particularly deadly civil wars (Liberia), while India is a large country that has had several small civil conflicts.

↑6 Estimating government military and policing spending to reduce the chance of civil war seems difficult, so I’m not doing it.

↑7 Note: This section describes how OP might support an intervention if it chose to do so, rather than describing a recommendation for what OP should do. If an intervention doesn’t work, OP would not fund it or advocate for it.

↑8 Once they are deployed; the process of getting peacekeepers deployed is notoriously long.

↑9 The UN reimburses at $1400/month/peacekeeper. This is cost-effective for Egypt (2812 peacekeepers), where the average soldier costs $672/month, but means that even South Africa is effectively subsidizing the UN when it contributes peacekeepers (average South African expenditure/soldier/month = $2700). France (618 peacekeepers), Germany (563 peacekeepers), and the UK (605 peacekeepers) are unlikely to want to contribute more personnel at that reimbursement rate.

↑10 This might also serve to reduce the number of bad PR incidents involving peacekeeping troops. Well-trained soldiers are hopefully less likely to be involved in sexual abuse of minors.

↑11 By contrast, the US Peace Corps has 10-12 weeks of training, and explicitly does not operate in conflict zones.

↑12 Most peacekeepers come from countries that aren’t well known for their strong rule of law; the top contributors are Bangladesh, Nepal, India, Rwanda, Ethiopia, and Pakistan.

↑13 This can be within the same country – for instance, Ahmad Shah Massoud clearly never stopped rebelling – or in another country – for instance, Paul Kagame, who would lead the RPF in the Rwandan Civil War, began his career as a soldier in Museveni’s army in the Ugandan Bush War.

↑14 I don’t know if post-war violence is evenly distributed across the population, or mostly comes from a small number of individuals.

↑15 The Liberian Civil War involved considerably more deaths than would have been expected for a country of Liberia’s size.

↑16 Not generally considered a civil war, but has often exceeded 1000 deaths per year.

↑17 A 5% shift is essentially made up here, but it seems plausible enough – pastoralists selling at the market isn’t the only economic activity in a village, but it’s probably a significant part of the economy.

## Social returns to productivity growth

## Intro

At Open Philanthropy we aim to do the most good possible with our grantmaking. Historically, economic growth has had huge social benefits, lifting billions out of poverty and improving health outcomes around the world. This leads some to argue that accelerating economic growth, or at least productivity growth,[1]If environmental constraints require that we reduce our use of various natural resources, productivity growth can allow us to maintain our standards of living while using fewer of these scarce inputs. should be a major philanthropic and social priority going forward.[2]For example, in Stubborn Attachments Tyler Cowen argues that the best way to improve the long-run future is to maximize the rate of sustainable economic growth. A similar view is held by many involved in Progress Studies, an intellectual movement that aims to understand and accelerate … In this report, I describe a model that helps assess this view and inform our Global Health and Wellbeing (GHW) grantmaking. Specifically, I focus on quantitatively estimating the social returns to directly funding research and development (R&D), in a relatively simple/tractable model. I focus on R&D spending because it seems like a particularly promising way to accelerate productivity growth, but I think broadly similar conclusions would apply to other innovative activities.

Having a reliable estimate of the social returns to innovation would allow us to quantitatively compare it to other potential philanthropic activities like cash transfers to the global poor or public health interventions that extend life, and then accordingly allocate our total funding in the optimal fashion. However, we’re not sure how much weight to put on this particular estimate given the simplifications and assumptions involved.

In brief, I find that:

• In a highly stylized calculation, the social returns to marginal R&D are high, but typically not as high as the returns in some other areas we’re interested in (e.g. cash transfers to those in absolute poverty). Measured in our units of impact (where “1X” is giving cash to someone earning $50k/year) I estimate the cost effectiveness of funding R&D is 45X. This is 45% of the ROI from giving cash to someone earning $500/year, and 4.5% of the GHW bar for funding. More.
• This is an estimate of the average returns to R&D, but the best R&D projects might have higher returns. In addition, leveraged opportunities to increase the amount of R&D — like advocating for a more liberal approach to high-skill immigration — could have significantly higher returns. More.
• Returns to R&D were plausibly much higher in the past. This is because R&D was much more neglected, and because of feedback loops where R&D increased the amount of R&D occurring at later times. More.
• The stylized estimate has many important limitations, and is not an all-things-considered estimate of the social returns to R&D. For example:
  • It omits potential downsides to R&D, e.g. increasing global catastrophic risks. For certain types of R&D, these downsides may significantly outweigh the benefits. This is a very significant limitation of the estimate. More.
  • It focuses on a specific scenario in which population growth stagnates and historical rates of returns to R&D continue to apply. In this scenario, productivity growth eventually stagnates. While this is arguably the most popular model of the future, it is not the only plausible one, and I discuss how placing weight on alternative scenarios would change the bottom line. More.
  • One such scenario is that R&D today brings forward the development of a future technology, like advanced AI, that accelerates R&D progress much more than past technologies. This could significantly increase the returns to R&D. But conditional on such a scenario Open Philanthropy sees a stronger case for reducing long-term risks from this future technology than accelerating its development. More.
• Overall, the model gives us a new stylized value for the ROI of abstract marginal R&D spending that we may use in Global Health and Wellbeing cause prioritization work, though we have substantial uncertainty about how much weight to put on it given all of the assumptions and limitations. The stylized value we get out does make us think that some causes aimed at accelerating overall innovation, like science policy or high-skill immigration advocacy, would likely pencil out as above our GHW bar, but it also leaves us relatively skeptical of arguments that accelerating innovation should be the primary social priority going forward. More.

## My estimate of the social returns to R&D

I draw heavily on the methodology of Jones and Summers (2020). I won’t explain their model in full, but the basic idea is to estimate two things:

1. How much would a little bit of extra R&D today increase people’s incomes into the future, holding fixed the amount of R&D conducted at later times?[3]An example of an intervention causing a temporary boost in R&D activity would be to fund some researchers for a limited period of time. Another example would be to bring forward in time a policy change that permanently increases the number of researchers.
2. How much welfare is produced by this increase in income?

For part 1, I use economic growth models that connect R&D investments with subsequent productivity growth. For part 2, I use a simple log-utility model: welfare = k + m*log(income).[4]Three comments on the log-utility model. First, the results are the same whatever the values of the constants k and m. Second, I do a sensitivity analysis of the consequences of different utility functions; if the diminishing returns to income are steeper than log, this favours cash transfers more …

This log-utility model has two implications that I will use:

• Increasing someone’s income by 10% has the same welfare effect whatever their initial income.
• Increasing one person’s income by 10% has roughly the same welfare effect as increasing ten people’s incomes by 1%, or as increasing 100 people’s incomes by 0.1%, or 1000 people’s incomes by 0.01%.[5]log(110) – log(100) ~= 10*[log(101) – log(100)] ~= 100*[log(100.1) – log(100)] ~= 1000*[log(100.01) – log(100)].

### Toy Example

Here’s a toy example to roughly demonstrate how I calculate the social returns to R&D, and how this can be compared with cash transfers to people in global poverty. Let’s estimate the welfare benefits of spending $20 billion on R&D.

• Total global R&D spend is $2 trillion per year. This produces frontier productivity growth of 1% per year.
• $20 billion would increase global R&D spend by a fraction of 1/100 for 1 year. So in that year, rather than 1% productivity growth we’d expect to have 1.01% productivity growth.[6]More realistically, there will be a lag before productivity benefits are felt. Currently I don’t model this lag because it wouldn’t affect the results by much. I use a discount of 0.2%; so a 50 year lag would reduce the returns to R&D by ~10%.
• In subsequent years, everyone’s incomes will be 0.01% higher because of the extra money spent on R&D in that one year.
• The benefit in each year is equal to (number of people alive) * (value of raising someone’s income by 0.01% for one year).
• Let’s ignore the benefits after 50 years, as a rough way to incorporate a discount rate.
• For simplicity, let’s assume the number of people alive is constant at 8 billion.

These assumptions imply that:

Social returns to $20 billion on R&D
= (number of people alive) * (value of raising someone’s income by 0.01% for one year) * (years of benefit)
= 8 billion * (value of raising someone’s income by 0.01% for one year) * 50
= 400 billion * (value of raising someone’s income by 0.01% for one year)
= 400 million * (value of raising someone’s income by 10% for one year)

This last line follows from the log-utility model: increasing 1000 people’s incomes by 0.01% has the same welfare effect as increasing one person’s income by 10%. The conclusion is that spending $20 billion on R&D has the same welfare benefit as increasing the incomes of 400 million people by 10% each. (In fact it would increase many more people’s incomes by a much smaller amount; the log-utility model allows us to express the welfare benefit in this way.)
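Here is a minimal sketch of the toy calculation in code, using the figures above; the final loop just checks the approximate log-utility equivalence from footnote 5:

```python
import math

# Toy example: welfare benefit of $20B of extra R&D, in "10%-income-boost-years".
GLOBAL_RND = 2_000e9     # $2 trillion of global R&D per year
EXTRA_RND = 20e9         # the $20 billion intervention
FRONTIER_GROWTH = 0.01   # 1% frontier productivity growth per year
POPULATION = 8e9         # constant 8 billion people (toy assumption)
YEARS_OF_BENEFIT = 50    # crude stand-in for a discount rate

# Extra R&D is 1/100 of a year's global spend, so incomes end up 0.01% higher.
income_boost = FRONTIER_GROWTH * (EXTRA_RND / GLOBAL_RND)   # 0.0001 = 0.01%

# Log utility: n people gaining x% is worth roughly the same as n*x/10 people gaining 10%.
benefit_in_10pct_boosts = POPULATION * YEARS_OF_BENEFIT * income_boost / 0.10
print(f"{benefit_in_10pct_boosts:,.0f}")   # 400,000,000 person-years of a 10% income boost

# Footnote 5's identity: roughly equal log-utility gains from 1 person at +10%,
# 10 people at +1%, 100 at +0.1%, 1000 at +0.01%.
for n, pct in [(1, 0.10), (10, 0.01), (100, 0.001), (1000, 0.0001)]:
    print(n, round(n * math.log(1 + pct), 4))
```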

An alternative altruistic intervention is simply to transfer cash directly to the global poor.[7]GiveDirectly implements this intervention. Note, I use simplified numbers in this post that don’t exactly match GiveDirectly’s cost effectiveness, and I believe GiveDirectly is somewhat more impactful than the numbers I use imply. For someone living below the international poverty line on $500/year, $50 raises their income by 10% for one year. With $20 billion, you could do this 20 billion / 50 = 400 million times. The total benefit would equal 400 million * (value of raising someone’s income by 10% for one year).

In this toy example, the social returns to R&D exactly equal those from cash transfers. As we’ll see below, a more realistic calculation seems to favour cash transfers over R&D. The main factor favouring R&D is its potential to help so many people as technological innovations spread across the world. The main factor favouring cash transfers is that even a small amount of money can significantly improve the lives of people in poverty.

### A more realistic calculation

The actual calculation of the social returns to R&D differs from this toy example in a few important ways. This table summarises these differences, and their effects on the social returns to R&D. Some of the differences interact in complex ways, so there’s no simple way to describe their quantitative effect.

| Difference from the toy example | Effect on the social returns to R&D compared to the toy example |
| --- | --- |
| I count benefits over a longer period of time | Increase |
| I recognise that ideas are getting harder to find | Decrease |
| I use UN population projections | Small increase |
| I think that some people might not benefit from frontier technological progress | Decrease, 0.7X |
| I only give R&D partial credit for productivity growth | Decrease, 0.4X |
| I assume diminishing returns to adding more researchers within any given year | Small decrease |
| I incorporate capital deepening: higher productivity → more machines → higher incomes | Increase, 1.5X |
| Total, combining all the above differences | Decrease, 0.45X |

Let’s discuss each difference in more detail.

I count benefits over a longer period of time. We should value improving someone’s life equally whether they live now or in 500 years’ time.[8]Note, rising incomes mean we don’t value adding an equal dollar amount to people’s incomes the same amount through time. We value a dollar more today because people today are poorer than they will be in the future. I use a small pure time discount of 0.2%, representing the possibility that a major disruption (e.g. extinction) prevents the welfare benefits of R&D from occurring at all. This small discount pushes towards placing a higher value on R&D, compared to the toy example.

I recognise that ideas are getting harder to find. Let’s say that R&D makes progress by coming up with new ideas, and define an ‘idea’ so that each new idea raises productivity by 1%. It turns out that there’s good evidence that it takes more research effort to find new ideas than it used to.[9] During the 20th century, the number of researchers grew exponentially, but productivity growth did not increase (in fact it decreased slightly). If R&D is responsible for the productivity growth, then more research effort is required to achieve each subsequent 1% gain in productivity. One plausible explanation is that the most obvious ideas are discovered first, so that over time increasingly difficult ones remain.
Another explanation is that researchers must spend increasingly long studying before they're able to contribute to their fields. Importantly, ideas are getting harder to find despite researchers having better tools (e.g. the internet) to aid their research today than in the past. Even with these improved tools, it still takes more research effort to find new ideas than it used to.

How does this dynamic affect social returns to R&D? In the toy example, the extra R&D caused everyone's incomes to be 0.01% higher forever. It turns out that once you incorporate 'ideas getting harder to find', this is no longer true. The % income benefit shrinks over time and approaches 0%.[10] Note: this does not mean that the absolute $ increase in incomes shrinks over time. It may decline, stay constant or increase, depending on the rate at which ideas are getting harder to find. Technically, if the "fishing out" parameter $$\phi$$ > 0, then the absolute $ benefit increases …

This is shown in the diagram below, which compares the total factor productivity (TFP) in a world without any intervention (orange) and a world where an intervention temporarily boosts R&D activity (blue). Note: the y-axis is log, so the gap between the lines represents the % difference in TFP, not the absolute difference. The initial % increase declines towards 0% over time. This dynamic decreases the social returns to R&D, compared to the toy example.

Why does the % productivity increase decline in this way? Essentially, the initial extra R&D "steals" easier to find ideas from future years, making future research less productive (see footnote for more detail).[11] The key point is as follows: when ideas are getting harder to find, the number of new ideas found with a marginal researcher-year is roughly proportional to 1 / (total researcher-years so far). So if the 100th researcher-year finds 1/100 new ideas, the 200th researcher-year will find only 1/200 new …

How quickly are ideas getting harder to find? I use an estimate from Bloom et al. (2020), which looks at how research efforts translated into TFP growth in the US from 1930 to 2015. The implication is that each time TFP doubles, it becomes ~5X as hard to find a new idea.[12] Mathematically, in the semi-endogenous growth model the effort needed to find a new idea is proportional to TFP^($$\phi$$ – 1), where $$\phi$$ is the parameter controlling how quickly ideas are getting harder to find. I use $$\phi$$ = -1.4, so every time TFP doubles the effort needed to … As a result of this dynamic, most of the benefits of today's R&D occur in the first 100 years despite the small discount rate.

I use UN population projections. When estimating the number of future beneficiaries of today's R&D, I used the UN population projections, which forecast that the global population will rise to around 11 billion by 2100 and then remain at that level.[13] In the long run, there are reasons to think population will fall (fertility rates in developed countries), reasons to think it might increase (relating to biological and cultural evolution), and no compelling reason to think it will stay exactly the same. Still, this feels like a fair … These projections also inform my estimate of the amount of R&D that will be done in each year in the future. This contrasts with formal economic models in which the population is typically assumed to be constant or increasing exponentially.
The toy example assumed that the population would remain at 8 billion. Compared to this, using UN population projections increases the number of beneficiaries and so increases the returns to R&D.

I think that some people may never benefit from frontier technological progress. The toy example assumed that everyone benefits from frontier technological progress. After all, people on all continents use technologies like smart phones, cars and solar panels. However, I'm very uncertain about whether people in low-income countries will ever feel the full benefits of frontier TFP growth. For example, some agricultural R&D done in the US won't ever be applicable in countries with different climates. Currently, I assume (arbitrarily) that if frontier TFP increases by 10% then TFP around the world will eventually increase by 7%.[14] The lag until productivity benefits are felt will probably be larger in low income countries than in high income countries. As mentioned above, I don't model this lag because it wouldn't affect the results by much. I use a discount of 0.2%, so a 50 year lag would reduce the returns to R&D … I'm particularly uncertain about this part of the calculation and would be interested in suggestions for how to think about it. This adjustment multiplies the social returns to R&D by a factor of 0.7.

I only give R&D partial credit for productivity growth. Activities with a potential claim to credit include misallocation reduction, business innovation (e.g. startups), learning by doing and capital accumulation. Ultimately, I credit R&D with 40% of productivity growth and explain my reasoning here. This straightforwardly multiplies the bottom line by 0.4. In an appendix, I briefly sense-check this assumption against studies using statistical techniques to tease out the causal impact of R&D on growth. Naively, this suggests I should give R&D more credit for growth, but there are a number of complications involved in the comparison.

I assume diminishing returns to adding more researchers within any given year. The toy example assumed that if you increase R&D funding by 1%, you'll make 1% more R&D progress. That logic implies that doubling R&D funding would double the rate of R&D progress. However, there may in fact be diminishing returns to spending, e.g. because some research effort is duplicated. In line with this, the model assumes that the marginal $ spent on R&D causes only 75% as much R&D progress as the average $ spent.[15] In economic growth models, this corresponds to the "stepping on toes" parameter λ = 0.75. I'm not aware of data that pins down λ, and it seems like values between 0.4 and 1 could be correct. I use the estimate from Bloom et al. (2020) Appendix Table A1, where they set λ = 0.75 and then …

I incorporate capital deepening: higher productivity → more machines → higher incomes. Suppose you invent a drug (e.g. caffeine) that makes everyone slightly better at their jobs. We can distinguish between two effects. A primary effect is that everything people buy is higher quality – e.g. better haircuts, tastier food, faster transport – because the people producing these goods and services are better at their jobs. A secondary effect is that when people and companies invest in buying tools to help them do their jobs (e.g. computers), they'll get more tools. After all, the people producing these tools are better at their jobs. Having more tools makes people better at their jobs.
In the economics literature, this secondary effect is called capital deepening.[16] The primary effect is recorded as a TFP increase because GDP went up holding constant the amount of labour and physical machinery. The secondary effect is recorded as capital deepening because each person has more physical capital (i.e. more or better machinery). Both effects ultimately increase the quality or quantity of goods and services produced, and so raise incomes. My toy example included the primary effect, but not the secondary effect. Including both effects increases the benefit by a factor of 1.5.[17] Growth theory relates the size of these effects on income: (income increase from TFP and capital deepening) = (income increase from TFP alone) / (1 – capital share of GDP). The capital share is about 35%, so this multiplies the bottom line by 1 / (1 – 0.35) = 1.5.

### Bottom line – social returns to R&D

Once we incorporate all these changes to the toy example, what are the social returns to R&D? The toy example found that a marginal $20 billion to R&D has the same welfare benefit as increasing the incomes of 400 million people by 10% for one year. (In fact it would increase many more people's incomes by a much smaller amount; the log-utility model allows us to express the welfare benefit in this way.)

With these changes, $20 billion to R&D has the same welfare benefit as increasing the incomes of 180 million people by 10% each for one year. (Again, it would actually increase many more people’s incomes by a much smaller amount; the log-utility assumption allows us to express the welfare benefit in this way.) This is 45% of the benefit calculated in the toy example. We can break this decrease down into a 0.4 penalty from only giving R&D partial credit for productivity growth, a 0.7 penalty from uncertainty about whether frontier TFP progress really spills over to the whole world, and a 1.5 gain from capital deepening; combining these yields a 0.42 penalty. Then the other changes mostly cancel each other out. (If you want to know more, there’s a full description of the model in this appendix.) To make this result more relatable, let’s consider smaller expenditures. Dividing both the costs and benefits by 180 million, ~$110 on R&D has the same welfare benefit as increasing one person’s income by 10% for one year.
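As a quick sanity check on the arithmetic in this section, here are a few lines of Python (all numbers are the ones quoted above):

```python
# The three headline adjustments discussed above.
credit_to_rd = 0.4        # R&D gets 40% of the credit for productivity growth
global_spillover = 0.7    # not everyone benefits fully from frontier TFP growth
capital_deepening = 1.5   # higher TFP -> more machines -> higher incomes

print(f"Combined: {credit_to_rd * global_spillover * capital_deepening:.2f}")  # 0.42

# Applying the overall 0.45 factor (0.42 plus the other changes, which roughly cancel):
beneficiaries = 400e6 * 0.45                 # ~180 million person-years of a +10% income boost
print(f"Cost per 10% income-year: ${20e9 / beneficiaries:.0f}")   # ~$111, i.e. ~$110
```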

These returns are high. However, they're not as high as cash transfers to people in global poverty. $110 to someone living on $500/year increases their income by 22% for one year.

| Intervention | Welfare impact of $20 billion | Welfare impact of $110 | Cost effectiveness in Open Philanthropy's units of impact |
| --- | --- | --- | --- |
| R&D (final calc) | Increase the incomes of 180 million people by 10% for 1 year. | Increase the income of one person by 10% for 1 year. | 45X |
| Cash transfers to people on $500/year[18] | Increase the incomes of 400 million people by 10% for 1 year. | Increase the income of one person by 22% for 1 year. | 100X |
| Cash transfers to people on $50,000/year | Increase the incomes of 4 million people by 10% for 1 year. | Increase the income of one person by 0.22% for 1 year. | 1X |

(In fact, R&D would increase many more people's incomes by a much smaller amount and for a much longer time; the log-utility model allows us to state the benefit in this way.)

[18] As mentioned in a previous footnote, I think GiveDirectly is somewhat more impactful than the numbers in this row.

The GHW team at Open Philanthropy aims to only make grants whose expected impact is above a certain bar. Our current tentative bar is 1000X, as measured in units where "1X" is giving $1 to someone earning $50k/year. In these units, R&D comes out at 45X, or 4.5% of the bar.
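To see roughly where these 'X' figures come from, here is a sketch using the marginal log-utility approximation (an extra dollar is worth about 1/income in welfare terms); the 0.10 figure is the table's "one person's income +10% for a year" for $110 of R&D:

```python
def cost_effectiveness(welfare_per_dollar, baseline=1 / 50_000):
    """Express welfare per dollar in Open Philanthropy-style 'X' units,
    where 1X is $1 of extra consumption for someone on $50,000/year."""
    return welfare_per_dollar / baseline

# Marginal log-utility approximation: $1 to someone on income c is worth ~1/c utils.
cash_to_poor = cost_effectiveness(1 / 500)        # $1 to someone on $500/year
cash_to_rich = cost_effectiveness(1 / 50_000)     # $1 to someone on $50,000/year

# R&D (final calc): $110 buys the equivalent of one person's income +10% for a year.
rd = cost_effectiveness(0.10 / 110)

print(f"Cash transfers to people on $500/yr: {cash_to_poor:.0f}X")   # ~100X
print(f"Cash transfers to people on $50k/yr: {cash_to_rich:.0f}X")   # 1X
print(f"R&D (final calc):                    {rd:.0f}X")             # ~45X
```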

There’s an important sense in which this comparison is biased in R&D’s favour. The calculated benefits from R&D include those that occur many decades into the future, while the income increase from a cash transfer is immediate. If we included long-run benefits from cash transfers, they would beat R&D by a wider margin.

### The best pro-growth interventions are better than average R&D

One important caveat is that there may be leveraged ways to boost the amount of R&D. For example, lobbying for more high-skilled visas could increase the effective global number of skilled R&D workers, accelerating R&D progress more than paying for R&D directly.

A second caveat is that we've estimated the average impact of marginal R&D funding. Of course, the actual impact of any particular grant could be much larger or much smaller than this, depending on the project being funded. If a funder can consistently identify particularly promising projects, their impact could be larger than my estimate. One way to do this might be to focus on R&D projects that are specifically designed to help the global poor. Just as $1 goes further when transferred to the global poor, so too R&D might be more effective when targeted in this way.

Some of those involved with Progress Studies think accelerating innovation should be the world's top priority. I discuss ways in which my outlook differs from theirs in this appendix.

## Funding R&D was even better in the past

Today the returns to R&D are high. I think that they were even higher in the past, for a couple of reasons.

R&D was more neglected. We've seen that ideas are getting harder to find over time as the easiest ones are discovered. In the past, much less R&D had been done in total and so ideas were significantly easier to find. Appendix F estimates that the fraction of the economy dedicated to research was 36 – 96X smaller in 1800 than today.

R&D increased the amount of R&D occurring at later times. I think that, historically, there were two mechanisms by which R&D caused more R&D to occur at later times.

• Increasing the fraction of resources used for R&D. The first mechanism is providing evidence that R&D was a worthwhile activity. The fraction of people doing R&D has increased significantly over time.[19] For example, data from Bloom et al. (2020) find the number of US researchers increasing by 4.3% per year on average since 1930. US population grew less than 1.5% per year on average in the same period, implying that the fraction of people doing research was growing. Earlier instances of R&D are probably an important reason, as their success fuelled the expansion of R&D efforts. The mechanism is: R&D → evidence of success → more R&D. Today this mechanism is probably less important, as there is already ample evidence of the fruitfulness of R&D (though funding specific neglected areas of R&D can have a similar effect today if you are better at predicting fruitful areas than the average funder).
• Increasing the future population. The second mechanism is that, going back hundreds and thousands of years, productivity improvements allowed a fixed supply of land to support larger populations, which meant more people to engage in innovative activity (though the fraction of people doing so was very low). The mechanism is: R&D → fixed supply of land can support larger populations → more people are alive to do R&D in the future. This dynamic is important in some prominent models of long-run growth.[20] See for example Lee (1988), Kremer (1993), Jones (2001) and Galor and Weil (2000). Today this mechanism is not important: population growth is determined by people's fertility decisions rather than by how many people society is able to feed.

A back-of-the-envelope calculation suggests that the combined effects of these mechanisms could be very large, with the social impact of R&D hundreds of times greater in the past than today. So it may be that historically R&D was the most promising philanthropic intervention, even if it isn't quite as promising today.

## Limitations of the model

### The model ignores potential harms from R&D

Certain types of R&D might have large downside risks.
For example, gain-of-function research can make pathogens more deadly and transmissible, potentially increasing global catastrophic risk from a pandemic. This consideration might make funding certain types of R&D very harmful, reversing the conclusion of the model. I think it would be a mistake to act on the basis of this post without explicitly considering these downside risks. Evaluating which types of R&D pose the largest risks is beyond the scope of this post. This is a significant limitation, and highlights that this post gives a stylized estimate of the returns to R&D but does not give an all-things-considered assessment.

### The model implies growth will stagnate

My mainline scenario, used for the stylized calculation, implies that productivity growth will tend to 0% per year in the very long run. Why does this happen? It's the combination of two assumptions:

• Ideas are getting harder to find. Each 1% increase in productivity requires more research effort over time, even accounting for the fact that researchers can use new technologies to aid their research effort.
• Population stagnation. Population will rise to 11 billion and then remain roughly constant.

Together these two assumptions imply the pace of productivity growth will slow. Population stagnation implies that the number of researchers will eventually stagnate.[21] Though the fraction of people doing research can increase, this can only go on for so long. I discuss this possibility below. Ideas getting harder to find then implies that it will take increasingly long to find each new idea. This implication is explored in depth in Jones (2020).

The following graph shows my model's prediction of productivity stagnation alongside a scenario where TFP grows exponentially at its recent historical rate forever.[22] There are good theoretical reasons to think TFP can't grow exponentially at its recent rate for more than 10,000 years, but these don't rule out exponential growth continuing for another 1000 years.

Productivity stagnation is a surprising implication of the model; people might wonder whether the model is overly pessimistic about productivity growth. To address this, I explored four alternate scenarios in which productivity doesn't stagnate:

1. Maybe we'll avoid productivity stagnation by increasing the fraction of people doing R&D (even more than in my model)?
2. Maybe ideas will not get harder to find in the future?
3. Maybe the world's population won't stagnate?
4. Maybe some trend-breaking future technology will allow us to avoid growth stagnation despite ideas getting harder to find? More.

I did very rough back-of-the-envelope calculations of how the social returns to R&D change in each scenario. Scenario 1 reduces the returns to R&D, multiplying the bottom line by ~0.4X. Scenarios 2 – 4 can multiply the bottom line by up to ~7X, or much more for some versions of scenario 4.[23] If R&D today expedites a future technology that massively accelerates future growth, the bottom line can increase by much more than 100X. More.

If I did a weighted average across these scenarios, it would probably increase my bottom line compared to simply using the mainline scenario. (The size of the increase would be very sensitive to the specific weights used, especially for the versions of scenario 4 with massively outsized returns.) For now, the stylized estimate at the top of this post doesn't put any weight on these scenarios.
One reason for this is that the stylized estimate excludes scenarios where R&D causes large harm, so it feels fair to similarly exclude trend-breaking scenarios where R&D has large upsides. Another reason is that I want the stylized estimate to be comparable with Open Philanthropy's impact estimates for other Global Health and Wellbeing cause areas, and we don't consistently place weight on unlikely scenarios with large upsides.

In addition, I have specific reasons for excluding each of the four scenarios:

1. My mainline scenario already involves significant increases in the fraction of people doing R&D. More.
2. This scenario seems very implausible. More.
3. The high returns in this scenario are driven by tiny benefits enjoyed by a massively expanded population 100s of years into the future. More.
4. Conditional on the development of such a growth-boosting technology in the next century, Open Philanthropy currently prioritizes work reducing risks from this technology over work accelerating its arrival. More.

That said, I'm very uncertain about how much weight the GHW team should place on these scenarios and I think there is room for reasonable disagreement. It's worth noting that even a 7X increase would leave unlevered R&D funding ~3X less effective than Open Philanthropy's bar for funding within GHW (though it would likely imply that various levered advocacy or research spending opportunities should make up much more of our portfolio).[24] My mainline scenario found R&D to be 45% as impactful as giving cash to someone on $500/year. This implies R&D is 4.5% as impactful as our current bar for GHW grantmaking. A 7X increase would leave R&D 31.5% as impactful as the GHW bar.

I find scenario 4 the most plausible, and discuss it further in the next section. I discuss scenarios 1 – 3 at greater length in this appendix.

#### Maybe some trend-breaking future technology will allow us to avoid growth stagnation despite ideas getting harder to find?

Increasing research effort has been required to find new ideas, even though previous discoveries have made researchers more productive (e.g. calculators, coding tools, caffeine, the internet). Extrapolating this trend, we’d predict that future technological progress will make researchers somewhat more productive but that this won’t be enough to avoid productivity stagnation.

Perhaps, though, this trend won't continue. Perhaps future technologies will enhance our research abilities more than those from the last 80 years. For example, if we develop advanced AI systems that can do independent research, we might massively increase our research efforts.[25] Aghion et al. (2017) discuss the possibility that AI will accelerate productivity growth by automating research tasks. Another possibility is advanced bio-technology that radically enhances the productivity of human researchers. Call such technologies growth-enhancing technologies.[26] Note, a growth-enhancing technology might allow a constant population of human researchers to maintain ~2% productivity growth, or it might allow them to accelerate productivity growth. Open Philanthropy thinks the latter possibility is more likely than many actors seem to think, for reasons …

Past technologies have enhanced research productivity somewhat, but we’ve still had to increase the number of human researchers to maintain constant productivity growth. Growth-enhancing technologies would (by definition) allow us to maintain constant productivity growth without increasing the number of human researchers.[27]Of course, growth-enhancing technologies might enable other trends to continue. E.g. the trend of ~2% annual growth in US GDP/capita over the past 150 years, or the trend of growth accelerating over the past 10,000 years.

If a growth-enhancing technology is developed sometime in the future, how would this alter the value of R&D today?

It turns out that this depends on whether R&D today affects the time at which the growth-enhancing technology is developed.

If R&D today doesn't affect when a future growth-enhancing technology is developed, then its development reduces the value of R&D today. This is somewhat counter-intuitive: although more total R&D will happen, the marginal value of R&D today is lower. The reason is related to ideas getting harder to find. More R&D in later periods pushes us further up the diminishing returns curve for finding new ideas, so the additional R&D we funded makes less difference.[28] Let's demonstrate this point with an example. Suppose an intervention causes an extra researcher-year to happen in 2021. Let's consider its impact on TFP in 2100 if a growth-enhancing technology isn't developed, and if it is developed. Suppose that if a growth-enhancing technology isn't …
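A toy illustration of this point, using the approximation from an earlier footnote that the ideas found by a marginal researcher-year are roughly proportional to 1 / (total researcher-years so far), so that total ideas grow like the log of cumulative research (the specific numbers below are purely illustrative):

```python
import math

def total_ideas(cumulative_research):
    """Toy 'ideas getting harder to find' model: the marginal researcher-year finds
    ~1 / (researcher-years so far) ideas, so total ideas grow like log(cumulative research)."""
    return math.log(cumulative_research)

def impact_of_extra_early_year(research_by_2100):
    """Counterfactual ideas added by 2100 from one extra researcher-year funded today."""
    return total_ideas(research_by_2100 + 1) - total_ideas(research_by_2100)

# If a growth-enhancing technology means far more R&D happens between now and 2100,
# the same extra researcher-year today makes less difference to the 2100 technology level.
print(impact_of_extra_early_year(1_000))     # ~0.0010
print(impact_of_extra_early_year(100_000))   # ~0.00001
```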

If R&D today accelerates the development of a future growth-enhancing technology, its impact on future incomes could be much larger than my estimate. It would bring forward in time an income boost, raising people’s incomes for a long time into the future.[29]Appendix D does a very rough BOTEC on the returns to R&D for one possible growth-enhancing technology.

If this is the scenario that proponents of a "growth-first" worldview or the Progress Studies community have in mind, I think it would be worth their being more explicit about it; my impression is that they more typically argue from past trends rather than from speculation about future trend-breaking technologies.

Open Philanthropy institutionally thinks the probability of these sorts of trend-breaking future technologies is notably higher than many other actors in society seem to think, and the Longtermist team at OP focuses explicitly on optimizing the expected impact of its spending in the worlds where such technologies are likely to arrive in the next century or so. However, conditional on placing high probability on the development of such technologies in the next century, the Longtermist side sees work on accelerating growth as lower impact than work reducing risks.[30] E.g. see this draft report by Joe Carlsmith on risk from power-seeking AI, or these two posts from the Cold Takes blog.

Absent more external consensus, we're reluctant to have the prioritisation of our Global Health and Wellbeing team be driven by the possibility of trend-breaking future technologies. This is for reasons related to our views on worldview diversification.[31] Even if R&D isn't competitive according to either worldview, might it look competitive according to a weighted sum of both? I think not. I estimate funding generic R&D to be ~10X worse than the GHW bar, and it looks significantly worse from a LTist perspective than alternative … As such, I am currently not putting significant weight on this scenario in evaluating the social returns to R&D. We'd be interested to know if advocates for the primacy of growth or Progress Studies think that such trend-breaking future technologies are likely and/or crucial to their case for prioritizing growth going forward — that hasn't been our impression from what we've read — and if so how they think about prioritizing growth relative to reducing longterm risks.[32] Appendix H discusses some potential differences between my perspective and that of Progress Studies advocates.

### Other limitations of the model

Appendix K discusses two more debatable assumptions made by the model:

• It assumes that increasing the amount of R&D in 2021 doesn’t affect the amount of R&D effort in future years.
• It assumes that welfare increases with log(income).

In both cases, the assumptions of the model could be too aggressive or too conservative about the returns to R&D.

Appendix J lists additional ways in which the stylized calculation is arguably pessimistic or optimistic about the social returns to R&D.

### How significant are these limitations?

I think these limitations are very significant. Including the downsides to R&D could make the returns to R&D substantially worse or even negative, while putting weight on scenarios with extreme upside could make the returns much better.

## Conclusion

The model discussed here gives Open Philanthropy a stylized value for marginal R&D spending that we may use in our GHW cause prioritization work. It suggests that the social returns to direct R&D spending are high, but not as high as some opportunities relating to poverty alleviation. Still, leveraged ways to boost R&D activity could be highly impactful, and Open Philanthropy may enter causes like high-skill immigration or science policy in part because of this modeling.

In addition, the research leaves me relatively skeptical of arguments that accelerating innovation is the primary social priority going forward. I estimated marginal R&D spending to be 4.5% of the GHW bar. Even if we ignore potential harms from R&D and consider an alternative scenario where R&D is 7X more valuable, directly funding R&D still doesn’t meet the GHW bar. The only scenario I considered where R&D returns are higher still – R&D today accelerates the development of a growth-enhancing technology – is one where Open Philanthropy currently, and in my view correctly, prioritizes reducing risks over accelerating timelines.

The stylized estimate in this post has huge limitations. Perhaps most important is that it excludes potential harms from R&D, and accounting for this factor could reverse the conclusion of the model. Based on this, and the model’s many other limitations, I see this post as opening a conversation about the returns to R&D rather than closing it.

## Description of the appendices

The appendices, contained in a public google doc, dig into various aspects of this post in more detail. I recommend only reading appendices that are of particular interest.

• Appendix A explains the model used in the stylized estimate of the returns to R&D, including all the assumptions needed to recover the result.
• Appendix B and appendix C do a deep dive into the model’s implications for how marginal R&D affects incomes in the short run and the long run.
• Appendix D contains very rough estimates of the social returns to R&D in the alternative scenarios mentioned in the main post, allowing for a comparison with this post’s mainline scenario. Appendix E discusses some of these alternative scenarios qualitatively. The main takeaways from these two appendices were discussed above.
• Appendix F contains a very rough estimate of the returns to R&D in 1800. The key takeaway is that incorporating the qualitative differences discussed above implies that R&D in 1800 could have been more than 100X more impactful than R&D today.
• Appendix G sanity checks some of the assumptions of the model used in this post against economics papers that use statistical techniques to try and tease out the causal effect of R&D on growth.
• Appendix H discusses how my current view differs from the views of those involved in the Progress Studies movement.
• Appendix J lists ways in which the stylized calculation is arguably too optimistic or too pessimistic about R&D spending.
• Appendix K briefly explores alternatives to two assumptions of the stylized calculation. The assumptions are:
• The model assumes that increasing the amount of R&D in 2021 doesn’t affect the amount of R&D effort in future years; but it might increase or decrease it.
• The model assumes that welfare increases with log(income).

Note that the calculations in the appendices are somewhat less vetted than those in the main text. The main text calculations have been reproduced using multiple methods, and checked by other researchers at Open Philanthropy.

## Appendix A: Assumptions of the model

The calculation proceeds in two stages. First, I run a simple simulation to estimate the social impact of R&D. Second, I make a series of adjustments to make the bottom line more realistic.

### Simulation assumptions

#### What model of R&D am I using?

I use the semi-endogenous growth model of Jones (1995).

TFP growth is given by:

$$g_A = constant * L^{\lambda}A^{\phi - 1}$$

$$L$$ is the number of researchers, $$A$$ is the level of TFP, and $$g_A$$ is the growth rate of TFP.

$${\phi}$$ and $${\lambda}$$ are constants that control the diminishing marginal returns (DMR) to R&D. The lower these constants, the steeper the DMR.
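For readers who prefer code, here is a minimal Python rendering of the equation above (the function name and the placeholder constant are mine, not from the linked simulation code):

```python
def tfp_growth_rate(researchers, tfp, lam=0.75, phi=-1.4, constant=1.0):
    """Semi-endogenous TFP growth: g_A = constant * L^lambda * A^(phi - 1).
    lam ("stepping on toes") and phi ("fishing out") set how steep the
    diminishing marginal returns are; the constant is a placeholder here."""
    return constant * researchers**lam * tfp**(phi - 1)
```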

•  $$\lambda$$ controls the DMR to more researchers at a given point in time (“stepping on toes”)
• If  $$\lambda$$=1, doubling researchers in a year doubles the TFP growth in that year
• If $$\lambda$$=0, doubling researchers in a year doesn’t change TFP growth that year.
• If $$\lambda$$=0.4, doubling researchers in a year increases TFP growth by 1.3X.
• I think 0.4 < $$\lambda$$ < 1 is reasonable.
• Unfortunately, this is based on little but intuition. I think <1 because I think there are a few plausible mechanisms for giving at least a small stepping on toes effect. I tentatively think >0.4 because I’d be surprised if doubling the number of researchers speeds up R&D progress by <1.3X. I’m very interested to hear about further evidence on this point.
• $$\phi$$ controls whether progress today makes future progress harder
• The smaller $$\phi$$, the more progress today makes future progress harder (“fishing out”).

What can the empirical data tell us about $$\phi$$  and $$\lambda$$?

The basic empirical trend is that (# researchers) has grown exponentially over the past 80 years, but TFP growth has stayed roughly constant.

Let's assume that the R&D effort has been driving the TFP growth (or some constant fraction of it). There are then two explanations you could give for why TFP growth has stayed constant:

1. Adding more researchers in a given year doesn’t actually increase the progress made very much. There’s a big “stepping on toes” effect. Small $$\lambda$$.
2. Each 1% increment in TFP is harder to achieve than the last. So more progress is required to achieve it. There’s a big “fishing out” effect. Small $$\phi$$.

We can have (1) and (2) in various different combinations, as long as their combined effect is strong enough to explain the basic empirical fact. If you increase $$\lambda$$, you’ll need to decrease $$\phi$$ to compensate and make your predictions consistent with the historical data.

So the historical data don’t pin down both $$\phi$$ and $$\lambda$$ separately. Instead, I make an assumption about $$\lambda$$ and the data pins down $$\phi$$. Two options mentioned in Bloom et al. (2020) are:

• $$\lambda$$=1, $$\phi$$=-2.1
• $$\lambda=0.75$$, $$\phi$$=-1.4
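Both options are calibrated to the same historical data, so they make very similar predictions about observed growth. A quick way to see this is the steady-state relationship $$g = \lambda n / (1 - \phi)$$ (quoted in a footnote to Appendix D), where n is the growth rate of researcher numbers; here is a small sketch using the ~4.3% average growth in US researchers since 1930 mentioned earlier:

```python
def steady_state_tfp_growth(n, lam, phi):
    """Steady-state TFP growth when researcher numbers grow at rate n:
    g = lambda * n / (1 - phi) in the semi-endogenous model."""
    return lam * n / (1 - phi)

n = 0.043  # ~4.3%/year average growth in US researchers since 1930 (Bloom et al. 2020)
print(f"{steady_state_tfp_growth(n, lam=1.0,  phi=-2.1):.4f}")  # ~0.0139, i.e. ~1.4%/year
print(f"{steady_state_tfp_growth(n, lam=0.75, phi=-1.4):.4f}")  # ~0.0134, i.e. ~1.3%/year
```

Both parameter choices imply nearly the same steady-state TFP growth for a given rate of researcher growth, which is why the historical data cannot separate $$\lambda$$ from $$\phi$$.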

I use the second: serial bottlenecks do place diminishing returns on parallel research, and marginal researchers may be less talented (if our intervention increases researcher concentration).

In appendix B I explain the consequences of this model for the social returns to R&D. The rest of this appendix continues to list the assumptions used.

#### How do I apply this model in the simulation?

The simulation compares utility in two worlds. In world 1 we slightly boost the amount of global R&D activity for 1 year.[33] The simulation assumes that the increase in R&D activity is proportional to the increase in funding. This may be optimistic: in reality you need both funding and researchers to do R&D. Essentially, the simulation assumes that more funding will bring with it more researchers, which may be … In world 2 we do nothing. After this first year the R&D activity is the same in both worlds.

The simulation applies the R&D equation above in each year, making assumptions about the number of researchers L in each year and calculating the resultant TFP trajectory. It assumes R&D instantaneously boosts TFP around the whole world. It assumes incomes are proportional to TFP,[34] So it ignores the additional effect that capital deepening has on TFP increases in standard growth models. and calculates income trajectories for worlds 1 and 2.[35] It assumes everyone is on the world average income. Representing income inequality wouldn't change the results. This is because we ultimately care about the percentage effect of R&D on income, and this is the same no matter what people's starting incomes are. We care about the percentage … It then converts these to utility trajectories, assuming utility = k + m*ln(income).[36] The specific values used for k and m do not affect the result as they cancel. In practice we use k=0 and m=1. It makes assumptions about the world population to calculate the total difference in utility between worlds 1 and 2 in each year.

To quantify the result, the simulation compares the utility from funding R&D to the utility that could have been achieved by increasing the consumption of someone on the average global income.[37]It quantifies the result in this way because this is a metric Open Philanthropy uses internally to compare the impacts from different kinds of intervention.

In particular, the simulation calculates the following quantity:

$$Impact \, multiplier = (utility \, from \, TFP \, gains \, of \, \$1 \, to \, R \&D) \, / \, (utility \, from \, \$1 \, to \, average \, world \, consumption)$$

#### Other simulation assumptions

The simulation makes assumptions about the world population in each year and about the fraction of the world population who are researchers in each year.

• Population
• The current population is 7.9 billion.
• Population increases at 0.4% per year until 2100, when it reaches 11 billion.
• Thereafter the population remains constant at 11 billion.
• Fraction of population who are researchers
• This eventually increases by a factor of 13 compared to today.
• The fraction of the population doing research is proportional to ‘research intensity’, the fraction of GWP spent on R&D.
• Current research intensity = 2.3%.
• Research intensity increases linearly at an absolute rate of ~0.05% per year until it reaches 30%.[38]More precisely, the absolute size of the annual increment is 2% of current research intensity: 0.02 * 2.3% = 0.046%. So this assumption corresponds to thinking that research intensity has been growing exponentially at about 2% per year, but this exponential rate of increase will decline over time.

• 2050: 3.7%
• 2100: 6%
• 2200: 10.6%
• 2300: 15.2%
• 2500: 24.4%
• The research intensity increase can be understood as including the effects of catch-up growth. When countries develop they typically increase their R&D output; for example R&D from India and China is likely to significantly increase this century.
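For readers who want to see the mechanics, here is a stripped-down Python sketch of the simulation described above. It is my own illustrative reconstruction from the assumptions listed in this appendix (not the actual code linked later): it calibrates the constant in the R&D equation so that baseline TFP growth starts at roughly 1%, compares a world with a small one-year boost to R&D against a baseline, and adds up the discounted log-utility difference.

```python
import math

LAM, PHI = 0.75, -1.4       # "stepping on toes" and "fishing out"
DISCOUNT = 0.002            # pure time discount
YEARS = 500                 # horizon for this sketch

def population(year):
    """7.9bn today, growing 0.4%/year until it reaches 11bn, then constant."""
    return min(7.9e9 * 1.004 ** (year - 2021), 11e9)

def research_intensity(year):
    """2.3% of GWP today, rising ~0.046 percentage points per year, capped at 30%."""
    return min(0.023 + 0.00046 * (year - 2021), 0.30)

def researchers(year):
    return population(year) * research_intensity(year)

def tfp_growth(L, A, constant):
    """Semi-endogenous growth: g_A = constant * L^lambda * A^(phi - 1)."""
    return constant * L ** LAM * A ** (PHI - 1)

# Calibrate the constant so baseline TFP growth starts at ~1%/year (A normalised to 1).
constant = 0.01 / (researchers(2021) ** LAM)

def tfp_path(boost=0.0):
    """TFP trajectory; `boost` is a fractional increase in R&D in the first year only."""
    A, path = 1.0, []
    for year in range(2021, 2021 + YEARS):
        extra = boost if year == 2021 else 0.0
        A = A * (1 + tfp_growth(researchers(year) * (1 + extra), A, constant))
        path.append((year, A))
    return path

baseline, boosted = tfp_path(0.0), tfp_path(0.001)   # a 0.1% boost to one year's global R&D

# Incomes are proportional to TFP and utility is ln(income), so the per-person utility
# gain each year is ln(A_boosted / A_baseline); weight by population and discount.
welfare_gain = sum(
    population(year) * math.log(a_b / a_0) / (1 + DISCOUNT) ** (year - 2021)
    for (year, a_0), (_, a_b) in zip(baseline, boosted)
)
print(f"Discounted welfare gain (utils): {welfare_gain:.3e}")
```

The full model differs from this sketch in several ways: it sets the initial level of A from a steady-state assumption, applies the 0.7 / 0.4 / 1.5 adjustments, and divides by the utility of spending the same money on average world consumption to produce the impact multiplier. This sketch only shows the core loop.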


#### Expressing the benefit in terms of a % income increase

Most readers should skip this section. It describes a conceptual adjustment from the ‘impact multiplier’ metric used in the simulation to the ‘% income increase’ metric used in this blog.

The simulation calculates the following quantity:

$$Impact \, multiplier = (utility \, from \, \$1 \, to \, R \&D) \, / \, (utility \, from \, \$1 \, to \, average \, world \, consumption)$$

It finds that this quantity = 21. Rearranging:

$$(utility \, from \, \$1 \, to \, R \&D) = 21 \times (utility \, from \, \$1 \, to \, average \, world \, consumption)$$

Average world consumption (in nominal terms) is ~$10,000, so $1 to average world consumption would increase one person's income by 0.01% for one year.

$$(utility \, from \, \$1 \, to \, R \&D)=21 \times (utility \, from \, increasing \, one \, person's \, income \, by \, 0.01 \% \, for \, one \, year)$$

$$(utility \, from \, \$1 \, to \, R \&D)=(utility \, from \, increasing \, 21 \, people's \, income \, by \, 0.01 \% \, for \, one \, year)$$

Multiplying the costs and benefits by 20 billion (in line with the discussion in the main text):
$$(utility \, from \, \$20b \, to \, R \&D)=(utility \, from \, increasing \, 420 \, billion \, people's \, incomes \, by \, 0.01 \% \, for \, one \, year)$$
$$(utility \, from \, \$20b \, to \, R \&D)=(utility \, from \, increasing \, 420 \, million \, people's \, incomes \, by \, 10 \% \, for \, one \, year)$$

I multiply the above number by 0.42 due to three adjustments: 0.7 for the possibility that productivity benefits are not felt worldwide, 0.4 because I only credit R&D with 40% of TFP growth, and 1.5 because higher TFP leads to capital deepening. This implies:
$$(utility \, from \, \$20b \, to \, R \&D) =$$
$$=(utility \, from \, increasing \, 420 \, million \, people's \, incomes \, by \, 10 \% \, for \, one \, year)*0.42$$
$$=(utility \, from \, increasing \, 180 \, million \, people's \, incomes \, by \, 10 \% \, for \, one \, year)$$

This gets us to the bottom line quoted in the main body. This sheet contains the above calcs for getting from the simulation output to the table in the main body.
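The linked sheet isn't reproduced here, but the conversion chain can be sketched in a few lines (numbers as in the equations above):

```python
impact_multiplier = 21            # simulation output: $1 to R&D vs $1 to average world consumption
avg_world_consumption = 10_000    # ~$10,000/year in nominal terms
budget = 20e9                     # $20 billion

# $1 of average consumption is a 1/10,000 = 0.01% income boost for one person, so the
# R&D budget is worth (21 * 20e9) person-years of a 0.01% income boost...
boost_001 = 1 / avg_world_consumption                 # 0.0001, i.e. 0.01%
person_years_001pct = impact_multiplier * budget      # 420 billion person-years at +0.01%

# ...which under log utility equals 1000x fewer person-years at +10%.
person_years_10pct = person_years_001pct * boost_001 / 0.10   # 420 million

# Apply the adjustments: 0.7 spillover, 0.4 credit to R&D, 1.5 capital deepening.
adjusted = person_years_10pct * 0.7 * 0.4 * 1.5
print(f"{adjusted / 1e6:.0f} million person-years at +10% income")   # ~176, rounded to ~180 in the text
```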

The next section explains the 0.4X adjustment in more detail; the others are explained in the main post.

##### What proportion of productivity benefits should we credit to R&D?

I multiply by a factor of 0.4 because I credit measured R&D with 40% of productivity growth. Deciding how much credit to assign to different sources of income growth is a major source of uncertainty, and I don’t know of a principled and fully satisfactory approach for doing this. My reasoning was as follows:

• Misallocation reduction gets 25% of the credit. This is based on Hsieh et al. (2013), which estimates that improvements in the allocation of talent explain 24% of growth in GDP/worker. A similar adjustment is made by Jones (2021).
• Learning by doing gets 0% of the credit.
• Although learning by doing does lead to productivity improvements, this happens downstream of the introduction of new production processes. Without these new production processes, learning by doing would eventually dry up. I give the ultimate credit to the R&D (and other types of innovation) that develops these new production processes in the first place.
• What we really care about is the counter-factual: if R&D progresses faster than it otherwise would have, what effect will this have on productivity growth? In this context, I am claiming that if R&D progressed faster, learning by doing would speed up in response (with a lag). The extra R&D would be counterfactually responsible for both more R&D progress and more learning by doing.
• Measured R&D gets 55% of the remaining credit for innovation.
• In Eurostat’s survey of 28 countries, firms reported that R&D is 55% of total innovation costs.[39] See page 20 of Jones and Summers (2020). The other costs are linked to acquiring new equipment and software.
• You could argue this is overly generous to measured R&D if R&D is systematically under-reported and this won’t be captured by survey responses. Some examples:
• Wal-Mart sometimes doesn’t report R&D expenses but its logistics innovation has probably contributed to US TFP growth.
• Innovation related to improving services and introducing more product variety may also not be reported as R&D.
• Startups are often focussed around highly innovative activities like taking new products to market, but may not bill much of this as R&D.

So I give R&D credit for 0.75*0.55 = ~40% of TFP growth.

Another method for arriving at ~40% is to assume that the combination of R&D spending and all net investment is responsible for 100% of growth. Applying this within the US, R&D spending is 2.7% of GDP and net domestic investment is 4% of GDP, so R&D spending is responsible for 2.7/6.7 ≈ 40% of TFP growth.
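Both routes to the ~40% figure are easy to check:

```python
# Route 1: innovation's share of credit after misallocation, times R&D's share of innovation costs.
print(f"{0.75 * 0.55:.2f}")        # ~0.41

# Route 2: R&D spending as a share of R&D plus net domestic investment (US figures above).
print(f"{2.7 / (2.7 + 4.0):.2f}")  # ~0.40
```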

In an appendix, I sense-check this assumption against studies using statistical techniques to tease out the causal impact of R&D on growth. Naively, this suggests I should give R&D more credit for growth, but there are a number of complications involved in the comparison.

### Full list of assumptions

• Utility = k + m * log(income)
• R&D equation:
• $$g_A = constant * L^{\lambda}A^{\phi - 1}$$
• Or equivalently, $$dA = constant * L^{\lambda}A^{\phi}$$
• $$\lambda$$=0.75
• $$\phi$$=-1.4
• L proportional to $ spent
• Initial value of $$A$$ set by assuming a steady state with $$L$$ growing at 3% per year – its average growth over the last 20 years in the Bloom et al. (2020) data set.
• The formula for the steady state value of A is from the equation just below equation (19) on p.53 of Jones and Summers (2020).
• If this initial value was lower, marginal R&D today would be more impactful.
• Population
• Current population is 7.9 billion.
• Population increases at 0.4% per year until 2100, when it reaches 11 billion.
• Thereafter the population remains constant at 11 billion.
• Fraction of population who are researchers
• The fraction of the population doing research is proportional to 'research intensity', the fraction of GWP spent on R&D.
• Current research intensity = 2.3%.
• Research intensity increases linearly at an absolute rate of ~0.05% per year until it reaches 30%. Then it remains at 30%.
• That's a factor of 30/2.3 = 13 increase compared to today.
• Discount rate on future utility: 0.2%
• Adjustment for R&D benefits spreading all over the world: 0.7
• Proportion of TFP growth credited to R&D: 0.4
• TFP gains from R&D increase everyone's incomes by the same % amount.

You can see the full simulation code here, which provides sources for these assumptions. Alternatively, you can see a spreadsheet version of the calculation here.

## Appendix B: understanding the implications of semi-endogenous growth for the social returns to R&D

I think about the impact of the intervention in terms of a 'TFP wedge': the % difference in TFP between the worlds with and without the intervention. First, I'll discuss the initial size of the wedge; then how the wedge changes over time. (In this section I quote various technical results; I derive these in appendix C.)

#### What's the initial size of the TFP wedge?

Suppose TFP growth in the world without the intervention is $$g_0$$, and the intervention increases the amount of R&D that year from L to L(1 + v). Then it turns out that the initial size of the wedge is:

$$wedge = {\lambda} g_0 v$$

$$wedge = {\lambda} * (growth \, without \, intervention) * (fractional \, increase \, in \, R \&D)$$

For example, suppose $${\lambda}$$ = 0.5, TFP growth would be 2% without our intervention, and we boost total R&D spend one year by 1%. Then our intervention increases TFP by 0.5 * 2% * 1% = 0.01%.

#### How does the TFP wedge change over time?

Suppose the initial wedge is x% of total TFP. Over time x% falls towards 0%. Why? Because ideas are getting harder to find.

Suppose we counterfactually insert 1 extra researcher-year in 1800. Every year y after this, rather than R(y) researcher-years having occurred, R(y)+1 have occurred. Initially, the level of technology might have been quite low, and this extra researcher-year might make a noticeable difference to TFP. Ideas are still easy to find. Once the level of technology is high, however, this extra researcher-year makes little difference to TFP. Ideas are now very difficult to find.

To summarize: higher level of technology → ideas are harder to find → the extra researcher-year makes a smaller counterfactual difference to TFP → TFP wedge is smaller. Indeed, it turns out that the size of the wedge is inversely proportional to the level of technology (to some power).

The following diagram shows the technology level for two paths. The orange path is the one without an intervention; the number of researchers grows exponentially.
The blue path has the same number of researchers each year as the orange path, except for the first year when it has 3X as many. You can see that the initial 'TFP wedge' between the paths declines over time.

This means that the faster technology progresses, the faster the wedge declines. Faster tech progress → ideas become harder to find more quickly → TFP wedge declines more quickly. It turns out that the wedge declines at an exponential rate of $$(1-{\phi})g_A$$, where $$g_A$$ is the growth rate of TFP. In an equilibrium where population grows at a constant exponential rate n, the wedge declines at an exponential rate of $${\lambda}n$$.

This has some interesting consequences.

• If population growth is higher, the TFP wedge declines more quickly.
• Faster population growth → faster tech progress → wedge declines faster
• Naively, population growth would dramatically increase the returns to TFP boosts, but this is partly cancelled out by the faster-falling wedge.
• Conversely: a stagnating population would naively significantly lower the returns to TFP boosts, but this is again partly cancelled by the slower-falling wedge. When population is constant, technological progress becomes slower and slower over time (ideas getting harder to find). So the wedge falls more and more slowly over time. If it takes 100 years to halve in size, it will take a further 200 years to halve again, and then 400 years, etc.
• If $$\lambda$$ is smaller (more "stepping on toes"), the wedge declines more slowly.
• Smaller $$\lambda$$ → slower tech progress → wedge declines more slowly. We saw above that smaller $$\lambda$$ leads to a smaller initial wedge. This effect is partly cancelled by the slower decline of the wedge. The "smaller initial wedge" effect dominates, except for very small discounts (<0.3%).[40] How long does it take for the "wedge declines more slowly" effect to dominate if we have no discount? Let's assume we change our value of $$\phi$$ to compensate when we change lambda. How long does it take for the total impact from $$\lambda$$=0.75 to exceed the impact from $$\lambda$$ …

## Appendix C: deriving quantitative implications of semi-endogenous models for returns to R&D

This appendix derives some results discussed in appendix B and has some further discussion.

### What's the initial size of the wedge?

TFP growth is given by:

$$g_A = constant * L^{\lambda}A^{\phi - 1}$$

The initial size of this TFP 'wedge' is given by:[41] I get this expression by differentiating the expression for $$g_A$$ with respect to $$L$$: $$wedge = d(g_A)/dL$$.

$$wedge_i = (constant * {\lambda}) / (A^{1- \phi}L^{1- \lambda})$$

What this means is:

• The higher the current level of tech $$A$$, the smaller the initial impact
• The lower $${\phi}$$ (more "fishing out"), the smaller the initial impact
• The larger the current research effort $$L$$, the smaller the initial impact
• Unless $${\lambda}$$ = 1, in which case $$L$$ makes no difference
• The smaller $${\lambda}$$ (more "stepping on toes"), the smaller the initial impact

If TFP growth in the world without the intervention is $$g_0$$, and the intervention increases the amount of R&D that year from L to L(1 + v), we can simplify the above expression. The initial size of the wedge is:

$$wedge_i = {\lambda}g_0v$$

$$wedge_i = {\lambda} * (growth \, without \, intervention) * (fractional \, increase \, in \, R \&D)$$

### How does the wedge change over time?
The wedge declines over time as the level of technology increases:

$$wedge(t) = wedge_i * [A_i / A(t)]^{1- \phi}$$

where $$wedge(t)$$ gives the size of the wedge at time t, $$A(t)$$ gives the level of technology at time t, and $$A_i$$ gives the initial level of technology.

Therefore the wedge declines in size at the same rate at which $$A(t)^{1- \phi}$$ grows. In other words, at an exponential rate of $$(1- \phi)g_A$$, where $$g_A$$ is the growth rate of TFP. If the exponential growth rate of researchers n is constant, it turns out that the growth rate of $$A(t)^{1- \phi}$$ equals $$\lambda n$$. This makes sense. The more "stepping on toes" (small $$\lambda$$), and the slower researcher growth (small $$n$$), the slower the growth of technology. So the wedge declines in size at an exponential rate of $$\lambda n$$.

This can have counterintuitive consequences for how $$\lambda$$ affects the intervention's impact. The more "stepping on toes" (small $$\lambda$$), the slower the wedge declines over time. In some circumstances, this can mean that reducing $$\lambda$$ actually increases the impact of the intervention over the very long run, despite the initial size of the wedge being smaller. The intuition is that tech progress is slower at later times, and this means that the intervention's initial impact diminishes more slowly.

How does the annual utility from the intervention change over time? It decays at the exponential rate

$$(time \, discount) + (decline \, of \, wedge) - (population \, growth) = r + {\lambda}n - n.$$

If this quantity is negative (if r is very small and $$\lambda<1$$), the annual utility can grow over time. What's going on here is that the effect of the growing population outweighs that of the declining wedge.

## Appendix D: very rough estimates of the social returns to R&D under different scenarios

These are very rough back-of-the-envelope calculations (BOTECs) of the social returns to R&D for a few different scenarios discussed in the blog. I haven't made a special effort to make these easy to understand, but am including them for completeness.

The first scenario, the 'mainline scenario', is a simplified version of the model discussed in the main body of the blog. The others are variants on this model which avoid predicting productivity stagnation. I calculated how much each scenario changes the returns to R&D compared to the mainline scenario.

| Scenario | Impact compared to mainline scenario (ignore impacts after 100 years) | Impact compared to mainline scenario (ignore impacts after 500 years) |
| --- | --- | --- |
| Mainline scenario: ideas getting harder to find and stagnating population drive productivity stagnation. | 1X | 1X |
| Maybe we'll avoid productivity stagnation by increasing the fraction of people doing R&D? | 0.7X | 0.4X |
| Maybe ideas won't get harder to find in the future? | 2.3X, or less | 6.6X, or less |
| Maybe some trend-breaking future technology will allow us to avoid growth stagnation despite ideas getting harder to find? | ~2X, or much more | ~6X, or much more |
| Maybe the world's population won't stagnate? | 2.3X, or less | 6.6X, or less |

The sections below estimate the 'impact multiplier', defined as follows:

$$Impact \, multiplier = (utility \, from \, \$1 \, to \, R \&D) / (utility \, from \, \$1 \, to \, average \, consumption)$$

This table summarises the results, and was used to construct the above table.

| Scenario | Impact multiplier (ignore impacts after 100 years) | Impact multiplier (ignore impacts after 500 years) |
| --- | --- | --- |
| Mainline scenario: ideas getting harder to find and stagnating population drive productivity stagnation. | 22X | 38X |
| Maybe we'll avoid productivity stagnation by increasing the fraction of people doing R&D? | 15X | 15X |
| Maybe ideas won't get harder to find in the future? | 50X, or less | 250X, or less |
| Maybe some trend-breaking future technology will allow us to avoid growth stagnation despite ideas getting harder to find? | 50X, or less | 250X, or less |
| Maybe the world's population won't stagnate? | 50X | 250X |

### Mainline scenario: ideas getting harder to find and stagnating population drive productivity stagnation

• R&D is 2% of US GDP, and currently produces 1% growth in incomes per year.
• So 1% extra of GDP on R&D buys an initial income wedge of 0.5%.
• This wedge falls over time, as ideas get harder to find.[42] Why would this happen? We caused some extra counterfactual science to happen: R(t)+1 researcher-years rather than R(t) at each time t. But this extra science makes less % difference to income as ideas become harder to find. 100 vs 101 researcher-years makes a bigger % difference to income than 1000 … More specifically, it turns out that plausible parameters imply that[43] The wedge halves each time the researcher population doubles (assuming the 'stepping on toes' parameter $$\lambda$$=1). The relationship between TFP growth g and population growth n in steady state is given by g = $$\lambda$$ * n / (1 – phi). Using $$\lambda$$=1 and $$\phi$$=-2 … each time the economy grows by 30% the wedge ~halves.
• (I'm assuming that the amount of R&D in subsequent years is unchanged by our intervention.)
• So the wedge first halves after 15 years.
• Population is constant but ideas are getting harder to find, so growth slows. More specifically, let's assume each 30% of growth takes twice as long as the last. As a result, the wedge only halves a second time after 30 years, a third time after 60 years, and a fourth after 120 years.
• So the effects of an extra 1% on GDP are roughly as follows:
• 0.5% wedge for 15 years
• 0.25% wedge for 30 years
• 0.125% wedge for 60 years
• The total effect in the first 100 years is: 0.5*15 + 0.25*30 + 0.125*55 = 22% income boost.
• 22X direct consumption.
• The total effect in the first 500 years is: 0.5*15 + 0.25*30 + 0.125*60 + (1/16)*120 + (1/32)*240 + (1/64)*35 = 38% income boost, 38X.

Note: this is only a rough BOTEC and gives slightly different results to the full model discussed in the blog. The full model includes a pure discount rate and is more complicated in a number of other ways discussed in the main post and appendix A.

### Maybe we'll avoid productivity stagnation by increasing the fraction of people doing R&D?

In a sentence: "Yes, in the last 80 years we've needed a growing population to sustain constant growth; but in the future increased R&D intensity will allow us to sustain constant growth with a constant population."

A rough BOTEC:

• R&D is 2% of US GDP, and currently produces 1% growth per year.
• So 1% extra of GDP on R&D buys an initial income wedge of 0.5%.
• This wedge falls over time, as ideas get harder to find. As above, it halves each time technology increases 30%.
• (I'm assuming that the amount of R&D in subsequent years is unchanged by our intervention.)
• So the wedge halves every 15 years.
• So the effects of an extra 1% on GDP are roughly as follows:
• 0.5% wedge for 15 years
• 0.25% wedge for 15 years
• 0.125% wedge for 15 years…
• So the total effect in the first 100 years is: ~0.5*15*2 = ~15% income boost.
• So spending on R&D is 15X direct consumption, with a 100-year horizon.
• With a 500-year horizon, the effect is still ~15% income boost, 15X.

The returns to extra R&D today are lower than in my baseline stagnation scenario. The extra R&D effort here improves technology faster, making ideas harder to find, so the extra counterfactual science we caused makes less difference.

### Maybe ideas won't get harder to find in the future?

Maybe the observed pattern over the last 80 years will stop, and constant researcher effort will be capable of sustaining a constant rate of growth.

• R&D is 2% of US GDP, and currently produces 1% growth per year.
• So 1% extra of GDP on R&D buys an initial income wedge of 0.5%.
• This wedge is constant over time. It doesn't diminish because ideas are not getting harder to find.
• (I'm assuming that the amount of R&D in subsequent years is unchanged by our intervention.)
• So the total effect in the first 100 years is: 0.5*100 = 50% income boost.
• So spending on R&D is 50X direct consumption, with a 100-year horizon.
• With a 500-year horizon, the effect is 0.5*500 = 250% income boost, 250X.

This BOTEC assumes ideas stop getting harder to find just before our intervention. If they continue to get harder to find for a while, this would reduce the bottom line.

### Maybe some trend-breaking future technology will allow us to avoid growth stagnation despite ideas getting harder to find?

For concreteness, suppose R&D today brings forward in time the tech level after which we can sustain 1% annual productivity growth with a constant population. If we bring forward that day by 1 year, we boost income in every year thereafter by 1%.

Rough BOTEC (this one is fiddly):

• R&D is 2% of US GDP, and currently produces 1% growth per year.
• Currently we need to increase the # researchers each year to sustain exponential growth, but once we reach tech level X a constant # researchers can sustain 1% exponential growth.
• 1% extra of GDP moves the tech level forward by 0.5 years, and we reach tech level X 0.5 years earlier. Income at all later times is boosted by 0.5%.
• Suppose we reach X immediately. Then the total effect in the first 100 years is: 0.5*100 = 50% income boost. That's 50X.
• The total effect in the first 500 years would be 0.5*500 = 250% income boost, 250X.
• Suppose we only reach X after 50 years. This would reduce the value of the intervention somewhat, for complicated reasons.
• The benefits in the first 50 years are smaller than 0.5% as the 0.5% wedge shrinks over time.
• If annual research effort has grown, we will reach tech level X less than 0.5 years earlier. E.g. if we do twice as much research by the time we reach tech level X, we'll only reach it 0.25 years earlier, ~halving the impact each year thereafter.

This is an interesting and complicated case. We could extend it to consider bringing forward periods of much faster growth, which would make the returns to R&D much higher. For example, if we bring forward an economic singularity (where growth is hyperbolic until output approaches a very high ceiling), we could be bringing forward a time when all biological humans have lives that are unimaginably happy and fulfilled by today's standards. In addition, if it's possible for people to exist as simulations on a computer, that could expand the human population by a trillion-fold or much more.[44] See Bostrom (2003) for a slightly more detailed explanation of this point.
If this happened, it would continue what I believe to be the long-run historical trend whereby R&D has had massive welfare returns by bringing a much richer, more populous and happier future forward in time (see appendix F for more about this model of historical R&D).

### Maybe the world’s population won’t stagnate?

In a sentence: “Yes, ideas are getting harder to find, but population growth will allow us to meet the demands for ever-more research for each 1% output gain.”

A rough BOTEC:

• R&D is 2% of US GDP, and currently produces 1% growth per year.
• So 1% extra of GDP on R&D buys an initial income wedge of 0.5%.
• This wedge falls over time, as ideas get harder to find. As above, it halves each time technology increases 30%.
• So the wedge halves every 15 years.
• But each time the wedge halves, the number of people alive ~doubles,[45] so the total impact of the wedge is constant over time.
• So the total effect in the first 100 years is: 0.5*100 = 50% GDP boost.
• So spending on R&D is 50X direct consumption, with a 100-year horizon.
• With a 500-year horizon, the benefits would be 250X.

This calc assumed that the population doesn’t stagnate at all. In practice, the realistic version of this scenario is one where population stagnates temporarily. If we incorporate a temporary population stagnation into this BOTEC, the estimated returns to R&D would decrease.

## Appendix E: reasons growth might not stagnate, and how that would affect the bottom line

Appendix D contains very rough BOTECs calculating the returns to R&D in each of these scenarios.

### Maybe we’ll avoid stagnating growth by increasing the fraction of people doing R&D?

As mentioned earlier, the fraction of people doing R&D (research intensity) has increased significantly over time. If it continues to increase, we can have a growing pool of researchers despite a stagnant population.

This mechanism can temporarily sustain steady productivity growth despite population stagnation. It cannot do so indefinitely, as the fraction of people doing research cannot exceed 100% (and will likely cap out much earlier). Indeed, when I incorporate this mechanism into the model, the stagnation of productivity growth is only delayed temporarily.

How does an increasing research intensity in the future affect the impact of R&D today? Counterintuitively, it reduces the impact. The reason is related to ideas getting harder to find. More R&D in later periods pushes us further up the diminishing returns curve for finding new ideas, so the additional R&D we funded makes less difference.[46]

In fact, the mainline estimate of the social return to R&D does assume that research intensity will continue to increase to some extent. If I put weight on a scenario where there are even greater increases in research intensity, I should also put weight on scenarios where there are smaller increases in research intensity. As it is, I’m happy to just use my central estimate in the stylized calculation of this post.

### Maybe ideas won’t get harder to find in the future?
There is strong evidence that ideas became harder to find during the 20th century. But perhaps this is a temporary trend, and soon enough the average difficulty of finding a new idea will stay constant over time. If this happens, a constant population would be able to sustain constant productivity growth.

This would significantly raise the estimated value of today’s R&D. As mentioned above, the model implies that the % income increase due to R&D today will decline over time. But if ideas stop getting harder to find, this decline will stop. In other words, extra R&D today could raise people’s incomes by (e.g.) 0.5% forever into the future. Given my low discount rate, this would significantly raise the estimated social returns to R&D.

I don’t find this scenario plausible. The evidence suggests that ideas have been getting consistently harder to find since 1930, and there’s no reason to expect this trend to change.

Another version of this claim would be that ideas have never been getting harder to find. Instead, our R&D institutions have become exponentially worse over time, and this compounding inefficiency explains why an exponentially growing number of researchers has led to merely constant growth. Concretely, the data implying that it takes 41X as much research effort to find a new idea as in 1930 is instead interpreted as implying that institutions have become 41X less efficient. The startling implication is that, if only institutions had remained at their 1930s level, productivity growth would be 41X faster today. That would involve TFP growing by ~40% every year! Again, I don’t find this scenario very plausible.

### Maybe the world’s population won’t stagnate?

The argument for stagnating productivity growth assumes that the global population will level off at 11 billion. But if the world population grew exponentially in the long run, growth would not stagnate. The number of researchers could continue to grow exponentially, maintaining constant productivity growth despite ideas getting harder to find.

Is this scenario plausible? It is the UN’s high-end projection out to 2300, and there are some reasons it could happen. Many subcultures and countries currently have very high population growth and, in the long run, cultural and biological evolution will select for these groups. On the other hand, governments may make a concerted effort to avoid sustained population growth if it would have devastating environmental consequences. Overall, it’s hard to know whether this scenario is likely to happen.

How would sustained future population growth affect the value of R&D today? Overall, it would significantly increase the value of R&D today. There are two effects, which point in opposite directions. The first effect is that there are more future beneficiaries of today’s R&D, with their number growing exponentially. The second effect is more complex: there is more R&D at later times, and so the counterfactual impact of today’s marginal R&D on TFP falls over time. In particular, if today’s R&D initially raised TFP by x%, then over time x falls exponentially. Combining these two effects, marginal R&D causes a % TFP gain which falls off exponentially over time but is enjoyed by an exponentially growing population. Compared to a case with constant population, the first effect dominates and the impact of R&D is significantly higher.

However, a very large future population would raise the impact of many interventions with positive long-run effects.
So it’s not clear that this scenario gives us reason to prefer R&D in particular. Partly on this basis, I’m not currently adjusting the stylized estimate based on this scenario.

## Appendix F: back-of-the-envelope calculations of the value of R&D in 1800

This appendix estimates the value of R&D in 1800 using a method that incorporates the two factors mentioned in the main text:

• Historically, R&D was more neglected than it is today.
• Historically, R&D increased the amount of R&D occurring at later times, both by increasing the fraction of resources used for R&D and by increasing future populations.

The calculation is very rough. Its purpose is to highlight just how much these two factors can increase the importance of historical R&D; it shouldn’t be interpreted as a precise estimate of R&D returns in 1800. The choice of 1800, rather than another year, is largely arbitrary.

The calculation uses a different model from the one described in the main text in order to incorporate the second factor: R&D increasing the amount of R&D occurring at later times. The model implies that a researcher-year in 1800 was hundreds of times more impactful than in 2020.

### Explaining the model

Suppose (as a toy example) that in 1800 there were exactly 100 researchers. What is the impact of funding an extra researcher-year in 1800? The assumption of this model is that the impact is to bring forward the technological level of the present world by 1/100 years. In other words, we start enjoying the level of wealth of the modern world 1/100 years earlier than we otherwise would have.

Why make this assumption? An extra researcher-year in 1800 means we make more technological progress in 1800. How much more? It increases the number of researchers by 1%, so by the end of 1800 technology is 1/100 years ahead of where it would have been.[48] Based on the second factor (“R&D increased the amount of R&D at later times”), this brings all subsequent R&D efforts forward in time by 1/100 years. I.e. the population passes each milestone 1/100 years earlier, and research concentration ramps up on a schedule that’s brought forward in time by 1/100 years. As a result, the whole future trajectory of R&D progress is brought forward by 1/100 years, and we reach modern levels of technology and wealth 1/100 years earlier.[49]

So the effect of the extra researcher in 1800 is that we spend 1/100 years less time at 1800 levels of wealth and 1/100 years more time at modern levels of wealth.

How should we value reaching modern levels of wealth x years earlier (and spending x fewer years at 1800 levels of wealth)? The population in 1800 was 1 billion. So a conservative valuation is just the value of raising 1 billion people’s incomes from the average income in 1800 to the average income in 2020. Average global income today is $9600, compared with an estimate of $700 for 1800.[50] Assuming the log-utility model, this benefit is equivalent to increasing 27 billion people’s income by 10% for x years.[51]
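As a quick check on that conversion, here is a minimal sketch (in Python) of where the “27” comes from under log utility; the only inputs are the $700 and $9600 averages quoted above.

```python
import math

# How many 10% income increases does it take to go from $700 to $9600?
# Under log utility each 10% increase has the same welfare value, so raising
# 1 billion people's incomes from $700 to $9600 is equivalent to
# roughly 27 billion 10% income increases.
increments = math.log(9600 / 700) / math.log(1.1)
print(round(increments, 1))  # ≈ 27.5, i.e. ~27 ten-percent increases
```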
A less conservative valuation would also value the additional 7 billion people alive today, or take into account future increases in wealth that we will also reach x years earlier in time.

### Applying the model

Currently we spend 2.3% of GWP on R&D. Let’s say that percentage was 20-30X lower in 1800 (we suggest below that this is a conservative estimate). Then R&D was only ~0.1% of GWP.

How many researchers would that buy? Let’s assume that with 0.1% of GWP you could pay for 0.1% of people to be researchers.[52] The world population was 1 billion, so that would pay for 1 million researchers.

Funding an additional researcher-year in 1800 would bring forward the technological level of the present world by (1 / 1 million) years. This benefit is equivalent to increasing 27 billion people’s income by 10% for (1 / 1 million) years; or to increasing 27,000 people’s income by 10% for 1 year.

We can adjust this downwards based on some of the factors considered in the calculation of the value of R&D today:

• Stepping on toes. Twice as many researchers make less than twice as much progress due to duplication. 0.75X
• Proportion of tech progress credited to R&D. Only 50% of TFP growth comes from targeted R&D. 0.5X
• Skepticism about global spillovers of R&D. 0.7X

This reduces the benefit to increasing 7,100 people’s income by 10% for 1 year.

How does this compare to the benefit of funding a researcher-year today? The model described in the main body of this post implies that the benefit is equivalent to raising 70 people’s income by 10% for a year.[53] This calculation implies that R&D today has lower returns by a factor of 7100/70 = ~101 (see the sketch below).

One reason for this large factor is that the calculation in this section assumes an additional feedback loop between R&D and future R&D. I think that this captures a real difference between funding R&D now and in the past; but someone skeptical of that distinction might prefer a smaller factor.

On the other hand, the factor would be much larger if we made any of the following changes:

• Used a time before 1800, when the fraction of resources used for R&D would be lower still.
• Used a less conservative estimate of the value of arriving at modern levels of wealth x years earlier (e.g. by placing value on the associated population increase, or by including the benefit of arriving at future levels of wealth x years earlier).
• Compared the impact of $1 on R&D today vs in 1800, rather than comparing the impact of a researcher-year. $1 would buy much more research in 1800 than today, as salaries were much lower in 1800.
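For readers who want to check the arithmetic, here is a minimal sketch reproducing the comparison above; every input is a number quoted in this appendix, and the script is only an illustration of the toy model.

```python
import math

# Minimal sketch of the Appendix F comparison; all inputs are numbers quoted above.
researchers_1800 = 0.001 * 1_000_000_000        # 0.1% of a 1 billion population = 1 million researchers
years_brought_forward = 1 / researchers_1800    # one extra researcher-year brings modern tech forward by 1e-6 years

# Log-utility value of moving 1 billion people from $700 to $9600:
increments = math.log(9600 / 700) / math.log(1.1)                  # ~27 ten-percent increases per person
person_years_of_10pct = increments * 1e9 * years_brought_forward   # ~27,000

# Downward adjustments: stepping on toes, share of TFP growth from R&D, spillover skepticism.
benefit_1800 = person_years_of_10pct * 0.75 * 0.5 * 0.7            # ~7,100

benefit_today = 70   # raising 70 people's incomes by 10% for a year (from the main text's model)
print(benefit_1800 / benefit_today)                                # ~100X
```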
In light of this, my opinion is that historical R&D was more impactful than R&D today by even more than the factor of ~101X estimated above.

We should be cautious with this comparison. As mentioned above, its purpose is to highlight just how much these two factors can increase the importance of historical R&D; it is not a precise estimate of R&D returns in 1800. However, the comparison does suggest that R&D was historically hundreds of times more impactful than R&D today.

### Data on the research concentration in 1800

What was the research concentration (the % of GWP spent on R&D) in 1800? I’m not aware of high-quality data on this question. But various sources suggest that research concentration in 2020 was probably at least 20-30X higher:

• Data from Bloom et al. (2020) find the number of US researchers increasing by an average of 4.3% per year since 1930. US population grew by less than an average of 1.5% per year in the same period, implying that the fraction of people doing research was growing by ~2.8% per year. This implies a 12X increase in research intensity between 1930 and 2020.[54]
• Bakker (2013) estimates growth in overall R&D using growth of specific types of R&D funding. One such estimate implies 1.6% annual growth in research intensity between 1767 and 1904, corresponding to an 8X increase between 1800 and 1930.[55] Another such estimate implies a 0.8% annual increase between 1823 and 1941, corresponding to a 3X increase between 1800 and 1930.[56]
• Combining these estimates with the Bloom et al. data for 1930-2020 implies a total increase in research concentration of 36X – 96X.
• Utility patent records show large increases between 1800 and 2020.
  • E.g. utility patent applications grew 300X since 1850. Accounting for a 15X population increase in the same period, that’s a 20X increase in patent application concentration.
  • Design patents similarly show a 30X increase in patent concentration since 1850.
  • This is only weakly informative because patents are just proxies whose correlation with growth-enhancing research effort could change over time.
• Shuttleworth and Charnley (2016) claim science publications grew by 100X between 1800 and 1900. With a 1.6X increase in population in the same period, that’s a publication-concentration increase of 60X in that period alone.
  • Again, science publications are only a proxy for growth-enhancing research effort.

Based on the above numbers, a conservative estimate of the increase in research concentration from 1800 to 2020 is 20-30X.

## Appendix G: top-down vs bottom-up calculations of the returns to R&D

### Summary

This post uses a top-down approach to calculate the returns to R&D, using very high-level empirical inputs and an assumption about what fraction of TFP growth is due to R&D. Many economics papers use bottom-up approaches, attempting to tease out the causal effect of R&D from micro-level data.

We can use bottom-up approaches to sanity-check this post’s assumption about the fraction of TFP growth that is due to R&D. Bottom-up approaches calculate a quantity called the social rate of return, and estimates vary widely, between 30% and 130%. The social rate of return can also be calculated using the inputs to my model; this calculation gives a value of only 13%. Naively, this implies I’m underestimating the returns to R&D.
However, it’s possible that bottom-up approaches overestimate the social rate of return, or that the calculation using my inputs is not really comparable with the results from bottom-up approaches. So I only see this as weak evidence that I’m underestimating the returns to R&D.

### Top-down vs bottom-up calculations of R&D returns

The methodology in this post, adapted from Jones and Summers (2020), is a top-down approach to calculating the social returns to R&D. Its key empirical inputs are very high-level quantities: frontier TFP growth and total global R&D expenditures. Based on these quantities, and a simple growth model relating them together, it estimates the welfare benefits from TFP growth and then credits R&D with some portion of those benefits. A strength of this approach is its conceptual clarity. A weakness is its reliance on a bald assumption about what fraction of TFP growth is due to R&D.

Many economics papers estimate the social returns using bottom-up approaches. These look for correlations between R&D spending and subsequent TFP growth. Some papers focus on specific technologies, firms or industries. Some study entire countries or even groups of countries. A strength of these approaches is that they can potentially identify the causal contribution of R&D to growth by applying statistical techniques to micro-level data and controlling for other causes of growth. Unsurprisingly, this causal identification faces many challenges. A weakness of these approaches is that they aren’t well suited to capturing spillovers that occur far away in space and time, because these may not be included in the data used. Indeed, the coefficients estimated typically cannot be translated into all-things-considered estimates of the social returns to R&D without substantive further theoretical assumptions about how the benefit changes over time. In addition, spillovers to low-income countries are typically not included in calculations of the social returns in these papers.

Appendix B of Jones and Summers (2020) contains a good overview of bottom-up approaches.

### Using bottom-up approaches to calibrate the assumptions of the model in this post

Many bottom-up approaches estimate a quantity called the social rate of return,[57] which answers the question “if I invest a marginal $1 in R&D, how much will GDP increase?”. If the social rate of return is 50%, investing $1 in R&D raises GDP by $0.50. Formally, the social rate of return equals $$dY/dR$$, where Y is output and R is the cumulative stock of R&D effort.

Jones and Williams (1998) relate empirical estimates of this quantity to the semi-endogenous model used in this post. They show that
$$r' = \lambda g_A / s$$
where r’ is an empirical estimate of the rate of return, $$\lambda$$ is the stepping on toes parameter discussed in Appendix A, $$g_A$$ is the growth rate of TFP due to R&D, and s is the fraction of GDP used for R&D.

We can use the model in this post to calculate the RHS of this equation, and compare it to empirical estimates of the LHS.

This post uses:

• $$\lambda$$=0.75
• $$g_A$$ = 1% * 0.4 = 0.4%
• ~1% is the average value of TFP growth over the last 40 years in my data set.
• 0.4 is the fraction of TFP growth I’m crediting to R&D.
• s = 2.3%

These imply a social rate of return of 13%.[58] This is significantly lower than most estimates in the literature. Bottom-up approaches typically find that the private rate of return (which only includes benefits to the innovator) is 20-30%. Estimates of the social rate of return are much higher and vary considerably, at 30 – 130%.[59]
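As a sanity check on the 13% figure (and on the 43% variant mentioned below), here is a minimal sketch of the calculation; the inputs are exactly the ones listed above.

```python
# Social rate of return implied by this post's inputs, via Jones and Williams (1998): r' = lambda * g_A / s
lam = 0.75          # stepping-on-toes parameter
g_A = 0.01 * 0.4    # TFP growth credited to R&D: 1% growth * 0.4 share
s = 0.023           # fraction of GDP spent on R&D

print(lam * g_A / s)   # ~0.13, i.e. a 13% social rate of return

# Variant discussed below: lambda = 1 and all frontier TFP growth credited to R&D
print(1.0 * 0.01 / s)  # ~0.43
```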

There are a few possible interpretations of this result:

1. This post underestimates the social benefits of R&D.

A natural adjustment would be to increase the fraction of TFP growth credited to R&D and to increase $$\lambda$$. Using $$\lambda$$ = 1 and crediting R&D with all frontier TFP growth would raise the RHS to 43%.

2. The empirical studies overestimate the social benefits of R&D.

If other causes of TFP growth (e.g. capital expenditures) are correlated with measured R&D, these studies may overestimate the causal effect of measured R&D on TFP growth. Another possibility is that these studies focus on sectors in which R&D is particularly lucrative.

3. The LHS and RHS of the equation are not really comparable, for subtle theoretical reasons.

Here’s one such reason.

The model I use in this post assumes that there is no depreciation of the R&D stock. Once ideas are discovered, they are not forgotten. This means that the measured stock of R&D includes all historical R&D inputs without discount.

But some central empirical estimates of the social rate of return to R&D assume that the R&D stock depreciates at a very fast rate. Coe and Helpman (1995) use a depreciation rate of 5%; Bloom et al. (2013) use a rate of 15%. This means that the estimated stock of R&D only includes R&D inputs from recent years.
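To get a feel for how much the depreciation assumption matters, here is a stylized steady-state sketch. The 7% annual growth rate of R&D inputs is my illustrative assumption, not a figure from the text, so the outputs only roughly echo the factors quoted below.

```python
# Stylized steady-state R&D stock with inputs growing at rate g and depreciating at rate delta:
# stock ≈ (current flow) / (g + delta), so removing depreciation scales the stock by (g + delta) / g.
g = 0.07   # assumed annual growth rate of R&D inputs (illustrative, not from the text)

for delta in (0.15, 0.05):   # depreciation rates used by Bloom et al. (2013) and Coe and Helpman (1995)
    print(delta, (g + delta) / g)   # stock rises ~3.1X and ~1.7X if depreciation is set to zero
```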

If instead these empirical estimates had assumed no depreciation of the R&D stock, they would estimate a larger stock of R&D and so a lower social benefit per $ of R&D stock. The estimated social rate of return would fall.[60]

Fall by how much? A rough calculation suggests Bloom et al. (2013)’s estimate of the R&D stock would rise by a factor of 3, and so their estimate of the social rate of return would fall by a factor of 3, from 55% to 18%. Analogously, the estimate of Coe and Helpman (1995) would fall by a factor of 1.5, from 100% to 67%.

So if bottom-up empirical studies used the same “no depreciation” assumption as I use in this post, their estimates of the social rate of return would fall and the tension with my assumptions would reduce.

This depreciation issue is just one reason the LHS and RHS of the above equation are not straightforwardly comparable. I suspect digging into specific analyses in more detail would uncover further reasons.

Which of these three interpretations is correct? Interpretations (2) and (3) both seem plausible to me; in combination I think they could explain the discrepancy. That said, I place some weight on interpretation (1), and doing this analysis updated me towards thinking that measured R&D accounts for a greater fraction of TFP growth.

This tension could be investigated further. An ambitious project would be to take the data used in these analyses and use them to directly fit the inputs to the semi-endogenous growth model used in this post.[61]

## Appendix H: Potential disagreements with Progress Studies

Progress Studies is an intellectual movement that aims to understand and accelerate civilisational progress. Some people involved in Progress Studies believe that accelerating civilisational progress, for example by improving institutions for innovation, should be the world’s top priority. My conclusion here seems to be somewhat in tension with this. My stylized estimate puts direct R&D spending at 45% of the impact of cash transfers to the global poor, and 4.5% of the impact of the GHW bar.

The disagreement may be smaller than it appears. Firstly, the typical $ spent by governments of rich countries probably has many times less social impact than cash transfers to the global poor. This implies, in line with Progress Studies, that R&D spending has an unusually high social impact compared to typical government spending.

Secondly, leveraged ways to boost long-run innovation, like increasing high-skilled immigration or improving institutions, might be much more effective than directly funding R&D. I expect, partly based on unpublished work by Open Philanthropy, that some such opportunities do meet the GHW bar. In other words, I think that some interventions to boost innovation are among the best in the world for improving wellbeing. It’s not obvious to me that Progress Studies enthusiasts should be interpreted as making a stronger claim than this.

That said, I expect that I do have some substantive disagreements with Progress Studies enthusiasts. In particular:

1. I expect that ideas will continue to become harder to find. Bloom et al. (2020) offers strong evidence that ideas (defined as insights that increase TFP by 1%) have been getting harder to find for 80 years. This means that even a permanent increase in the fraction of GDP used for R&D only temporarily increases the growth rate. This lowers the returns to R&D relative to a world in which you could permanently increase the rate of economic growth, a possibility that Tyler Cowen highlights in his book Stubborn Attachments.

Some people in Progress Studies have suggested that the evidence from Bloom et al. (2020) might be explained by institutions becoming worse over time.[62] While this may be true to some extent, I don’t think it could be true to such a large extent that ideas haven’t been getting harder to find.
2. I’m less inclined to place weight on trend-breaking scenarios where funding R&D has a very large upside. Stubborn Attachments acknowledges the possibility that any given intervention might only temporarily increase growth, but argues we should do an expected value calculation and put some weight on the possibility that we can permanently increase the economic growth rate. I feel reluctant for GHW grantmaking to place much weight on trend-breaking scenarios where the upside is very large; see here, here and here for further discussion. This is especially compelling when combined with the next point.
3. I’m more wary of potential harms from R&D. Part of the reason I’m not inclined to put weight on scenarios with very large upside is that I’m not explicitly modeling possible harms. As mentioned above, this is not an all-things-considered analysis, but more like: “How good does R&D look if we use our median estimate of R&D returns and future population growth, ignoring scenarios that could massively increase or massively decrease the returns?”

## Appendix J: ways in which the stylized estimate is too pessimistic vs too optimistic

Some assumptions of the report are listed as both “arguably too optimistic” and “arguably too pessimistic”.

### Arguably too optimistic

The stylized calculation:

• Ignores potential harms from R&D, which could dominate the benefits in some cases.
• Gives R&D substantial credit for long-run growth.
• An alternative view is that credit for growth should be distributed fairly evenly over economic activities rather than heavily concentrated on a few types of activity. For example, if R&D activity stopped tomorrow, growth would continue for decades due to learning by doing, business innovation, and diffusion of existing tech across the economy.
• A core reason I give R&D substantial credit is that over long timescales it seems that these sources of growth would dry up and R&D is necessary to sustain growth.
• On the other hand, many reviewers of this post suggested that I credit R&D with 50% of the TFP growth or more, and this is supported by a naive comparison with bottom up calculations of the returns to R&D.
• Assumes significant R&D spillover between countries.
• In my stylized estimate, all global R&D goes into a common pot that boosts incomes around the world.
• In a model in which each country’s R&D was only relevant to their own growth, then the social returns to R&D would vary by country. The R&D in rich countries would be much less impactful than my stylized estimate because it would raise the incomes of fewer people.
• On the other hand, many reviewers of this post suggested that eventually 100% of frontier growth will spillover around the world, rather than the 70% that I assumed.
• Uses a utility function that may overestimate the benefits of increasing the incomes of people who are rich relative to those who are poor.
• Assumes the marginal $ on R&D is 75% as impactful as the average $. You might think that the smartest people are already doing R&D, and the best projects are already funded, such that the marginal $ is worse than this.
• Assumes R&D increases everyone’s incomes by the same %. If R&D increases the incomes of the rich by a larger % than those of the poor, the stylized calculation will overestimate the utility benefits from R&D.
• Includes benefits hundreds of years into the future.
  • While I think this is right in principle, Open Philanthropy typically only includes near-term benefits in its analyses of philanthropic opportunities. So we should be very cautious when using the social impact estimate here to compare R&D with other opportunities; a naive comparison would give R&D an unfair advantage.
  • I use a discount of 0.2%, based on the possibility of a major disruption. Other sources of uncertainty about whether the benefits of R&D will really be felt might argue for a larger discount.

### Arguably too pessimistic

The stylized calculation:

• Ignores scenarios in which the returns to R&D are much higher, e.g. expediting a large growth increase due to AI.
  • Taking the expected value over those scenarios would probably significantly increase the estimated expected returns.
• Doesn’t give current R&D credit for increasing the amount of research in future years, even though I think this was a significant dynamic historically, and accelerating catch-up growth would have this effect.
• Assumes ideas are getting harder to find rapidly (i.e. the diminishing returns are steep).
  • I use the estimate from Bloom et al. (2020), but my guess is that their data probably overestimate the growth in researchers and so overestimate the steepness of diminishing returns to finding new ideas.
  • Also, if innovation institutions have become worse over time (as people in the Progress Studies movement claim), then Bloom et al. (2020) will overestimate the steepness of diminishing returns to finding new ideas.
• Assumes that US R&D $ and non-US R&D $ are equally effective in boosting frontier growth, despite US $ being more focussed on frontier growth. An alternative assumption would be that US R&D does more to boost frontier growth per $ than other R&D.
• Uses a harsh value for the diminishing marginal returns (DMR) to R&D. It’s the value from “Are Ideas Getting Harder to Find?”, but I think their data probably overestimate the growth in researchers and so overestimate the steepness of returns. Adjusting for this would probably increase the bottom line by 1-1.5X.
• Uses a utility function that may underestimate the benefits of technological progress by assuming the only effect is to raise incomes.
• Assumes 30% of frontier innovation never spreads around the whole world. This would be fine if I assumed that only the US was driving productivity growth, but it seems kinda harsh when we’re giving all countries’ R&D equal credit for frontier productivity growth.

## Appendix K: additional limitations of the model

### The model assumes research today doesn’t change the amount of research tomorrow

My model assumes that funding additional research in 2021 doesn’t affect the amount of research effort in future years. But you might think research today increases the amount of research tomorrow. I claimed this dynamic was important in the past, with successful R&D projects providing evidence that R&D was a fruitful activity. Today, I think this dynamic is especially plausible for small R&D sectors that have the potential to be significantly scaled up.
For example, early-stage solar panel R&D may have brought forward in time the point at which it was profitable for private entities to invest in R&D. In other words, early solar R&D increased the amount of solar R&D occurring at later times.

Another reason R&D today might increase R&D in later years is if it accelerates catch-up growth. Richer countries spend more on R&D, so faster catch-up growth implies more R&D spending. For example, India and China are contributing an increasing amount to global R&D, and faster catch-up growth would accelerate this process.

Conversely, more research today might remove a promising research project from the pool of possibilities, decreasing the amount of future research. For example, suppose that certain R&D funders only make grants if the project is sufficiently promising. If ideas are getting harder to find, projects will tend to become less promising over time. More research today would accelerate this process, reducing the appeal of future projects and so reducing the amount of future funding.

It seems plausible that R&D today could increase or decrease the amount of R&D that happens in the future. The current estimate doesn’t make an adjustment in either direction.

### Uncertainty about the utility function

My calculation assumes that utility increases with the log of income. But there are plausible alternative assumptions that would change the bottom line in both directions.

On the one hand, you might think that utility diminishes more sharply than the log of income. This can be captured by using a CES function of income with $$\eta > 1$$. This assumption increases the benefit of extra income for the very poorest people relative to richer people.[63] This favours cash transfers targeted at the world’s poorest people over R&D, whose benefits are spread amongst people at all income levels.

But on the other hand, you might think that the welfare benefits of developing new technologies outstrip the income increases that they cause. For example, between 1990 and 2020 the average US real income increased from $40k to $60k.[64] Imagine someone in 1990 who earns $40k being offered the following choice:

1. Their income is raised to $60k, and they must spend all their income on 1990 goods at 1990 prices.
2. They receive the average income of someone in 2020, and must spend it on 2020 goods at 2020 prices.

Plausibly, the second option is better because of the possibility of using entirely new products like the internet, smartphones, Amazon, etc.[65] The implication is that the increase in income underestimates the benefit of technological progress, or (relatedly) that inflation adjustments don’t adequately capture the value of new products and services.[66] Using a utility function that adjusted for this would raise the returns to R&D.[67]
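As a small illustration of the first direction discussed above (utility diminishing more sharply than log), here is a minimal sketch comparing how much a marginal dollar is worth to someone on $500/year versus $60,000/year under log utility and under a steeper isoelastic utility; the η = 1.5 value is my illustrative choice, not a parameter from the text.

```python
# Relative value of a marginal dollar to someone on $500/yr vs someone on $60,000/yr,
# under isoelastic utility with marginal utility u'(c) = c^(-eta). eta = 1 corresponds to log utility.
def marginal_utility(income, eta):
    return income ** (-eta)

for eta in (1.0, 1.5):
    ratio = marginal_utility(500, eta) / marginal_utility(60_000, eta)
    print(eta, round(ratio))   # 120 under log utility, ~1315 under eta = 1.5
```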

So some plausible changes to the utility function would reduce the social returns to R&D, while others would increase the returns.

Footnotes

↑1 If environmental constraints require that we reduce our use of various natural resources, productivity growth can allow us to maintain our standards of living while using fewer of these scarce inputs. For example, in Stubborn Attachments Tyler Cowen argues that the best way to improve the long-run future is to maximize the rate of sustainable economic growth. A similar view is held by many involved in Progress Studies, an intellectual movement that aims to understand and accelerate civilisational progress. An example of an intervention causing a temporary boost in R&D activity would be to fund some researchers for a limited period of time. Another example would be to bring forward in time a policy change that permanently increases the number of researchers. Three comments on the log-utility model. First, the results are the same whatever the values of the constants k and m. Second, I do a sensitivity analysis of the consequences of different utility functions; if the diminishing returns to income are steeper than log, this favours cash transfers more strongly. Third, by expressing the benefits of R&D in terms of their welfare impact I differ from Jones and Summers (2020) who express the benefits in terms of $. log(110) – log(100) ~= 10*[log(101) – log(100)] ~= 100*[log(100.1) – log(100)] ~= 1000*[log(100.01) – log(100)]. More realistically, there will be a lag before productivity benefits are felt. Currently I don’t model this lag because it wouldn’t affect the results by much. I use a discount of 0.2%; so a 50 year lag would reduce the returns to R&D by ~10%. GiveDirectly implements this intervention. Note, I use simplified numbers in this post that don’t exactly match GiveDirectly’s cost effectiveness, and I believe GiveDirectly is somewhat more impactful than the numbers I use imply. Note, rising incomes mean we don’t value adding an equal dollar amount to people’s incomes the same amount through time. We value a dollar more today because people today are poorer than they will be in the future. During the 20th century, the number of researchers grew exponentially, but productivity growth did not increase (in fact it decreased slightly). If R&D is responsible for the productivity growth, then more research effort is required to achieve each subsequent 1% gain in productivity. Note: this does not mean that the absolute$ increase in incomes shrinks over time. It may decline, stay constant or increase, depending on the rate at which ideas are getting harder to find. Technically, if the “fishing out” parameter $$\phi$$ > 0, then the absolute $benefit increases over time. If $$\phi$$ < 0, it decreases over time. If $$\phi$$ = 0 exactly, it stays constant. I use $$\phi$$ = -2.1, estimated in Bloom et al (2020).) The key point is as follows: when ideas are getting harder to find, the number of new ideas found with a marginal researcher-year is roughly proportional to 1 / (total researcher-years so far). So if the 100th researcher-year finds 1/100 new ideas, the 200th researcher-year will find only 1/200 new ideas and the 1000th researcher-year will only find 1/1000 new ideas. Let’s work through the consequences of this point using an example. Suppose an intervention funds an extra researcher-year in 1900, and doesn’t change the amount of research happening in subsequent years. We’ll estimate the impact of the intervention on TFP in 1900 and in 2000. What’s the impact of the intervention on TFP in 1900? Suppose that a total of 100 researcher-years have occurred by 1900. 
Then the intervention makes the difference between 100 researcher-years vs 101 researchers-years having happened, a difference of 1/100 new ideas. [Here I assume that the number of new ideas found with a marginal researcher-year = 1 / (total researcher-years so far).] 1/100 new ideas correspond to a 0.01% increase in TFP, because we’re defining “an idea” as a 1% TFP increase. What’s the impact of the intervention on TFP in 2000? Suppose that a total of 1000 researcher-years had occurred by 2000. Then the intervention makes the difference between 1000 researcher-years vs 1001 researchers-years having happened, a difference of 1/1000 new ideas. [Here I again assume that the number of new ideas found with a marginal researcher-year = 1 / (total researcher-years so far).] 1/1000 new ideas correspond to a 0.001% increase in TFP. So the intervention raises TFP by 0.01% in 1900, but only by 0.001% in 2000. Its impact on TFP falls towards 0% over time. Mathematically, in the semi-endogenous growth model the effort needed to find a new idea is proportional to TFP^($$\phi$$ – 1), where $$\phi$$ is the parameter controlling how quickly ideas are getting harder to find. I use $$\phi$$ = -1.4, so every time TFP doubles the effort needed to find a new idea increases by 2^2.4 = 5.3. In the long run, there are reasons to think population will fall (fertility rates in developed countries), reasons to think it might increase (relating to biological and cultural evolution), and no compelling reason to think it will stay exactly the same. Still, this feels like a fair ‘default’ case to consider for calculating a stylised value of R&D for our Global Health and Wellbeing team. I discuss some alternative scenarios in appendix E, and list ways the model is optimistic and pessimistic in appendix J. The lag until productivity benefits are felt will probably be larger in low income countries than in high income countries. As mentioned above, I don’t model this lag because it wouldn’t affect the results by much. I use a discount of 0.2%, so a 50 year lag would reduce the returns to R&D by ~10%. Note, the 70% average could arise from <<70% of benefits eventually spilling over to low income countries, and >70% of benefits eventually spilling over to everywhere else. E.g. the poorest half of the global population could get spillovers of 40% while the richest half get spillovers of 100%. In economic growth models, this corresponds to the “stepping on toes” parameter λ = 0.75. I’m not aware of data that pins down λ, and it seems like values between 0.4 and 1 could be correct. I use the estimate from Bloom et al. (2020) Appendix Table A1, where they set λ = 0.75 and then estimate $$\phi$$ = -1.4. The primary effect is recorded as a TFP increase because GDP went up holding constant the amount of labour and physical machinery. The secondary effect is recorded as capital deepening because each person has more physical capital (i.e. more or better machinery). Growth theory relates the size of these effects on income: (income increase from TFP and capital deepening) = (income increase from TFP alone) / (1 – capital share of GDP). The capital share is about 35%, so this multiplies the bottom line by 1 / (1 – 0.35) = 1.5. As mentioned in a previous footnote, I think GiveDirectly is somewhat more impactful than the numbers in this row. For example data from Bloom et al. 2020 find the number of US researchers increasing by 4.3% per year on average since 1930. 
US population grew less than 1.5% per year on average in the same period, implying that the fraction of people doing research was growing. See for example Lee (1988), Kremer (1993), Jones (2001) and Galor and Weil (2000). Though the fraction of people doing research can increase, this can only go on for so long. I discuss this possibility below. There are good theoretical reasons to think TFP can’t grow exponentially at its recent rate for more than 10,000 years, but these don’t rule out exponential growth continuing for another 1000 years. If R&D today expedites a future technology that massively accelerates future growth, the bottom line can increase by much more than 100X. More. My mainline scenario found R&D to be 45% as impactful as giving cash to someone on$500/year. This implies R&D is 4.5% as impactful as our current bar for GHW grantmaking. A 7X increase would leave R&D 31.5% as impactful as the GHW bar. Aghion et al. (2017) discuss the possibility that AI will accelerate productivity growth by automating research tasks. Note, a growth-enhancing technology might allow a constant population of human researchers to maintain ~2% productivity growth, or it might allow them to accelerate productivity growth. Open Philanthropy thinks the latter possibility is more likely than many actors seem to think, for reasons discussed in this report. Of course, growth-enhancing technologies might enable other trends to continue. E.g. the trend of ~2% annual growth in US GDP/capita over the past 150 years, or the trend of growth accelerating over the past 10,000 years. Let’s demonstrate this point with an example. Suppose an intervention causes an extra researcher-year to happen in 2021. Let’s consider its impact on TFP in 2100 if a growth-enhancing technology isn’t developed, and if it is developed. Suppose that if a growth-enhancing technology isn’t developed then a total of 1000 researcher-years will have happened by 2100. Then the intervention makes the difference between 1000 researcher-years vs 1001 researcher-years having happened, a difference of 1/1000 new ideas.  [Here I assume that the number of new ideas found with a marginal researcher-year = 1 / (total researcher-years so far). This is a consequence of ideas getting harder to find.] If a growth-enhancing technology is developed, then a total of 9000 researcher-years will have happened by 2100. Then the intervention makes the difference between a 9000 researcher-years vs 9001 researchers-years having happened, a difference of 1/9000 new ideas.  [Here I assume that the number of new ideas found with a marginal researcher-year = 1 / (total researcher-years so far).] So the intervention causes 1/1000 new ideas in 2100 if a growth-enhancing technology isn’t developed, but only 1/9000 new ideas if it is developed. Note: I assume that the intervention doesn’t change the amount of research done in later years, so it always makes a difference of 1 researcher-year. But if the intervention brought the growth-enhancing technology forward in time, it would increase the amount of research in later years. This would significantly change the calculation. Appendix D does a very rough BOTEC on the returns to R&D for one possible growth-enhancing technology. Eg. see this draft report by Joe Carlsmith on risk from power-seeking AI, or these two posts from the Cold Takes blog. Even if R&D isn’t competitive according to either worldview, might it look competitive according to a weighted sum of both? I think not. 
I estimate funding generic R&D to be ~10X worse than the GHW bar, and it looks significantly worse from a LTist perspective than alternative interventions. Appendix H discusses some potential differences between my perspective and that of Progress Studies advocates. The simulation assumes that the increase in R&D activity is proportional to the increase in funding. This may be optimistic: in reality you need both funding and researchers to do R&D. Essentially, the simulation assumes that more funding will bring with it more researchers, which may be optimistic. So it ignores the additional effect that capital deepening has on TFP increases in standard growth models. It assumes everyone is on the world average income. Representing income inequality wouldn’t change the results. This is because we ultimately care about the percentage effect of R&D on income, and this is the same no matter what people’s starting incomes are. We care about the percentage effect because we assume utility = k + m*ln(income). The specific values used for k and m do not affect the result as they cancel. In practice we use k=0 and m=1. It quantifies the result in this way because this is a metric Open Philanthropy uses internally to compare the impacts from different kinds of intervention. More precisely, the absolute size of the annual increment is 2% of current research intensity: 0.02 * 2.3% = 0.046%. So this assumption corresponds to thinking that research intensity has been growing exponentially at about 2% per year, but this exponential rate of increase will decline over time. See page 20 of Jones and Summers (2020). How long does it take for the “wedge declines more slowly” effect to dominate if we have no discount? Let’s assume we change our value of $$\phi$$ to compensate when we change lambda. How long does it take for the total impact from $$\lambda$$=0.75 to exceed the impact from $$\lambda$$=1? With constant exponential population growth of 1% it takes ~250 years. With population stagnating after 80 years, it takes ~5000 years. If we held $$\phi$$ constant when changing lambda, or used a discount, it would take longer (perhaps never) for the impact from $$\lambda$$=0.75 to exceed the impact from $$\lambda$$=1. I get this expression by differentiating the expression for $$g_A$$ with respect to $$L$$: $$wedge = d(g_A)/dL$$. Why would this happen? We caused some extra counterfactual science to happen: R(t)+1 researcher-years rather than R(t) at each time t. But this extra science makes less % difference to income as ideas become harder to find. 100 vs 101 researcher-years makes a bigger % difference to income than 1000 vs 1001. The wedge halves each time researcher population doubles (assuming the ‘stepping on toes’ parameter $$\lambda$$=1). The relationship between TFP growth g and population growth n in steady state is given by g = $$\lambda$$ * n / (1 – phi). Using $$\lambda$$=1 and $$\phi$$=-2 (values from Bloom et al. 2020) implies g = n/3. Researcher population growth is 3X faster than TFP growth. By the time TFP has grown 30%, population has ~doubled and to the wedge has ~halved. See Bostrom (2003) for a slightly more detailed explanation of this point. This is exactly true in the semi-endogenous framework when the “stepping on toes” parameter lambda=1. If lambda < 1 then the population more than doubles each time the wedge halves. Here’s another way to understand this effect. 
Because ideas are getting harder to find, the number of new ideas found with a marginal researcher-year is roughly proportional to 1 / (total years of research so far). Suppose we fund an extra year of research in 2021. As a result, in 2050 one more researcher-year has occurred. The number of ideas found with this extra researcher-year is proportional to 1 / (total years of research by 2050). If the fraction of people doing R&D increases, this quantity will be smaller. See Bloom et al (2020) and https://mattsclancy.substack.com/p/innovation-gets-mostly-harder. 101 researchers working for 1 year make the same amount of progress as 100 researchers working for (1 + 1/100) years. Another way to think about this is that we assume that the total R&D effort at each time is determined by the level of technology. I.e. the level of technology determines both the population and the research concentration and so determines the total R&D effort. So if we reach a given level of technology x years earlier, we also reach the corresponding level of R&D effort x years earlier. Again, this assumption captures the idea that R&D in 1800 increased the amount of R&D at later times. See data from Roodman (2020). You need ~27 10% income increases to go from $700 to$9600. So you need 27 billion 10% income increases to raise 1 billion people’s income from $700 to$9600. Here I use the fact that all 10% income increases are valued equally, which comes from the log-utility model. 1.1^23 = 14 = 9600 / 700. The assumption that researchers earn the average global wage, rather than a higher wage, will make no difference to the result. This is because I will make an equivalent assumption about the wages of researchers in 2020. The results would change somewhat if you think researchers today demand more (or less) of the premium than researchers in 1800. For consistency with the 1800 calculation (see most recent fn), I assume that 1 researcher-year costs the same as the global average income. Today that is about $10,000. The model for R&D today implies that$100 to R&D has the same welfare effect as raising someone’s income by 7% for one year. So $10,000 has the same effect as raising 100 people’s incomes by 7%. Given the log-utility model, this is roughly the same welfare effect as raising 70 people’s incomes by 10%. 1.028^90 = 12. 1.016^130 = 8 1.008^130 = 3. See Bloom et al. (2013), Coe and Helpman (1995) and references within Jones and Williams (1998) and Jones and Summers (2020). 0.75*0.4%/2.3% = 0.13. E.g. Bloom et al. (2013) estimate a private return of 21% and a social return of 55%. See multiple estimates in table 1 of Jones and Williams (1998) and in Appendix B of Jones and Summers (2020). Conversely, if I’d used the depreciation assumptions used in the empirical estimates, I’d have expected the benefit from marginal R&D to diminish much more quickly over time and calculated a lower total return to R&D. The model here differs from the standard semi-endogenous growth model in two ways. First, the standard semi-endogenous growth model assumes that all TFP growth is due to R&D; this model relaxes this assumption. Second, this model translates income changes to welfare changes using a log-utility model. E.g. see Patrick Collison here, 19:57-25:32. I.e. as income falls, the marginal utility of income increases by more. 
https://ourworldindata.org/grapher/gdp-per-capita-worldbank?tab=chart&yScale=log&country=OWID_WRL~USA Imagine someone in 1990 who earns$40k being offered the following choice: To make the comparison fair, in option 2 you should be forced to buy all goods and services at 2020 prices. This means that some services will be more expensive than in 1990. Even so, I think option 2 is better because there are many new products that are cheap but useful. If this doesn’t seem right, you might consider a more extreme version of the thought experiment. Instead of comparing 2020 with 1990, compare 2020 with 990. Imagine offering a medieval peasant enough medieval goods and services — food, land, servants — that their income is $60k a year. Would they prefer that, or being on$60k in 2020 with the conveniences of central heating, clean running water, varied food, modern medicine, etc? Here it seems clear to me that the latter option is better. How much does the income increase underestimate the welfare benefit of technological progress? You could roughly estimate this by increasing the cash payment in option 1. E.g. suppose you’re indifferent between option 2 and option 1’) A cash payment of $40k. This would imply that the true welfare benefit is very roughly 2X the income increase. (This is very rough because the utility gains from receiving$40k are not twice those from $20k, due to diminishing returns to$.) Phil Trammell describes one such utility function here. Also see discussion by Matt Clancy in the “What about other benefits?” section of this post.

## History of Philanthropy: Work We’ve Commissioned

This page collects work we’ve commissioned on the history of philanthropy, as outlined on our focus area page.

## Case studies we’ve commissioned

Case study on the Healthcare for the Homeless Program, by Ben Soskis:

Case study on the Pew Charitable Trusts’ drug safety legislation program, by Tamara Mann Tweel:

Case study on the role of multiple funders in the passage of the Affordable Care Act, by Ben Soskis:

Case study on the founding of the Center on Budget and Policy Priorities, by Suzanne Kahn:

Case study on the founding of the Center for Global Development, by Ben Soskis:

Case study on the role of the Center on Budget and Policy Priorities in state EITC programs, by Suzanne Kahn:

Case study on the Clinton Health Access Initiative’s role in global price drops for antiretroviral drugs, by Tamara Mann Tweel:

Case study on Philanthropy’s Role in The Fight for Marriage Equality, by Benjamin Soskis:

Case study on the role of philanthropy in promoting nuclear nonproliferation and threat reduction, by Paul Rubinson:

## Case studies produced by the Center on Nonprofits and Philanthropy at the Urban Institute

Case study on the Pugwash conferences on science and world affairs, by Paul Rubinson:

Philanthropy and US Student Movements: Four Cases, by Maoz Brown:

Literature Review: Conservative Philanthropy in Higher Education, by David Austin Walsh:

## Case studies we’ve completed

Some case studies in early field growth, by Luke Muehlhauser:

## Other work we’ve commissioned

Historian Ben Soskis reviewed the existing literature on the history of philanthropy and created the following resources:

• Annotated bibliography: A list of books he identified as possibly informative. He briefly reviewed each book and summarized its contents.
• Extended bibliography: A list of books he reviewed (covered in the list above); books he considered but did not review; and books he would review if he spent more time on this project.
• Process and findings: The process he used to select books for consideration and his preliminary conclusions about the state of the literature.

We also support Ben’s work on HistPhil. This work is structured as part of Ben’s consulting for us rather than as a grant, which is why it does not appear in our grants database.

We have also commissioned reviews of the existing literature on several particular cases of philanthropic impact:

## Could Advanced AI Drive Explosive Economic Growth?

This report evaluates the likelihood of ‘explosive growth’, meaning > 30% annual growth of gross world product (GWP), occurring by 2100. Although frontier GDP/capita growth has been constant for 150 years, over the last 10,000 years GWP growth has accelerated significantly. Endogenous growth theory, together with the empirical fact of the demographic transition, can explain both trends. Labor, capital and technology were accumulable over the last 10,000 years, meaning that their stocks all increased as a result of rising output. Increasing returns to these accumulable factors accelerated GWP growth. But in the late 19th century, the demographic transition broke the causal link from output to the quantity of labor. There were not increasing returns to capital and technology alone and so growth did not accelerate; instead frontier economies settled into an equilibrium growth path defined by a balance between a growing number of researchers and diminishing returns to research.

This theory implies that explosive growth could occur by 2100. If automation proceeded sufficiently rapidly (e.g. due to progress in AI) there would be increasing returns to capital and technology alone. I assess this theory and consider counter-arguments stemming from alternative theories; expert opinion; the fact that 30% annual growth is wholly unprecedented; evidence of diminishing returns to R&D; the possibility that a few non-automated tasks bottleneck growth; and others. Ultimately, I find that explosive growth by 2100 is plausible but far from certain.

## 1. How to read this report

Read the summary (~1 page). Then read the main report (~30 pages).

The rest of the report contains extended appendices to the main report. Each appendix expands upon specific parts of the main report. Read an appendix if you’re interested in exploring its contents in greater depth.

I describe the contents of each appendix here. The best appendix to read is probably the first, Objections to explosive growth. Readers may also be interested to read reviews of the report.

Though the report is intended to be accessible to non-economists, readers without an economics background may prefer to read the accompanying blog post.

## 2. Why we are interested in explosive growth

Open Philanthropy wants to understand how far away we are from developing transformative artificial intelligence (TAI). Difficult as this is to estimate, a working timeline for TAI helps us prioritize among our cause areas, including potential risks from advanced AI.

In her draft report, my colleague Ajeya Cotra uses TAI to mean ‘AI which drives Gross World Product (GWP) to grow at ~20-30% per year’ – roughly ten times faster than it is growing currently. She estimates a high probability of TAI by 2100 (~80%), and a substantial probability of TAI by 2050 (~50%). These probabilities are broadly consistent with the results from expert surveys,1 and with plausible priors for when TAI might be developed.2

Nonetheless, intuitively speaking these are high probabilities to assign to an ‘extraordinary claim’. Are there strong reasons to dismiss these estimates as too high? One place such reasons might come from is economic forecasting. If economic extrapolations gave us strong reasons to think GWP will grow at ~3% a year until 2100, this would rule out explosive growth and so rule out TAI being developed this century.

I find that economic considerations don’t provide a good reason to dismiss the possibility of TAI being developed in this century. In fact, there is a plausible economic perspective from which sufficiently advanced AI systems are expected to cause explosive growth.

## 3. Summary

If you’re not familiar with growth economics, I recommend you start by reading this glossary or my blog post about the report.

Since 1900, frontier GDP/capita has grown at about 2% annually.3 There is no sign that growth is speeding up; if anything, recent data suggests that growth is slowing down. So why think that > 30% annual growth of GWP (‘explosive growth’) is plausible this century?

I identify three arguments to think that sufficiently advanced AI could drive explosive growth:

1. Idea-based models of very long-run growth imply AI could drive explosive growth.
• Growth rates have significantly increased (super-exponential growth) over the past 10,000 years, and even over the past 300 years. This is true both for GWP growth, and frontier GDP/capita growth.
• Idea-based models explain increasing growth with an ideas feedback loop: more ideas → more output → more people → more ideas… Idea-based models seem to have a good fit to the long-run GWP data, and offer a plausible explanation for increasing growth.
• After the demographic transition in ~1880, more output did not lead to more people; instead people had fewer children as output increased. This broke the ideas feedback loop, and so idea-based theories expect growth to stop increasing shortly after that time. Indeed, this is what happened: since ~1900 growth has not increased but has been roughly constant.
• Suppose we develop AI systems that can substitute very effectively for human labor in producing output and in R&D. The following ideas feedback loop could occur: more ideas → more output → more AI systems → more ideas… Before 1880, the ideas feedback loop led to super-exponential growth. So our default expectation should be that this new ideas feedback loop will again lead to super-exponential growth.
2. A wide range of growth models predict explosive growth if capital can substitute for labor. Here I draw on models designed to study the recent period of exponential growth. If you alter these models with the assumption that capital can substitute very effectively for labor, e.g. due to the development of advanced AI systems, they typically predict explosive growth. The mechanism is similar to that discussed above. Capital accumulation produces a powerful feedback loop that drives faster growth: more capital → more output → more capital …. These first two arguments both reflect an insight of endogenous growth theory: increasing returns to accumulable inputs can drive accelerating growth.
3. An ignorance perspective assigns some probability to explosive growth. We may not trust highly-specific models that attempt to explain why growth has increased over the long-term, or why it has been roughly constant since 1900. But we do know that the pace of growth has increased significantly over the course of history. Absent deeper understanding of the mechanics driving growth, it would be strange to rule out growth increasing again. 120 years of steady growth is not enough evidence to rule out a future increase.

I discuss a number of objections to explosive growth:

• 30% growth is very far out of the observed range.
• Models predicting explosive growth have implausible implications – like output going to infinity in finite time.
• There’s no evidence of explosive growth in any subsector of the economy.
• Limits to automation are likely to prevent explosive growth.
• Won’t diminishing marginal returns to R&D prevent explosive growth?
• And many others.

Although some of these objections are partially convincing, I ultimately conclude that explosive growth driven by advanced AI is a plausible scenario.

In addition, the report covers themes relating to the possibility of stagnating growth; I find that it is a highly plausible scenario. Exponential growth in the number of researchers has been accompanied by merely constant GDP/capita growth over the last 80 years. This trend is well explained by semi-endogenous growth models in which ideas are getting harder to find.4 As population growth slows over the century, the number of researchers will likely grow more slowly; semi-endogenous growth models predict that GDP/capita growth will slow as a result.

Thus I conclude that the possibilities for long-run growth are wide open. Both explosive growth and stagnation are plausible.

Acknowledgements: My thanks to Holden Karnofsky for prompting this investigation; to Ajeya Cotra for extensive guidance and support throughout; to Ben Jones, Dietrich Vollrath, Paul Gaggl, and Chad Jones for helpful comments on the report; to Anton Korinek, Jakub Growiec, Phil Trammell, Ben Garfinkel, David Roodman, and Carl Shulman for reviewing drafts of the report in depth; to Harry Mallinson for reviewing code I wrote for this report and helpful discussion; to Joseph Carlsmith, Nick Beckstead, Alexander Berger, Peter Favaloro, Jacob Trefethen, Zachary Robinson, Luke Muehlhauser, and Luisa Rodriguez for valuable comments and suggestions; and to Eli Nathan for extensive help with citations and the website.

## 4. Main report

If you’re not familiar with growth economics, I recommend you start by reading this glossary or my blog post about the report.

How might we assess the plausibility of explosive growth (>30% annual GWP growth) occurring by 2100? First, I consider the raw empirical data; then I address a number of additional considerations.

• What do experts think (here)?
• How does economic growth theory affect the case for explosive growth (here and here)?
• How strong are the objections to explosive growth (here)?
• Conclusion (here).

#### 4.1 Empirical data without theoretical interpretation

When looking at the raw data, two conflicting trends jump out.

The first trend is the constancy of frontier GDP/capita growth over the last 150 years.5 The US is typically used to represent this frontier. The following graph from Our World in Data shows US GDP/capita since 1870.

The y-axis is logarithmic, so the straight line indicates that growth has happened at a constant exponential rate – ~2% per year on average.6 Extrapolating the trend, frontier GDP/capita will grow at ~2% per year until 2100. GWP growth will be slightly higher, as it also includes a small boost from population growth and catch-up growth. Explosive growth would be a very large break from this trend.

I refer to forecasts along these lines as the standard story. Note, I intend the standard story to encompass a wide range of views, including the view that growth will slow down significantly by 2100 and the view that it will rise to (e.g.) 4% per year.

The second trend is the super-exponential growth of GWP over the last 10,000 years.7 (Super-exponential means the growth rate increases over time.) Another graph from Our World in Data shows GWP over the last 2,000 years:

Again, the y-axis is logarithmic, so the increasing steepness of the slope indicates that the growth rate has increased.

It’s not just GWP – there’s a similar super-exponential trend in long-run GDP/capita in many developed countries – see the graphs of US, English, and French GDP/capita in section 14.3.8 (Later I discuss whether we can trust these pre-modern data points.)

It turns out that a simple equation called a ‘power law’ is a good fit to GWP data going all the way back to 10,000 BCE. The following graph (from my colleague David Roodman) shows the fit of a power law (and of exponential growth) to the data. The axes of the graph are chosen so that the power law appears as a straight line.9

If you extrapolate this power law trend into the future, it implies that the growth rate will continue to increase into the future and that GWP will approach infinity by 2047!10

Many other simple curves fitted to this data also predict that explosive (>30%) growth will occur in the next few decades. Why is this? The core reason is that the data shows the growth rate increasing more and more quickly over time. It took thousands of years for growth to increase from 0.03% to 0.3%, but only a few hundred years for it to increase from 0.3% to 3%.11 If you naively extrapolate this trend, you predict that growth will increase again from 3% to 30% within a few decades.
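
To see the arithmetic behind this naive extrapolation (a back-of-envelope sketch of my own, not a calculation from Roodman (2020); T and β are just the singularity date and exponent of whatever power law is fitted): if GWP ∝ (T − t)^(−β), the growth rate is g(t) = β/(T − t), so the time between growth reaching 3% and growth reaching 30% is

$$\frac{\beta}{0.03} - \frac{\beta}{0.30} = 30\beta \ \text{years},$$

which is a few decades (or less) for any β of order one.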

We can see this pattern more clearly by looking at a graph of how GWP growth has changed over time.12

The graph shows that the time needed for the growth rate to double has fallen over time. (Later I discuss whether this data can be trusted.) Naively extrapolating the trend, you’d predict explosive growth within a few decades.

I refer to forecasts along these lines, that predict explosive growth by 2100, as the explosive growth story.

So we have two conflicting stories. The standard story points to the steady ~2% growth in frontier GDP/capita over the last 150 years, and expects growth to follow a similar pattern out to 2100. The explosive growth story points to the super-exponential growth in GWP over the last 10,000 years and expects growth to increase further to 30% per year by 2100.

Which story should we trust? Before taking into account further considerations, I think we should put some weight on both. For predictions about the near future I would put more weight on the standard story because its data is more recent and higher quality. But for predictions over longer timescales I would place increasing weight on the explosive growth story as it draws on a longer data series.

Based on the two empirical trends alone, I would neither confidently rule out explosive growth by 2100 nor confidently expect it to happen. My attitude would be something like: ‘Historically, there have been significant increases in growth. Absent a deeper understanding of the mechanisms driving these increases, I shouldn’t rule out growth increasing again in the future.’ I call this attitude the ignorance story.13 The rest of the main report raises considerations that can move us away from this attitude (either towards the standard story or towards the explosive growth story).

#### 4.2 Expert opinion

In the most recent and comprehensive expert survey on growth out to 2100 that I could find, all the experts assigned low probabilities to explosive growth.

All experts thought it 90% likely that the average annual GDP/capita growth out to 2100 would be below 5%.14 Strictly speaking, the survey data is compatible with experts thinking there is a 9% probability of explosive growth this century, but this seems unlikely in practice. The experts’ quantiles, both individually and in aggregate, were a good fit for normal distributions which would assign ≪ 1% probability to explosive growth.15

Experts’ mean estimate of annual GWP/capita growth was 2.1%, with standard deviation 1.1%.16 So their views support the standard story and are in tension with the explosive growth story.

There are three important caveats:

1. Lack of specialization. My impression is that long-run GWP forecasts are not a major area of specialization, and that the experts surveyed weren’t experts specifically in this activity. Consonant with this, survey participants did not consider themselves to be particularly expert, self-reporting their level of expertise as 6 out of 10 on average.17
2. Lack of appropriate prompts. Experts were provided with data on growth rates for the period 1900-2000, and primed with a ‘warm-up question’ about the recent growth of US GDP/capita. But no information was provided about the longer-run super-exponential trend, or about possible mechanisms for producing explosive growth (like advanced AI). The respondents may have assigned higher probabilities to explosive growth by 2100 if they’d been presented with this information.
3. No focus on tail outcomes. Experts were not asked explicitly about explosive growth, and were not given an opportunity to comment on outcomes they thought were < 10% likely to occur.

#### 4.3 Theoretical models used to extrapolate GWP out to 2100

Perhaps economic growth theory can shed light on whether to extrapolate the exponential trend (standard story) or the super-exponential trend (explosive growth story).

• Do the growth models of the standard story give us reason beyond the empirical data to think 21st century growth will be exponential or sub-exponential?
• They could do this if they point to a mechanism explaining recent exponential growth, and this mechanism will continue to operate in the future.
• Do the growth models of the explosive growth story give us reason beyond the empirical data to think 21st century growth will be super-exponential?
• They could do this if they point to a mechanism explaining the long-run super-exponential growth, and this mechanism will continue to operate in the future.

My starting point is the models actually used to extrapolate GWP to 2100, although I draw upon economic growth theory more widely in making my final assessment. First, I give a brief explanation of how growth models work.

#### 4.3.1 How do growth models work?

In economic growth models, a number of inputs are combined to produce output. Output is interpreted as GDP (or GWP). Typical inputs include capital (e.g. equipment, factories), labor (human workers), human capital (e.g. skills, work experience), and the current level of technology.18

Some of these inputs are endogenous,19 meaning that the model explains how the input changes over time. Capital is typically endogenous; output is invested to sustain or increase the amount of capital.20 In the following diagram, capital and human capital are endogenous:

Other inputs may be exogenous. This means their values are determined using methods external to the growth model. For example, you might make labor exogenous and choose its future values using UN population projections. The growth model does not (attempt to) explain how the exogenous inputs change over time.

When a growth model makes more inputs endogenous, it models more of the world. It becomes more ambitious, and so more debatable, but it also gains the potential to have greater explanatory power.

#### 4.3.2 Growth models extrapolating the exponential trend to 2100

I looked at a number of papers in line with the standard story that extrapolate GWP out to 2100. Most of them treated technology as exogenous, typically assuming that technology will advance at a constant exponential rate.21 In addition, they all treated labor as exogenous, often using UN projections. These growth models can be represented as follows:

The blue ‘+’ signs represent that the increases to labor and technology each year are exogenous, determined outside of the model.

In these models, the positive feedback loop between output and capital is not strong enough to produce sustained growth. This is due to diminishing marginal returns to capital. This means that each new machine adds less and less value to the economy, holding the other inputs fixed.22 Even the feedback loop between output and (capital + human capital) is not strong enough to sustain growth in these models, again due to diminishing returns.

Instead, long-run growth is driven by the growth of the exogenous inputs, labor and technology. For this reason, these models are called exogenous growth models: the ultimate source of growth lies outside of the model. (This is contrasted with endogenous growth models, which try to explain the ultimate source of growth.)

It turns out that long-run growth of GDP/capita is determined solely by the growth of technology.23 These models do not (try to) explain the pattern of technology growth, and so they don’t ultimately explain the pattern of GDP/capita growth.
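
As a concrete illustration (a toy sketch of my own, not a model taken from the report, and with arbitrary parameter values), the snippet below simulates an exogenous growth model of this kind: Cobb-Douglas output, capital accumulated out of saved output, and labor and technology growing at fixed exogenous rates. Whatever savings rate you pick, long-run GDP/capita growth settles down to a rate pinned down by the exogenous technology trend.

```python
# Toy exogenous growth model (illustrative sketch only):
#   Y = A * K^alpha * L^(1 - alpha)
# Capital is endogenous (accumulated from saved output); labor and technology
# grow at fixed exogenous rates, as in the models described above.

def percapita_growth(savings_rate, years=400, alpha=0.3, depreciation=0.05,
                     tech_growth=0.02, pop_growth=0.01):
    A, K, L = 1.0, 1.0, 1.0
    path = []
    for _ in range(years):
        Y = A * K ** alpha * L ** (1 - alpha)
        path.append(Y / L)
        K += savings_rate * Y - depreciation * K   # endogenous: more output -> more capital
        A *= 1 + tech_growth                       # exogenous technology
        L *= 1 + pop_growth                        # exogenous labor
    # average growth of GDP/capita over the last 50 simulated years
    return (path[-1] / path[-51]) ** (1 / 50) - 1

for s in (0.1, 0.2, 0.4):
    print(f"savings rate {s:.1f}: long-run GDP/capita growth ~ {percapita_growth(s):.2%}")

# All three settle near tech_growth / (1 - alpha), about 2.9% a year here: the
# savings rate changes the level of income, but diminishing returns to capital
# mean long-run per-capita growth is pinned down by the exogenous technology trend.
```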

#### 4.3.2.1 Evaluating models extrapolating the exponential trend

The key question of this section is: Do the growth models of the standard story give us reason beyond the empirical data to think 21st century growth of frontier GDP/capita will be exponential or sub-exponential?

My answer is ‘yes’. Although the exogenous models used to extrapolate GWP to 2100 don’t ultimately explain why GDP/capita has grown exponentially, there are endogenous growth models that address this issue. Plausible endogenous models explain this pattern and imply that 21st century growth will be sub-exponential. This is consistent with the standard story. Interestingly, I wasn’t convinced by models implying that 21st century growth will be exponential.

The rest of this section explains my reasoning in more detail.

Endogenous growth theorists have for many decades sought theories where long-run growth is robustly exponential. However, they have found it strikingly difficult. In endogenous growth models, long-run growth is typically only exponential if some knife-edge condition holds. A parameter of the model must be exactly equal to some specific value; the smallest disturbance in this parameter leads to completely different long-run behavior, with growth either approaching infinity or falling to 0. Further, these knife-edges are typically problematic: there’s no particular reason to expect the parameter to have the precise value needed for exponential growth. This problem is often called the ‘linearity critique’ of endogenous growth models.

Appendix B argues that many endogenous growth models contain problematic knife-edges, drawing on discussions in Jones (1999), Jones (2005), Cesaratto (2008), and Bond-Smith (2019).

Growiec (2007) proves that a wide class of endogenous growth models require a knife-edge condition to achieve constant exponential growth, generalizing the proof of Christiaans (2004). The proof doesn’t show that all such conditions are problematic, as there could be mechanisms explaining why knife-edges hold. However, combined with the observation that many popular models contain problematic knife-edges, the proof suggests that it may be generically difficult to explain exponential growth without invoking problematic knife-edge conditions.

Two attempts to address this problem stand out:

1. Claim that exponential population growth has driven exponential GDP/capita growth. This is an implication of semi-endogenous growth models (Jones 1995). These models are consistent with 20th century data: exponentially growing R&D effort has been accompanied by exponential GDP/capita growth. Appendix B argues that semi-endogenous growth models offer the best framework for explaining the recent period of exponential growth.24 However, I do not think their ‘knife-edge’ assumption that population will grow at a constant exponential rate is likely to be accurate until 2100. In fact, the UN projects that population growth will slow significantly over the 21st century. With this projection, semi-endogenous growth models imply that GDP/capita growth will slow.25 So these models imply 21st century growth will be sub-exponential rather than exponential.26 (See the illustrative sketch after this list.)
2. Claim that market equilibrium leads to exponential growth without knife-edge conditions.
• In a 2020 paper Robust Endogenous Growth, Peretto outlines a fully endogenous growth model that achieves constant exponential growth of GDP/capita without knife-edge conditions. The model displays increasing returns to R&D investment, which would normally lead to super-exponential growth. However, these increasing returns are ‘soaked up’ by the creation of new firms which dilute R&D investment. Market incentives ensure that new firms are created at exactly the rate needed to sustain exponential growth.
• The model seems to have some implausible implications. Firstly, it implies that there should be a huge amount of market fragmentation, with the number of firms growing more quickly than the population. This contrasts with the striking pattern of market concentration we see in many areas.27 Secondly, it implies that if no new firms were introduced – e.g. because this was made illegal – then output would reach infinity in finite time. This seems to imply that there is a huge market failure: private incentives to create new firms massively reduce long-run social welfare.
• Despite these problems, the model does raise the possibility that an apparent knife-edge holds in reality due to certain equilibrating pressures. Even if this model isn’t quite right, there may still be equilibrating pressures of some sort.28
• Overall, this model slightly raises my expectation that long-run growth will be exponential.29
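
To illustrate point 1 above (a toy sketch of my own, with made-up parameter values rather than anything estimated in the report), the snippet below simulates a Jones-style semi-endogenous law of motion for technology with diminishing returns to the existing stock of ideas, once with constant population growth and once with population growth tapering to zero. In the first case technology growth holds steady; in the second it slowly declines, i.e. growth becomes sub-exponential.

```python
# Toy Jones-style semi-endogenous technology dynamics (illustrative only):
#   dA/dt = delta * L**lam * A**phi,  with phi < 1 ("ideas are getting harder to find").

def tech_growth_path(pop_growth_fn, years=300, dt=0.1,
                     delta=0.02, lam=1.0, phi=0.5):
    A, L = 1.0, 1.0
    record_every = round(100 / dt)
    growth_at = {}
    for step in range(int(years / dt)):
        t = step * dt
        g = delta * (L ** lam) * (A ** (phi - 1))   # growth rate of technology
        if step % record_every == 0:
            growth_at[int(t)] = round(g, 4)          # record at t = 0, 100, 200
        A += g * A * dt
        L *= (1 + pop_growth_fn(t)) ** dt            # exogenous population path
    return growth_at

constant_n = lambda t: 0.01                          # 1% population growth forever
slowing_n = lambda t: 0.01 * max(0.0, 1 - t / 150)   # tapers to zero by year 150

print("constant population growth:", tech_growth_path(constant_n))
print("slowing population growth: ", tech_growth_path(slowing_n))
# With constant 1% population growth, technology growth stays at the steady-state
# rate lam * n / (1 - phi) = 2% a year; with population growth tapering off,
# technology growth (and hence income growth) declines: the sub-exponential
# outcome described in point 1 above.
```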

This research shifted my beliefs in a few ways:

• I put more probability (~75%) on semi-endogenous growth models explaining the recent period of exponential growth.30
• So I put more weight on 21st century growth being sub-exponential.
• We’ll see later that these models imply that sufficiently advanced AI could drive explosive growth. So I put more weight on this possibility as well.
• It was harder than I expected for growth theories to adequately explain why income growth should be exponential in a steady state (rather than sub- or super-exponential). So I put more probability on the recent period of exponential growth being transitory, rather than part of a steady state.
• For example, the recent period could be a transition between past super-exponential growth and future sub-exponential growth, or a temporary break in a longer pattern of super-exponential growth.
• This widens the range of future trajectories that I regard as being plausible.

#### 4.3.3 Growth models extrapolating the super-exponential trend

Some growth models extrapolate the long-run super-exponential trend to predict explosive growth in the future.31 Let’s call them long-run explosive models. The ones I’m aware of are ‘fully endogenous’, meaning all inputs are endogenous.32

Crucially, long-run explosive models claim that more output → more people. This makes sense (for example) when food is scarce: more output means more food, allowing the population to grow. This assumption is important, so it deserves a name. Let’s say these models make population accumulable. More generally, an input is accumulable just if more output → more input.33

The term ‘accumulable’ is from the growth literature; the intuition behind it is that the input can be accumulated by increasing output.

It’s significant for an input to be accumulable as it allows a feedback loop to occur: more output → more input → more output →… Population being accumulable is the most distinctive feature of long-run explosive models.

Long-run explosive models also make technology accumulable: more output → more people → more ideas (technological progress).

All growth models, even exogenous ones, imply that capital is accumulable: more output → more reinvestment → more capital.34 In this sense, long-run explosive models are a natural extension of the exogenous growth models discussed above: a similar mechanism typically used to explain capital accumulation is used to explain the accumulation of technology and labor.

We can roughly represent long-run explosive models as follows:35

The orange arrows show that all the inputs are accumulable: a marginal increase in output leads to an increase in the input. Fully endogenous growth models like these attempt to model more of the world than exogenous growth models, and so are more ambitious and debatable; but they potentially have greater explanatory power.

Why do these models predict super-exponential growth? The intuitive reason is that, with so many accumulable inputs, the feedback loop between the inputs and output is powerful enough that growth becomes faster and faster over time.

More precisely, the key is increasing returns to scale in accumulable inputs: when we double the level of every accumulable input, output more than doubles.36

Why are there increasing returns to scale? The key is the insight, from Romer (1990), that technology is non-rival. If you use a new solar panel design in your factory, that doesn’t prevent me from using that same design in my factory; whereas if you use a particular machine/worker, that does prevent me from using that same machine/worker.

Imagine doubling the quantity of labor and capital, holding technology fixed. You could literally replicate every factory and worker inside it, and make everything you currently make a second time. Output would double. Crucially, you wouldn’t need to double the level of technology because ideas are non-rival: twice as many factories could use the same stock of ideas without them ‘running out’.

Now imagine also doubling the level of technology. We’d still have twice as many factories and twice as many workers, but now each factory would be more productive. Output would more than double. This is increasing returns to scale: double the inputs, more than double the output.37
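
In symbols, using a standard Cobb-Douglas form purely for illustration (my notation, not the report’s): with F(A, K, L) = A·K^α·L^(1−α),

$$F(A, 2K, 2L) = 2\,F(A, K, L), \qquad F(2A, 2K, 2L) = 4\,F(A, K, L),$$

so doubling capital and labor alone doubles output, while also doubling the non-rival stock of ideas more than doubles it.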

Long-run explosive models assume that capital, labor and technology are all accumulable. Even if they include a fixed input like land, there are typically increasing returns to accumulable inputs. This leads to super-exponential growth unless the diminishing returns to technology R&D are very steep.38 For a wide range of plausible parameter values, these models predict super-exponential growth.39

The key feedback loop driving increasing returns and super-exponential growth in these models can be summarized as more ideas (technological progress) → more output → more people → more ideas→…
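
The snippet below is a deliberately minimal version of this feedback loop (a toy construction of my own, leaving out capital and land, with arbitrary parameter values): output is produced from ideas and labor, part of output is converted into more people, and people generate new ideas with diminishing returns to the existing stock (phi < 1). The growth rate of output keeps rising, i.e. growth is super-exponential.

```python
# Toy "long-run explosive" dynamics (illustrative sketch only). Output Y = A * L;
# population is accumulable (more output -> more people) and people generate
# ideas with diminishing returns to the existing stock (phi < 1).

def simulate(s=0.02, delta=0.02, phi=0.5, dt=0.05, max_years=500):
    A, L = 1.0, 1.0
    steps_per_report = round(25 / dt)
    for step in range(int(max_years / dt)):
        t = step * dt
        g_L = s * A                        # dL/dt = s*Y = s*A*L, so g_L = s*A
        g_A = delta * L * A ** (phi - 1)   # dA/dt = delta*L*A^phi, phi < 1
        g_Y = g_A + g_L                    # Y = A*L, so g_Y = g_A + g_L
        if step % steps_per_report == 0:
            print(f"year {t:5.0f}: output growth {g_Y:6.1%}")
        if g_Y > 0.30:
            print(f"output growth passes 30% a year around year {t:.0f}")
            break
        L += g_L * L * dt
        A += g_A * A * dt

# Despite diminishing returns to ideas (phi < 1), the growth rate keeps rising:
# with population (and hence research effort) accumulable, returns to scale in
# the accumulable inputs are increasing, giving the super-exponential pattern
# described above.
simulate()
```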

These models seem to be a good fit to the long-run GWP data. The model in Roodman (2020) implies that GWP follows a ‘power-law’, which seems to fit the data well.

Long-run explosive models fitted to the long-run GWP data typically predict that explosive growth (>30% per year) is a few decades away. For example, if you ask the model in Roodman (2020) when the first year of explosive growth will be, its median prediction is 2043 and the 80% confidence range is [2034, 2065].

#### 4.3.3.1 Evaluating models extrapolating the super-exponential trend

The key question of this section is: Do the growth models of the explosive growth story give us reason to think 21st century growth will be super-exponential? My answer in this section is ‘no’, because the models are not well suited to describing post-1900 growth. In addition, it’s unclear how much we should trust their description of pre-1900 growth. (However, the next section argues these models can be trusted if we develop sufficiently powerful AI systems.)

#### 4.3.3.1.1 Problem 1: Long-run explosive models are not suitable for describing post-1900 growth

The central problem is that long-run explosive models assume population is accumulable.40 While it is plausible that in pre-modern times more output → more people, this hasn’t been true in developed countries over the last ~140 years. In particular, since ~1880 fertility rates have declined despite increasing GDP/capita.41 This is known as the demographic transition. Since then, more output has not led to more people, but to richer and better educated people: more output → more richer people. Population is no longer accumulable (in the sense that I’ve defined the term).42 The feedback loop driving super-exponential growth is broken: more ideas → more output → more richer people → more ideas.

How would this problem affect the models’ predictions? If population is not accumulable, then the returns to accumulable inputs are lower, and so growth is slower. We’d expect long-run explosive models to predict faster growth than we in fact observe after ~1880; in addition we wouldn’t expect to see super-exponential growth after ~1880.43

Indeed, this is what the data shows. Long-run explosive models are surprised at how slow GWP growth has been since 1960 (more), and at how slow frontier GDP/capita growth has been since 1900 (more). It is not surprising that a structural change means a growth model is no longer predictively accurate: growth models are typically designed to work in bounded contexts, rather than being universal theories of growth.

A natural hypothesis is that the reason why long-run explosive models are a poor fit to the post-1900 data is that they make an assumption about population that has been inaccurate since ~1880. The recent data is not evidence against long-run explosive models per se, but confirmation that their predictions can only be trusted when population is accumulable.

This explanation is consistent with some prominent idea-based theories of very long-run growth.44 These theories use the same mechanism as long-run explosive models to explain pre-1900 super-exponential growth: labor and technology are accumulable, so there are increasing returns to accumulable inputs,45 so there’s super-exponential growth. They feature the same ideas feedback loop: more ideas → more output → more people → more ideas→…46

These idea-based theories are made consistent with recent exponential growth by adding an additional mechanism that makes the fertility rate drop once the economy reaches a mature stage of development,47 mimicking the effect of the demographic transition. After this point, population isn’t accumulable and the models predict exponential growth by approximating some standard endogenous or semi-endogenous model.48

These idea-based models provide a good explanation of very long-run growth and modern growth. They increase my confidence in the main claim of this section: long-run explosive models are a poor fit to the post-1900 data because they (unrealistically) assume population is accumulable. However, idea-based models are fairly complex and were designed to explain long-run patterns in GDP/capita and population; this should make us wary of trusting them too much.49

#### 4.3.3.1.2 Problem 2: It is unclear how much we should trust long-run explosive models’ explanation of pre-1900 growth

None of the problems discussed above dispute the explosive growth story’s explanation of pre-1900 growth. How much weight should we put on its account?

It emphasizes the non-rivalry of ideas and the mechanism of increasing returns to accumulable factors. This mechanism implies growth increased fairly smoothly over hundreds and thousands of years.50 We saw that the increasing-returns mechanism plays a central role in several prominent models of long-run growth.51

However, most papers on very long run growth emphasize a different explanation, where a structural transition occurs around the industrial revolution.52 Rather than a smooth increase, this suggests a single step-change in growth occurred around the industrial revolution, without growth increasing before or after the step-change.53

Though a ‘step-change’ view of long-run growth rates is less likely to predict explosive growth by 2100, it would not rule it out. To rule it out, you would have to explain why step-change increases have occurred in the past but no more will occur in the future.54

How much weight should we place on the increasing-returns mechanism versus the step-change view? The ancient data points are highly uncertain, making it difficult to adjudicate empirically.55 Though GWP growth seems to have increased across the whole period 1500 – 1900, this is compatible with there being one slow step-change.56

There is some informative evidence:

• Kremer (1993) gives evidence for the increasing-returns mechanism. He looks at the development of 5 isolated regions and finds that the technology levels of the regions in 1500 are perfectly rank-correlated with their initial populations in 10,000 BCE. This is just what the increasing returns mechanism would predict.57
• Roodman (2020) gives evidence for the step-change view. Roodman finds that his own model, which uses the increasing-returns mechanism, is surprised by the speed of growth around the industrial revolution (see more).

Overall, I think it’s likely that the increasing-returns mechanism plays an important role in explaining very long-run growth. As such I think we should take long-run explosive models seriously (if population is accumulable). That said, they are not the whole story; important structural changes happened around the industrial revolution.58

#### 4.3.4 Summary of theoretical models used to extrapolate GWP out to 2100

I repeat the questions asked at the start of this section, now with their answers:

• Do the growth models of the standard story give us reason beyond the empirical data to think 21st century growth will be exponential or sub-exponential?
• Yes, plausible models imply that growth will be sub-exponential. Interestingly, I didn’t find convincing reasons to expect exponential growth.
• Do the growth models of the explosive growth story give us reason beyond the empirical data to think 21st century growth will be super-exponential?
• No, long-run explosive models assume population is accumulable, which isn’t accurate after ~1880.
• However, the next section argues that advanced AI could make this assumption accurate once more. So I think these models do give us reason to expect explosive growth if sufficiently advanced AI is developed.
|  | STANDARD STORY | EXPLOSIVE GROWTH STORY |
| --- | --- | --- |
| Preferred data set | Frontier GDP/capita since 1900 | GWP since 10,000 BCE |
| Predicted shape of long-run growth | Exponential or sub-exponential | Super-exponential (for a while, and then eventually sub-exponential) |
| Models used to extrapolate GWP to 2100 | Exogenous growth models | Endogenous growth model, where population and technology are accumulable. |
| Evaluation | Semi-endogenous growth models are plausible and predict 21st century growth will be sub-exponential. Theories predicting exponential growth rely on problematic knife-edge conditions. | Population is no longer accumulable, so we should not trust these models by default. However, advanced AI systems could make this assumption realistic again, in which case the prediction of super-exponential growth can be trusted. |

#### 4.4 Advanced AI could drive explosive growth

It is possible that significant advances in AI could allow capital to much more effectively substitute for labor.59 Capital is accumulable, so this could lead to increasing returns to accumulable inputs, and so to super-exponential growth.60 I’ll illustrate this point from two complementary perspectives.

#### 4.4.1 AI robots as a form of labor

First, consider a toy scenario in which Google announces tomorrow that it’s developed AI robots that can perform any task a human laborer can do, at a lower cost. In this (extreme!) fiction, AI robots can perfectly substitute for all human labor. We can write (total labor) = (human labor) + (AI labor). We can invest output to build more AI robots,61 and so increase the labor supply: more output → more labor (AI robots). In other words, labor is accumulable again. When this was last true there was super-exponential growth, so our default expectation should be that this scenario will lead to super-exponential growth.

To look at it another way, AI robots would reverse the effect of the demographic transition. Before that transition, the following feedback loop drove increasing returns to accumulable inputs and super-exponential growth:

More ideas → more output → more labor (people) → more ideas →…

With AI robots there would be a closely analogous feedback loop:

More ideas → more output → more labor (AI robots) → more ideas →…

| PERIOD | FEEDBACK LOOP? | IS TOTAL LABOR ACCUMULABLE? | PATTERN OF GROWTH |
| --- | --- | --- | --- |
| Pre-1880 | Yes: More ideas → more output → more people → more ideas →… | Yes | GWP grows at an increasing rate. |
| 1880 – present | No: More ideas → more output → more richer people → more ideas →… | No | GWP grows at a ~constant rate. |
| AI robot scenario | Yes: More ideas → more output → more AI systems → more ideas →… | Yes | GWP grows at an increasing rate. |

Indeed, if you plug the AI robot scenario into a wide variety of growth models, including exogenous growth models, you find that increasing returns to accumulable inputs drive super-exponential growth for plausible parameter values.62
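
As a simple check of this claim, the sketch below (a toy exercise of my own, with arbitrary parameter values, not one of the models cited in the report) takes an ordinary exogenous growth model with constant returns to capital and labor and compares two runs: one where the labor force grows slowly and exogenously, and one where “AI robot” labor can also be built out of saved output. In the first run growth settles down; in the second the growth rate keeps climbing.

```python
# Exogenous growth model with and without accumulable "AI robot" labor.
# Illustrative sketch only: Y = A * K^alpha * N^(1-alpha), with N = humans + robots.

def growth_path(robot_invest, years=200, alpha=0.3, save=0.2,
                depreciation=0.05, tech_growth=0.01, human_growth=0.005):
    A, K, H, R = 1.0, 1.0, 1.0, 0.0
    out = []
    for _ in range(years):
        N = H + R                                   # total labor: humans + AI robots
        Y = A * K ** alpha * N ** (1 - alpha)
        out.append(Y)
        K += save * Y - depreciation * K            # capital is accumulable
        R += robot_invest * Y - depreciation * R    # robots: accumulable labor
        H *= 1 + human_growth                       # human labor stays exogenous
        A *= 1 + tech_growth                        # technology stays exogenous
    # average growth rate over successive 25-year windows
    return [round((out[i + 25] / out[i]) ** (1 / 25) - 1, 3)
            for i in range(0, years - 25, 25)]

print("no robots:  ", growth_path(robot_invest=0.0))
print("with robots:", growth_path(robot_invest=0.1))
# Without robots, growth converges to a constant rate set by the exogenous trends.
# With robots, labor is accumulable again, so the feedback loop
# more output -> more robots -> more output makes the growth rate climb over time.
```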

This first perspective, analysing advanced AI as a form of labor, emphasizes the similarity of pre-1900 growth dynamics to those of a possible future world with advanced AI. If you think that the increasing-returns mechanism increased growth in the past, it’s natural to think that the AI robot scenario would increase growth again.63

#### 4.4.2 AI as a form of capital

There are currently diminishing returns to accumulating more capital, holding the amount of labor fixed. For example, imagine creating more and more high-quality laptops and distributing them around the world. At first, economic output would plausibly increase as the laptops made people more productive at work. But eventually additional laptops would make no difference as there’d be no one to use them. The feedback loop ‘more output → more capital → more output →…’ peters out.

Advances in AI could potentially change this. By automating wide-ranging cognitive tasks, they could allow capital to substitute more effectively for labor. As a result, there may no longer be diminishing returns to capital accumulation. AI systems could replace both the laptops and the human workers, allowing capital accumulation to drive faster growth.64

Economic growth models used to explain growth since 1900 back up this point. In particular, if you adjust these models by assuming that capital substitutes more effectively for labor, they predict increases in growth.

The basic story is: capital substitutes more effectively for labor → capital’s share of output increases → larger returns to accumulable inputs → faster growth. In essence, the feedback loop ‘more output → more capital → more output → …’ becomes more powerful and drives faster growth.

What level of AI is required for explosive (>30%) growth in these models? The answer varies depending on the particular model:65

• Often the crucial condition is that the elasticity of substitution between capital and labor rises above 1. This means that some (perhaps very large) amount of capital can completely replace any human worker, though it is a weaker condition than perfect substitutability.66 (See the illustrative sketch after this list.)
• In the task-based model of Aghion et al. (2017), automating a fixed set of tasks leads to only a temporary boost in growth. A constant stream of automation (or full automation) is needed to maintain faster growth.67
• Appendix C discusses the conditions for super-exponential growth in a variety of such models (see here).
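
To make the elasticity condition concrete, here is a small sketch (my own illustration, with made-up numbers) using a CES production function Y = (a·K^ρ + (1−a)·L^ρ)^(1/ρ), whose elasticity of substitution is σ = 1/(1−ρ). With σ > 1, piling up capital while holding labor fixed raises output without bound; with σ < 1, output plateaus because the fixed labor input becomes a bottleneck.

```python
# CES production: Y = (a*K**rho + (1-a)*L**rho) ** (1/rho),
# elasticity of substitution sigma = 1 / (1 - rho).  Illustrative values only.

def ces_output(K, L=1.0, a=0.5, rho=0.5):
    return (a * K ** rho + (1 - a) * L ** rho) ** (1 / rho)

for K in (1, 10, 100, 1000, 10_000):
    high = ces_output(K, rho=0.5)    # sigma = 2   (> 1): capital substitutes for labor
    low = ces_output(K, rho=-1.0)    # sigma = 0.5 (< 1): labor bottlenecks output
    print(f"K = {K:6}: output with sigma=2: {high:10.1f}   with sigma=0.5: {low:5.2f}")

# With sigma > 1, output keeps growing roughly in proportion to K, so capital
# accumulation alone can sustain growth. With sigma < 1, output approaches a
# ceiling (here (1-a)**(1/rho) * L = 2), so accumulating capital cannot
# substitute for the fixed supply of labor.
```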

Overall, what level of AI would be sufficient for explosive growth? Based on a number of models, I think that explosive growth would require AI that substantially accelerates the automation of a very wide range of tasks in the production of goods and services, R&D, and the implementation of new technologies. The more rapid the automation, and the wider the range of tasks, the faster growth could become.68

It is worth emphasizing that these models are simple extensions of standard growth models; the only change is to assume that capital can substitute more effectively for labor. With this assumption, semi-endogenous models with reasonable parameter values predict explosive growth, as do exogenous growth models with constant returns to labor and capital.69

A draft literature review on the possible growth effects of advanced AI includes many models in which AI increases growth via this mechanism (capital substituting more effectively for labor). In addition, it discusses several other mechanisms by which AI could increase growth, e.g. changing the mechanics of idea discovery and changing the savings rate.70

#### 4.4.3 Combining the two perspectives

Both the ‘AI robots’ perspective and the ‘AI as a form of capital’ perspective make a similar point: if advanced AI can substitute very effectively for human workers, it could precipitate explosive growth by increasing the returns to accumulable inputs. In many growth models with plausible parameter values this scenario leads to explosive growth.

Previously, we said we should not trust long-run explosive models as they unrealistically assume population is accumulable. We can now qualify this claim. We should not trust these models unless AI systems are developed that can replace human workers.

#### 4.4.4 Could sufficiently advanced AI be developed in time for explosive growth to occur this century?

This is not a focus of this report, but other evidence suggests that this scenario is plausible:

• A survey of AI practitioners asked them about the probability of developing AI that would enable full automation.71 Averaging their responses, they assigned ~30% or ~60% probability to this possibility by 2080, depending on how the question is framed.72
• My colleague Joe Carlsmith’s report estimates the computational power needed to match the human brain. Based on this and other evidence, my colleague Ajeya Cotra’s draft report estimates when we’ll develop human-level AI; she finds we’re ~70% likely to do so by 2080.
• In a previous report I estimated the probability of developing human-level AI based on analogous historical developments. My framework finds a ~15% probability of human-level AI by 2080.

#### 4.5 Objections to explosive growth

My responses are brief, and I encourage interested readers to read Appendix A, which discusses these and other objections in more detail.

#### 4.5.1 What about diminishing returns to technological R&D?

Objection: There is good evidence that ideas are getting harder to find. In particular, it seems that exponential growth in the number of researchers is needed to sustain constant exponential growth in technology (TFP).

Response: The models I have been discussing take this dynamic into account. They find that, with realistic parameter values, increasing returns to accumulable inputs are powerful enough to overcome diminishing returns to technological progress if AI systems can replace human workers. This is because the feedback loop ‘more output → more labor (AI systems) → more output’ allows research effort to grow super-exponentially, leading to super-exponential TFP growth despite ideas becoming harder to find (see more).
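
As a sanity check on this response (again a toy sketch of my own, not one of the report’s models, with arbitrary parameter values), the snippet below uses a sharply “ideas getting harder to find” law of motion, dA/dt = δ·R·A^φ with φ = −1, so returns to research diminish steeply as technology advances. If research effort R can itself be built out of output (the AI scenario), output growth still rises over time; if research effort is held constant, technology growth fades.

```python
# Ideas getting much harder to find: dA/dt = delta * R * A**phi with phi = -1.
# Toy comparison (illustrative only): constant research effort vs. research
# effort that is itself accumulable (the AI scenario: R built from output).

def run(accumulable_R, years=100, dt=0.05, delta=0.05, s=0.05, phi=-1.0):
    A, R = 1.0, 1.0
    samples = {}
    for step in range(int(years / dt)):
        g_A = delta * R * A ** (phi - 1)        # sharply diminishing returns to ideas
        g_R = s * A if accumulable_R else 0.0   # AI case: more output -> more research effort
        if step % round(20 / dt) == 0:
            samples[int(step * dt)] = round(g_A + g_R, 3)   # growth rate of Y = A * R
        A += g_A * A * dt
        R += g_R * R * dt
        if g_A + g_R > 1.0:                     # stop the toy run once growth is absurdly fast
            break
    return samples

print("constant research effort:   ", run(accumulable_R=False))
print("accumulable research effort:", run(accumulable_R=True))
# With constant effort, technology growth fades as A rises (phi < 0 bites).
# With accumulable effort, research input grows super-exponentially, which more
# than offsets the diminishing returns, and output growth keeps increasing.
```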

Related objection: You claimed above that the demographic transition caused super-exponential growth to stop. This is why you think advanced AI could restart super-exponential growth. But perhaps the real cause was that we hit more sharply diminishing returns to R&D in the 20th century.

Response: This could be true. Even if true, though, this wouldn’t rule out explosive growth occurring this century: it would still be possible that returns to R&D will become less steep in the future and the historical pattern of super-exponential growth will resume.73

However, I investigated this possibility and came away thinking that diminishing returns probably didn’t explain the end of super-exponential growth.

• Various endogenous growth models suggest that, had population remained accumulable throughout the 20th century, growth would have been super-exponential despite the sharply diminishing returns to R&D that we have observed.
• Conversely, these models suggest that the demographic transition would have ended super-exponential growth even if diminishing returns to R&D had been much less steep.
• This all suggests that the demographic transition, not diminishing returns, is the crucial factor in explaining the end of super-exponential growth (see more).

That said, I do think it’s reasonable to be uncertain about why super-exponential growth came to an end. The following diagram summarizes some possible explanations for the end of super-exponential growth in the 20th century, and their implications for the plausibility of explosive growth this century.

#### 4.5.2 30% growth is very far out of the observed range

Objection: Explosive growth is so far out of the observed range! Even when China was charging through catch-up growth it never sustained more than 10% growth. So 30% is out of the question.

Response: Ultimately, this is not a convincing objection. If you had applied this reasoning in the past, you would have been repeatedly led into error. The 0.3% GWP growth of 1400 was higher than the previously observed range, and the 3% GWP growth of 1900 was higher than the previously observed range. There is historical precedent for growth increasing to levels far outside of the previously observed range (see more).

#### 4.5.3 Models predicting explosive growth have implausible implications

Objection: Endogenous growth models imply output becomes infinite in a finite time. This is impossible and we shouldn’t trust such unrealistic models.

Response: First, models are always intended to apply only within bounded regimes; this doesn’t mean they are bad models. Clearly these endogenous growth models will stop applying before we reach infinite output (e.g. when we reach physical limits); they might still be informative before we reach this point. Secondly, not all models predicting explosive growth have this implication; some models imply that growth will rise without limit but never go to infinity (see more).

#### 4.5.4 There’s no evidence of explosive growth in any economic sub-sector

Objection: If GWP growth rates were soon going to rise to 30%, we’d see signs of this in the current economy. But we don’t – Nordhaus (2021) looks for such signs and doesn’t find them.

Response: The absence of these signs in macroeconomic data is reason to doubt explosive growth will occur within the next couple of decades. Beyond this time frame, it is hard to draw conclusions. Further, it’s possible that the recent fast growth of machine learning is an early sign of explosive growth (see more).

#### 4.5.5 Why think AI automation will be different to past automation?

Objection: We have been automating parts of our production processes and our R&D processes for many decades, without growth increasing. Why think AI automation will be different?

Response: To cause explosive growth, AI would have to drive much faster and more widespread automation than we have seen over the previous century. If AI ultimately enabled full automation, models of automation suggest that the consequences for growth would be much more radical than those from the partial automation we have had in the past (see more).

#### 4.5.6 Automation limits

Objection: Aghion et al. (2017) considers a model where growth is bottlenecked by tasks that are essential but hard to improve. If we’re unable to automate just one essential task, this would prevent explosive growth.

Response: This correctly highlights that AI may lead to very widespread automation without explosive growth occurring. One possibility is that an essential task isn’t automated because we care intrinsically about having a human perform the task, e.g. a carer.

I don’t think this provides a decisive reason to rule out explosive growth. Firstly, it’s possible that we will ultimately automate all essential tasks, or restructure work-flows to do without them. Secondly, there could be a significant boost in growth rates, at least temporarily, even without full automation (see more).74

#### 4.5.7 Limits to how fast a human economy can grow

Objection: The economic models predicting explosive growth ignore many possible bottlenecks that might slow growth. Examples include regulation of the use of AI systems, extracting and transporting important materials, conducting physical experiments on the world needed to make social and technological progress, delays for humans to adjust to new technological and social innovations, fundamental limits to how advanced technology can become, fundamental limits of how quickly complex systems can grow, and other unanticipated bottlenecks.75

Response: I do think that there is some chance that one of these bottlenecks will prevent explosive growth. On the other hand, no individual bottleneck is certain to apply and there are some reasons to think we could grow at 30% per year:

• There will be huge incentives to remove bottlenecks to growth, and if there’s just one country that does this it would be sufficient.
• Large human economies have already grown at 10% per year (admittedly via catch-up growth); explosive growth would only be 3X as fast.76
• Humans oversee businesses growing at 30% per year, and individual humans can adjust to 30% annual increases in wealth and want more.
• AI workers could run much faster than human workers.77
• Biological populations can grow faster than 30% a year, suggesting that it is physically possible for complex systems to grow this quickly.78

The arguments on both sides are inconclusive and inevitably speculative. I feel deeply uncertain about how fast growth could become before some bottleneck comes into play, but personally place less than 50% probability on a bottleneck preventing 30% GWP growth. That said, I have spent very little time thinking about this issue, which would be a fascinating research project in its own right.

#### 4.5.8 How strong are these objections overall?

I find some of the objections unconvincing:

• Diminishing returns. The models implying that full automation would lead to explosive growth take diminishing returns into account.
• 30% is far from the observed range. Ruling out 30% on this basis would have led us astray in the past by ruling out historical increases in growth.
• Models predicting explosive growth have implausible implications. We need not literally believe that output will go to infinity to trust these models, and there are models that predict explosive growth without this implication.

I find other objections partially convincing:

• No evidence of explosive growth in any economic sub-sector. Trends in macroeconomic variables suggest there won’t be explosive growth in the next 20 years.
• Automation limits. A few essential but unautomated tasks might bottleneck growth, even if AI drives widespread automation.
• Limits to how fast a human economy can grow. There are many possible bottlenecks on the growth of a human economy; we have limited evidence on whether any of these would prevent 30% growth in practice.

Personally, I assign substantial probability (> 1/3) that the AI robot scenario would lead to explosive growth despite these objections.

#### 4.6 Conclusion

The standard story points to the constant exponential growth of frontier GDP/capita over the last 150 years. Theoretical considerations suggest 21st century growth is more likely to be sub-exponential than exponential, as slowing population growth leads to slowing technological progress. I find this version of the standard story highly plausible.

The explosive growth story points to the significant increases in GWP growth over the last 10,000 years. It identifies an important mechanism explaining super-exponential growth before 1900: increasing returns to accumulable inputs. If AI allows capital to substitute much more effectively for human labor, a wide variety of models predict that increasing returns to accumulable inputs will again drive super-exponential growth. On this basis, I think that ‘advanced AI drives explosive growth’ is a plausible scenario from the perspective of economics.

It is reasonable to be skeptical of all the growth models discussed in the report. It is hard to get high quality evidence for or against different growth models, and empirical efforts to adjudicate between them often give conflicting results.79  It is possible that we do not understand key drivers of growth. Someone with this view should probably adopt the ignorance story: growth has increased significantly in the past, we don’t understand why, and so we should not rule out significant increases in growth occurring in the future. If someone wishes to rule out explosive growth, they must positively reject any theory that implies it is plausible; this is hard to do from a position of ignorance.

Overall, I assign > 10% probability to explosive growth occurring this century. This is based on > 30% that we develop sufficiently advanced AI in time, and > 1/3 that explosive growth actually occurs conditional on this level of AI being developed.80 Barring this kind of progress in AI, I’m most inclined to expect sub-exponential growth. Either way, projecting GWP is closely entangled with forecasting the development of advanced AI.

#### 4.6.1 Are we claiming ‘this time is different’?

If you extrapolate the returns from R&D efforts over the last century, you will not predict that sustaining these efforts might lead to explosive growth this century. Achieving 3% growth in GDP/capita, let alone 30%, seems like it will be very difficult. When we forecast non-trivial probability of explosive growth, are we essentially claiming ‘this time will be different because AI is special’?

In a certain sense, the answer is ‘yes’. We’re claiming that economic returns to AI R&D will ultimately be much greater than the average R&D returns over the past century.

In another sense, the answer is ‘no’. We’re suggesting that sufficiently powerful AI would, by allowing capital to replace human labor, return us to a dynamic that was present throughout much of human history, in which labor was accumulable. With this dynamic reestablished, we’re saying that ‘this time will be the same’: this time, as before, the economic consequence of an accumulable labor force will be super-exponential growth.

#### 4.7 Further research

• Why do experts rule out explosive growth? This report argues that one should not confidently rule out explosive growth. In particular, I suggest assigning > 10% to explosive growth this century. Experts seem to assign much lower probabilities to explosive growth. Why is this? What do they make of the arguments of the report?
• Investigate evidence on endogenous growth theory.
• Assess Kremer’s rank-correlation argument. Does the ‘more people → more innovation’ story actually explain the rank correlation, or are there other better explanations?
• Investigate theories of long-run growth. How important is the increasing returns mechanism compared to other mechanisms in explaining the increase in long-run growth?
• Empirical evidence on different growth theories. What can 20th century empirical evidence tell us about the plausibility of various growth theories? I looked into this briefly and it seemed as if the evidence did not paint a clear picture.
• Are we currently seeing the early signs of explosive GDP growth?
• How long before explosive growth of GDP would we see signs of it in some sector of the economy?
• What exactly would these signs look like? What can we learn from the economic signs present in the UK before the onset of the industrial revolution?
• Does the fast growth of current machine learning resemble these signs?
• Do returns to technological R&D change over time? How uneven has the technological landscape been in the past? Is it common to have long periods where R&D progress is difficult punctuated by periods where it is easier? More technically, how much does the ‘fishing out’ parameter change over time?
• Are there plausible theories that predict exponential growth? Is there a satisfactory explanation for the constancy of frontier per capita growth in the 20th century that implies that this trend will continue even if population growth slows? Does this explanation avoid problematic knife-edge conditions?
• Is there evidence of super-exponential growth before the industrial revolution? My sensitivity analysis suggested that there is, but Ben Garfinkel did a longer analysis and reached a different conclusion. Dig into this apparent disagreement.
• Length of data series: How long must the data series be for there to be clear evidence of super-exponential growth?
• Type of data: How much difference does it make if you use population vs GWP data?
• How likely is a bottleneck to prevent an AI-driven growth explosion?

## 5. Structure of the rest of the report

The rest of the report is not designed to be read end to end. It consists of extended appendices that expand upon specific claims made in the main report. Each appendix is designed so that it can be read end to end.

The appendices are as follows:

• Objections to explosive growth (see here).
• This is a long section, which contains many of the novel contributions of this report.
• It’s probably the most important section to read after the main report, expanding upon objections to explosive growth in detail.
• Exponential growth is a knife-edge condition in many growth models (see here).
• I investigate one reason to think long-run growth won’t be exponential: exponential growth is a knife-edge condition in many economic growth models.
• This is not a core part of my argument for explosive growth.
• The section has three key takeaways:
1. Sub-exponential is more plausible than exponential growth, out to 2100.
2. There don’t seem to be especially strong reasons to expect exponential growth, raising the theoretical plausibility of stagnation and of explosive growth.
3. Semi-endogenous models offer the best explanation of the exponential trend. When you add to these models the assumption that capital can substitute effectively for human labor, they predict explosive growth. This raises my probability that advanced AI could drive explosive growth.
• Conditions for super-exponential growth (see here).
• I report the conditions for super-exponential growth (and thus for explosive growth) in a variety of economic models.
• These include models of very long-run historical growth, and models designed to explain modern growth altered by the assumption that capital can substitute for labor.
• I draw some tentative conclusions about what kinds of AI systems may be necessary for explosive growth to occur.
• This section is math-heavy.
• Ignorance story (see here).
• I briefly explain what I call the ‘ignorance story’, how it might relate to the view that there was a step-change in growth around the industrial revolution, and how much weight I put on this story.
• Standard story (see here).
• I explain some of the models used to project long-run GWP by the standard story.
• These models forecast GWP/capita to grow at about 1-2% annually out to 2100.
• I find that the models typically only use post-1900 data and assume that technology will grow exponentially. However, the models provide no more support for this claim than is found in the uninterpreted empirical data.
• Other endogenous models do provide support for this claim. I explore such models in Appendix B.
• I conclude that these models are suitable for projecting growth to 2100 on the assumption that 21st century growth resembles 20th century growth. They are not well equipped to assess the probability of a structural break occurring, after which the pattern of 20th century growth no longer applies.
• Explosive growth before 2100 is robust to accounting for today’s slow GWP growth (see here)
• Long-run explosive models predict explosive growth within a few decades. From an outside view perspective81, it is reasonable to put some weight on such models. But these models typically imply growth should already be at ~7%, which we know is false.
• I adjust for this problem, developing a ‘growth multiplier’ model. It maintains the core mechanism driving increases in growth in the explosive growth story, but anchors its predictions to the fact that GWP growth over the last 20 years has been about 3.5%. As a result, its prediction of explosive growth is delayed by about 40 years.
• From an outside view perspective, I personally put more weight on the ‘growth multiplier model’ than Roodman’s long-run explosive model.
• In this section, I explain the growth multiplier model and conduct a sensitivity analysis on its results.
• How I decide my probability of explosive growth (see here).
• Currently I put ~30% on explosive growth occurring by 2100. This section explains my reasoning.
• Links to reviews of the report (see here).
• Technical appendices (see here).
• These contain a number of short technical analyses that support specific claims in the report.

## 6. Appendix A: Objections to explosive growth

Currently, I don’t find any of these objections entirely convincing. Nonetheless, taken together, the objections shift my confidence away from the explosive growth story and towards the ignorance story instead.

I initially discuss general objections to explosive growth, then objections targeted specifically at using long-run growth data to argue for explosive growth.

Here are the objections, in the order in which I address them:

• General objections to explosive growth
  • Partially convincing objections
  • Ultimately unconvincing objections
• Objections to using long-run growth to argue for explosive growth
  • Partially convincing objections
  • Slightly convincing objections

#### 6.1.1 No evidence of explosive growth in any sub-sector of the economy

Summary of objection: If GWP growth rates were soon going to rise to 30%, we’d see signs of this in the current economy. We’d see 30% growth in sectors of the economy that have the potential to account for the majority of economic activity. For example, before the industrial revolution noticeably impacted GDP, the manufacturing sector was growing much faster than the rest of the economy. But no sector of the economy shows growth anywhere near 30%; so GWP won’t be growing at 30% any time soon.

Response: I think this objection might rule out explosive growth in the next few decades, but I’d need to see further investigation to be fully convinced of this.

I agree that there should be signs of explosive growth before it registers on any country’s GDP statistics. Currently, this makes me somewhat skeptical that there will be explosive growth in the next two decades. However, I’m very uncertain about this due to being ignorant about several key questions.

• How long before explosive growth of GDP would we see signs of it in some sector of the economy?
• What exactly would these signs look like?
• Are there early signs of explosive growth in the economy?

I’m currently very unsure about all three questions above, and so am unsure how far into the future this objection rules out explosive growth. The next two sections say a little more about the third question.

#### 6.1.1.1 Does the fast growth of machine learning resemble the early signs of explosive growth?

With regard to the last of these questions, Open Philanthropy believes that there is a non-negligible chance (> 15%) of very powerful AI systems being developed in the next three decades. The economic impact of machine learning is already growing fast with use in Google’s search algorithm, targeted ads, product recommendations, translation, and voice recognition. One recent report forecasts an average of 42% annual growth of the deep learning market between 2017 and 2023.

Of course, many small sectors show fast growth for a time and do not end up affecting the overall rate of GWP growth! It is the further fact that machine learning seems to be a general purpose technology, whose progress could ultimately lead to the automation of large amounts of cognitive labor, that raises the possibility that its fast growth might be a precursor of explosive growth.

#### 6.1.1.2 Are there signs of explosive growth in US macroeconomic variables?

Nordhaus (2021) considers the hypothesis that explosive growth will be driven by fast productivity growth in the IT sector. He proposes seven empirical tests of this hypothesis. The tests make predictions about patterns in macroeconomic variables like TFP, real wages, capital’s share of total income, and the price and total amount of capital. He runs these tests with US data. Five of the tests suggest that we’re not moving towards explosive growth; the other two suggest we’re moving towards it only very slowly, such that a naive extrapolation implies explosive growth will happen around 2100.82

Nordhaus runs three of his tests with data specific to the IT sector.83 This data is more fine-grained than macroeconomic variables, but it’s still much broader than machine learning as a whole. The IT data is slightly more optimistic about explosive growth, but still suggests that it won’t happen within the next few decades.84

These empirical tests suggest that, as of 2014, the patterns in US macroeconomic variables are not what you’d expect if explosive growth driven by AI R&D was happening soon. But how much warning should we expect these tests to give? I’m not sure. Nordhaus himself says that his ‘conclusion is tentative and is based on economic trends to date’. I would expect patterns in macroeconomic variables to give more warning than trends in GWP or GDP, but less warning than trends in the economic value of machine learning. Similarly, I’d expect IT-specific data to give more warning than macroeconomic variables, but less than data specific to machine learning.

Brynjolfsson (2017)85 suggests economic effects will lag decades behind the potential of the technology’s cutting edge, and that national statistics could underestimate the longer term economic impact of technologies. As a consequence, disappointing historical data should not preclude forward-looking technological optimism.86

Overall, Nordhaus’ analysis reduces my probability that we will see explosive growth by 2040 (three decades after his latest data point) but it doesn’t significantly change my probability that we see it in 2050 – 2100. His analysis leaves open the possibility that we are seeing the early signs of explosive growth in data relating to machine learning specifically.

#### 6.1.2 The evidence for endogenous growth theories is weak

Summary of objection: Explosive growth from sufficiently advanced AI is predicted by certain endogenous growth models, both theories of very long-run growth and semi-endogenous growth models augmented with the assumption that capital can substitute for labor.

The mechanism posited by these models is increasing returns to accumulable inputs.

But these endogenous growth models, and the mechanisms behind them, have not been confirmed. So we shouldn’t pay particular attention to their predictions. In fact, these models falsely predict that larger economies should grow faster.

Response summary:

• There is some evidence for endogenous growth models.
• Endogenous growth models do not imply that larger economies should grow faster than smaller ones.
• As well as endogenous growth models, some exogenous growth models predict that AI could bring about explosive growth by increasing the importance of capital accumulation: more output → more capital → more output →… (see more).

The rest of this section goes into the first two points in more detail.

#### 6.1.2.1.1 Semi-endogenous growth models

These are simply standard semi-endogenous growth theories. Under realistic parameter values, they predict explosive growth when you add the assumption that capital can substitute for labor (elasticity of substitution > 1).

What evidence is there for these theories?

• Semi-endogenous growth theories are inherently plausible. They extend standard exogenous theories with the claim that directed human effort can lead to technological progress.
• Appendix B argues that semi-endogenous growth theories offer a good explanation of the recent period of exponential growth.
• However, there have not been increasing returns to accumulable inputs in the recent period of exponential growth because labor has not been accumulable. This might make us doubt the predictions of semi-endogenous models in a situation in which there are increasing returns to accumulable inputs, and thus doubt their prediction of explosive growth.

#### 6.1.2.1.2 Theories of very long-run growth featuring increasing returns

Some theories of very long-run growth feature increasing returns to accumulable inputs, as they make technology accumulable and labor accumulable (in the sense that more output → more people). If AI makes labor accumulable again, these theories predict there will be explosive growth under realistic parameter values.

What evidence is there for these theories?

• These ‘increasing returns’ models seem to correctly describe the historical pattern of accelerating growth.87 However, the data is highly uncertain and it is possible that growth did not accelerate between 5000 BCE and 1500. If so, this would undermine the empirical evidence for these theories.
• Other evidence comes from Kremer (1993). He looks at five regions – Flinders Island, Tasmania, Australia, the Americas and the Eurasian continent – that were isolated from one another 10,000 years ago and had significantly varying populations. Initially all regions contained hunter-gatherers, but by 1500 CE the technology levels of these regions had significantly diverged. Kremer shows that the 1500 CE technology levels of these regions were perfectly rank-correlated with their initial populations, as predicted by endogenous growth models.

#### 6.1.2.2 Endogenous growth models are not falsified by the faster growth of smaller economies

Different countries share their technological innovations. Smaller economies can grow using the innovations of larger economies, and so the story motivating endogenous growth models does not predict that countries with larger economies should grow faster. As explained by Jones (1997):

The Belgian economy does not grow solely or even primarily because of ideas invented by Belgians… this fact makes it difficult… to test the model with cross-section evidence [of different countries across the same period of time]. Ideally one needs a cross-section of economies that cannot share ideas.

In other words, the standard practice of separating technological progress into catch-up growth and frontier growth is fully consistent with applying endogenous growth theories to the world economy. Endogenous growth models are not falsified by the faster growth of smaller economies.

#### 6.1.3 Why think AI automation will be different to past automation?

Objection: Automation is nothing new. Since 1900, there’s been massive automation in both production and R&D (e.g. no more calculations by hand). But growth rates haven’t increased. Why should future automation have a different effect?

Response: If AI merely continues the previous pace of automation, then indeed there’s no particular reason to think it would cause explosive growth. However, if AI allows us to approach full automation, then it may well do so.

A plausible explanation for why previous automation hasn’t caused explosive growth is that growth ends up being bottlenecked by non-automated tasks. For example, suppose there are three stages in the production process for making a cheese sandwich: make the bread, make the cheese, combine the two together. If the first two stages are automated and can proceed much more quickly, the third stage can still bottleneck the speed of sandwich production if it isn’t automated. Sandwich production as a whole ends up proceeding at the same pace as the third stage, despite the automation of the first two stages.

Note, whether this dynamic occurs depends on people’s preferences, as well as on the production possibilities. If people were happy to just consume bread by itself and cheese by itself, all the necessary steps would have been automated and output could have grown more quickly.

The same dynamic as with sandwich production can happen on the scale of the overall economy. For example, hundreds of years ago agriculture was a very large share of GDP. Total GDP growth was closely related to productivity growth in agriculture. But over the last few hundred years, the sector has been increasingly automated and its productivity has risen significantly. People in developed countries now generally have plenty of food. But as a result, GDP in developed countries is now more bottlenecked by things other than agriculture. Agriculture is now only a small share of GDP, and so productivity gains in agriculture have little effect on overall GDP growth.

Again this relates to people’s preferences. Once people have plenty of food, they value further food much less. This reduces the price of food, and reduces agriculture’s share of GDP. If people had wanted to consume more and more food without limit, agriculture’s share of the economy would not have fallen so much.88

So, on this account, the reason automation doesn’t lead to growth increases is that the non-automated sectors bottleneck growth.

Clearly, this dynamic won’t apply if there is full automation, for example if we develop AI systems that can replace human workers in any task. There would be no non-automated sectors left to bottleneck growth. This insight is consistent with models of automation, for example Growiec (2020) and Aghion et al. (2017) – they find that the effect of full automation is qualitatively different from that of partial automation and leads to larger increases in growth.

The next section discusses whether full automation is plausible, and whether we could have explosive growth without it.

#### 6.1.4 Automation limits

Objection: Aghion et al. (2017) considers a growth model that does a good job in explaining the past trends in automation and growth. In particular, their model is consistent with the above explanation for why automation has not increased growth in the past: growth ends up being bottlenecked by non-automated tasks.

In their model, output is produced by a large number of tasks that are gross complements. Intuitively, this means that each task is essential. More precisely, if we hold performance on one task fixed, there is a limit to how large output can be no matter how well we perform other tasks.89  As a result, ‘output and growth end up being determined not by what we are good at, but by what is essential but hard to improve’.

The model highlights that if there is one essential task that we cannot automate, this will ultimately bottleneck growth. Growth will proceed at the rate at which we can improve performance at this non-automated task.

Response: There are two questions in assessing this objection:

1.  Will there be an essential task that we cannot automate?
2. If there is such a task, would this preclude explosive growth?

#### 6.1.4.1 Will there be an essential task that we cannot automate?

The first question cannot be answered without speculation.

It does seem possible that we make very impressive progress in AI, automating wide-ranging cognitive abilities, but that there are some essential tasks that we still cannot automate. It is unclear how stable this situation would be: with many cognitive abilities automated, a huge cognitive effort could be made to automate the remaining tasks. Further, if we can restructure workflows to remove the necessity of an un-automated task, the bottleneck will disappear.

One reason to think full automation is plausible is that humans may ultimately have a finite set of capabilities (including the capability to learn certain types of new tasks quickly). Once we’ve developed machines with the same capabilities across the board, there will be nothing more to automate. When new tasks are created, machines will learn them just as quickly as humans.

One possibility is that some tasks will not be automated because we care intrinsically about having a biological human perform them (e.g. carers, athletes, priests). I don’t expect this to be the sole factor preventing explosive growth:

• In this scenario, if just one group didn’t have this intrinsic preference for human workers, it could grow explosively and ultimately drive explosive growth of GWP. So this scenario seems undermined by the heterogeneity of human preferences.
• In this scenario the growth model of Aghion et al. (2017) implies that the percentage of GDP spent on tasks where we prefer human workers approaches 100%.90 But this seems unlikely to happen. Tasks crucial for gaining relative power in society, e.g. control of resources and military technology, can in principle be automated in this scenario. It seems unlikely that all actors would allow their spending on these tasks to approach 0%, essentially giving up relative power and influence.
• If instead a constant fraction of output is spent on automated tasks, we could model this with a task-based Cobb-Douglas production function. With this model, explosive growth then occurs if a sufficiently large fraction of output is spent on the automated tasks (see this model).

#### 6.1.4.2 If there’s an essential task we cannot automate, does this preclude explosive growth?

Slightly more can be said about the second question.

Firstly, there can be super-exponential growth without full automation ever occurring. If we automate an increasing fraction of non-automated tasks each year, there can be super-exponential growth.

For example, the total fraction of automated tasks goes 0%, 50%, ~83%, ~96%,… We automate 1/2 of the non-automated tasks in the first year, 2/3 in the second year, 3/4 in the third year, and so on (the short sketch after this list works through the arithmetic). In this scenario, the economy is asymptotically automated, but never fully automated.

• This situation implies that for any task i, that task is eventually automated. But this is also implied by the scenario favored in Aghion et al. (2017), in which a constant fraction of non-automated tasks are automated each year.
• I am not claiming here that we will automate an increasing fraction of tasks each year, but just that such a situation is plausible (and perhaps similarly plausible to automating a constant fraction each year).
• Note, super-exponential growth can only be sustained if there is some capital-augmenting technological progress happening in the background.91
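
As a quick check on the arithmetic above, here is a minimal sketch (illustrative only) of the automated share implied by automating 1/2, then 2/3, then 3/4, … of the remaining non-automated tasks each year:

```python
# Automate 1/2 of the remaining non-automated tasks in year 1, 2/3 in year 2,
# 3/4 in year 3, and so on.
remaining = 1.0
for year in range(1, 8):
    remaining *= 1.0 / (year + 1)   # fraction still non-automated after this year
    print(f"year {year}: {1 - remaining:.2%} of tasks automated")

# The automated share approaches 100% (~83.3% after year 2, ~95.8% after year 3)
# but never reaches it: the economy is asymptotically, never fully, automated.
```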

What about if there’s some fixed fraction of tasks that we cannot automate?

This does rule out growth increasing without limit.92 However, it doesn’t rule out a significant but temporary increase in growth. There may be a long time before non-automated tasks become a bottleneck in practice, and growth may rise considerably during this time. For example, suppose that the number of human carers ultimately bottlenecks growth. In the long run, most of GDP is spent on human carers and productivity improvements elsewhere will make little difference to GDP growth. Nonetheless, there can be an interim period where human carers are still only a small share of GDP but the quantities of other goods and services are growing extremely rapidly, driving explosive growth of GDP. This explosive growth would end once spending on human carers is a large fraction of GDP.

Indeed, the authors of Aghion et al. (2017) acknowledge that even if there’s a limit to automation, ‘growth rates may still be larger with more automation and capital intensity’. Whether growth gets as high as 30% depends on how quickly the other tasks are automated,93 how quickly we increase the stock of capital,94 how important the non-automated task is to the economy,95 and how well we initially perform the non-automated task.96

#### 6.1.4.3 A drawback of the model

The model does not seem well suited for thinking about the introduction of new tasks. In their model, introducing a new task can only ever decrease output.97

#### 6.1.4.4 Conclusion

This objection correctly highlights the possibility that very impressive progress in AI doesn’t lead to explosive growth due to a few non-automatable tasks. This is a plausible scenario. Nonetheless, explosive growth could occur if we will eventually automate all tasks, or if we automate an increasing fraction of tasks each year, or if growth increases significantly before bottlenecks kick in.

#### 6.1.5 Physical limits to the productivity of essential tasks

Objection: Even if we automate both goods and ideas production, Aghion et al. (2017) raises the possibility that physical limits could constrain growth.98 In particular, they consider a model where each task has its own productivity. If there’s an absolute limit on the productivity of any essential task, then this ultimately limits overall TFP and can prevent explosive growth.

Response: This objection is correct: ultimately the growth process will come up against physical limits and TFP will reach an absolute ceiling. However, this doesn’t give us much reason to rule out explosive growth.

Firstly, even once TFP reaches its ceiling we could have fast exponential growth. If we automate all tasks, Y = A_max K, and reinvestment is ΔK = s Y − δ K, where A_max is the ceiling on TFP fixed by physical limits. The growth rate of the system is A_max s − δ, which could be very high indeed.
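
As an illustration of that last claim, here is a minimal sketch (parameter values are purely illustrative, not estimates) of the fully automated economy once TFP sits at its ceiling:

```python
# Toy AK simulation at the TFP ceiling: Y = A_max * K, with reinvestment
# delta_K = s*Y - delta*K. Growth settles at the constant rate A_max*s - delta.
A_max, s, delta = 1.0, 0.5, 0.05   # illustrative values only
K = 1.0
for year in range(1, 6):
    Y = A_max * K
    K_next = K + s * Y - delta * K
    print(f"year {year}: growth of output = {(K_next - K) / K:.1%}")
    K = K_next
# Prints 45.0% every year, i.e. A_max*s - delta.
```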

Secondly, we may be a long way from achieving the maximum possible TFP. Before we reach this point, there could be super-exponential growth. The model raises the possibility that we may be closer to the ceiling than we think: if just one essential task hits a limit then this will limit total TFP. However, we should be wary of placing too much weight on this perspective. TFP has not yet been permanently limited by an essential but hard-to-improve task, despite the economy containing a huge array of tasks and experiencing lots of TFP growth. This should be somewhat surprising to an advocate of this objection: surely just one of the many essential ‘Baumol tasks’ would have hit a limit by now? The evidence to the contrary speaks to our ability to increase productivity in essential tasks despite physical limits, or to replace them with new tasks that avoid these limits.

#### 6.1.6 What about diminishing returns to technological R&D?

Objection: There is good evidence that ideas are getting harder to find, at least when these ideas are weighted by their effects on economic growth.

Economists often understand ‘ideas’ in units such that a constant flow of ideas leads to constant exponential growth in A; each idea raises income by a constant percentage.

It is common to represent this effect using the parameter φ in the equation Ȧ = A^φ X, where X measures the amount of research effort (e.g. number of scientists) and A represents TFP. If ideas are getting harder to find, this means that φ < 1. This condition is important; it implies that X must increase exponentially to sustain exponential growth in A.

Bloom et al. (2020) observes steeply diminishing returns in 20th century R&D; they estimate φ = -2.1. Such steeply diminishing returns will surely prevent explosive growth. Perhaps they also explain the end of super-exponential growth in the 20th century.
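
To see what this parameter means in practice, here is a minimal sketch (my own illustrative numbers, not a calibrated model) of the law of motion Ȧ = A^φ X with φ = −2.1: constant research effort delivers ever-slower growth in A, while research effort growing exponentially at roughly (1 − φ) times the target growth rate roughly sustains it.

```python
# Toy discrete simulation of A_dot = A**phi * X with phi = -2.1 (Bloom et al.'s
# aggregate estimate). The initial values and the 2% target are illustrative.
phi = -2.1

def growth_path(years, x0, x_growth):
    A, X, rates = 1.0, x0, []
    for _ in range(years):
        dA = A**phi * X
        rates.append(dA / A)
        A += dA
        X *= 1 + x_growth
    return rates

constant_X    = growth_path(200, x0=0.02, x_growth=0.0)
exponential_X = growth_path(200, x0=0.02, x_growth=(1 - phi) * 0.02)  # X grows ~6.2%/yr

print(f"constant X:    growth of A falls from {constant_X[0]:.2%} to {constant_X[-1]:.2%}")
print(f"exponential X: growth of A stays near {exponential_X[0]:.2%} (ends at {exponential_X[-1]:.2%})")
```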

Response: The feedback loop between output and inputs can be powerful enough to overcome these diminishing returns, especially if there are increasing returns to accumulable inputs. This is because the feedback loop can be strong enough for X to grow super-exponentially, leading to super-exponential growth in A.

This happens if increasing returns to accumulable inputs are powerful enough to overcome the diminishing returns to R&D.99 If labor is accumulable, or capital is substitutable with labor (elasticity of substitution > 1), models with plausible parameter values suggest there will be super-exponential growth despite the sharply diminishing returns to R&D observed by Bloom et al. More on the conditions for super-exponential growth in these models.

Consistent with this, various endogenous growth models suggest that the period of super-exponential growth did not end because the diminishing returns to R&D became too steep. Rather, they suggest that the demographic transition, which meant labor was no longer accumulable (in the sense that more output → more labor), was the key factor (see more).

Lastly, even if 20th century diminishing returns did rule out explosive growth, it is possible that returns will diminish less steeply in the future (the value of φ could increase).100 There could be an uneven technological landscape, where progress is slow for a time and then quicker again.

Further objection: Aghion et al. (2017) consider a model in which ideas production is fully automated, Ȧ = A^φ K, but growth still does not increase due to ‘search limits’. Importantly, in their model goods production is bottlenecked by labor, Y = A L.101 If φ > 0, the growth rate increases without limit, but if φ < 0, the growth rate decreases over time. φ < 0 is plausible. Theoretically, it could be explained by a fishing-out process, in which fewer and fewer good ideas remain to be discovered over time. Empirically, Bloom et al. (2020) estimates φ = -2.1 based on 80 years of US data.

Response: This correctly highlights the possibility that we fully automate R&D without seeing explosive growth. However, I still expect that full R&D automation would lead to explosive growth.

Firstly, in this model there would still be a temporary boost in growth while the ideas production was being automated. The automation process would cause research effort X to increase, perhaps very rapidly, leading to much faster growth temporarily.

Secondly, full automation of ideas production might facilitate full automation of goods production (e.g. if it allows us to automate the process of automating tasks), Y = A K. Automating tasks is naturally thought of as a research activity. Full automation of goods production would lead to super-exponential growth, no matter what the value of φ.102 This is the response I find most convincing.
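
Here is a minimal sketch of that second response (my own toy numbers, not a calibrated model): with ideas production Ȧ = A^φ K and goods production fully automated, Y = A K, reinvested output makes growth accelerate and eventually pass 30% per year even with φ = −2.1.

```python
# Toy model: Y = A*K (goods production fully automated), K_dot = s*Y,
# A_dot = A**phi * K, with phi = -2.1. Initial values are illustrative.
phi, s = -2.1, 0.05
A, K = 1.0, 0.01

year, growth = 0, 0.0
while growth < 0.30 and year < 1000:
    Y = A * K
    dA, dK = A**phi * K, s * Y
    A, K = A + dA, K + dK
    growth = (A * K) / Y - 1          # output growth over this "year"
    year += 1

print(f"output growth first exceeds 30%/yr in year {year} (growth = {growth:.1%})")
# With the same phi but labor-bottlenecked goods production (Y = A*L), the growth
# rate of A instead declines over time, as the objection above describes.
```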

Thirdly, even if φ < 0 in the economy on aggregate, it may be that φ > 0 in certain important subsectors of the economy and this is sufficient for explosive growth. Of particular importance may be subsectors relating to how efficiently output can be reinvested to create more AI systems. If φ > 0 in these subsectors then, even if φ < 0 on aggregate, the number of AI systems can grow super-exponentially. This could in turn drive super-exponential growth of technology in all sectors, and thus drive explosive growth of output. I describe a toy model along these lines in this technical appendix.

Is φ > 0 in the relevant subsectors? The subsectors relating to how efficiently output can be reinvested to make AI systems are likely to be computer hardware and AI software. Bloom et al. (2020) find φ = 0.8 for a measure of computer hardware performance, and data from Besiroglu (2020) finds φ = 0.85 for a measure of machine learning software performance. Of course this doesn’t show that this scenario is likely to happen, but it reinforces the point that there is no easy inference from ‘φ < 0 in the aggregate’ to ‘AI automation of R&D wouldn’t drive explosive growth’.

Lastly, some papers find φ > 0. Even if it is currently below 0, it may change over time, and rise above 0.

#### 6.1.7 Explosive growth is so far out of the observed range

Summary of objection: No country has ever grown at anywhere near 30%. Even when China was at its peak rate of catch-up growth, benefitting significantly from adopting advanced western technology, it grew at 8%. Never in history has a country grown faster than 10%. Explosive growth is so far out of the observed range that it should be regarded as highly improbable.

Response: This is a very natural objection, but ultimately I find it unconvincing.

The same kind of reasoning would have led people in 1750, when growth had never been higher than 0.3%, to rule out growth of 3%. And the same reasoning again would have led hypothetical economists alive in 5000 BCE, when the rate of growth had never been higher than 0.03%, to rule out growth of 0.3%. Growth rates have increased by two orders of magnitude throughout history, and so the reasoning ‘growth rates will stay within the historically observed ranges’ would have repeatedly led to false predictions.
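
To put these rates in perspective, here is a quick bit of arithmetic (illustrative only) on the doubling times they imply:

```python
from math import log

# Doubling times implied by the growth rates mentioned above.
for rate in (0.0003, 0.003, 0.03, 0.30):
    doubling_years = log(2) / log(1 + rate)
    print(f"{rate:>6.2%} annual growth -> output doubles roughly every {doubling_years:,.0f} years")
# ~2,311 years at 0.03%, ~231 at 0.3%, ~23 at 3%, and under 3 years at 30%.
```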

It is true that 30% growth by 2100 would involve a ten-fold increase in growth happening more quickly than any comparable increase in history. The increase from 0.3% to 3% took more than 150 years to occur and there are only 80 years left until 2100. But historically, increases in the growth rate have happened over progressively shorter time periods. For example, the increase from 0.03% to 0.3% took 6,000 years. In 1700 it would have been a mistake to say ‘it took thousands of years for growth rates to increase ten-fold from 0.03% to 0.3%, so it will be thousands of years before growth increases ten-fold again to 3%’. This reasoning would ignore the historical pattern whereby growth increases more quickly over time. Similarly, it would be a mistake now to reason ‘it took hundreds of years for growth rates to increase from 0.3% to 3%, so it will be hundreds of years before growth could reach 30%’.103

So the fact that growth has never previously been anywhere near as high as 30% is not by itself a good reason to rule out explosive growth.

Relatedly, it would be unreasonable to assign an extremely low prior to 30% growth occurring.104 Priors assigning tiny probabilities to GWP growth increasing well above its observed range would have been hugely surprised by the historical GWP trend. They should be updated to assign more probability to extreme outcomes.

#### 6.1.8 Models predicting explosive growth have implausible implications

Summary of objection: The very same endogenous growth models that predict explosive growth by 2100 also predict that GWP will go to infinity in finite time. This prediction is absurd, and so the models shouldn’t be trusted.

This objection is in the spirit of a comment from economist Robert Solow:

It is one thing to say that a quantity will eventually exceed any bound. It is quite another to say that it will exceed any stated bound before Christmas.105

Response: Ultimately, I find this objection unconvincing.

Clearly, the economy cannot produce infinite output from a finite input of resources. Yet producing infinite output is exactly what certain endogenous growth models predict. There are two ways to interpret this result.

1. These models’ description of super-exponential growth is not realistic in any circumstances.
2. Endogenous growth models’ description of super-exponential growth is only realistic up to a certain point, after which it ceases to be realistic.

I favor the second explanation for two reasons.

Firstly, it is very common for scientific theories to be accurate only in certain bounded regimes. This is true of both the hard sciences106 and the social sciences.107 As such, pointing out that a theory breaks down eventually provides only a very weak reason to think that it isn’t realistic in any circumstances. The first explanation seems like an overreaction to the fact that the theory breaks down eventually.

Secondly, it is independently plausible that the mechanism for super-exponential growth will break down eventually in the face of physical limits.

The mechanism is more output → more capital → better technology → more output →… But this cycle will eventually run up against physical limits. Eventually, we will be using the fixed input of physical resources in the best possible way to produce output, and further increases in output will be capped. At this stage, it won’t be possible to reinvest output in such a way as to significantly increase future output and the cycle will fizzle out.

In other words, we have a specific explanation for why we will never produce infinite output that leaves open the possibility that explosive growth occurs in the medium term.

So the fact that super-exponential growth must approach limits eventually – which is all this particular objection establishes – is only weak evidence that we have already reached those limits.

In addition to the above, many models predict explosive growth without implying output rises to infinity in finite time. For example, Nordhaus (2021) and Aghion et al. (2017) consider a model in which goods production is fully automated but technological progress is still exogenous. This leads to a ‘type 1 singularity’ in which the growth rate increases without limit but output never reaches infinity in finite time. Similarly, the models in Lee (1993) and Growiec (2020) both predict significant increases in growth but again the growth rate remains finite.
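
To illustrate what a ‘type 1 singularity’ looks like, here is a minimal sketch (my own illustrative numbers) of the simplest such setup: goods production fully automated, Y = A K, with technology growing exogenously at 2% per year.

```python
# Toy 'type 1 singularity': Y = A*K, K_dot = s*Y, with A growing exogenously
# at 2%/yr. The growth rate of output rises without bound, yet output is
# finite at every date. Parameter values are illustrative.
s, g_A = 0.1, 0.02
A, K = 1.0, 1.0
for year in range(1, 201):
    Y = A * K
    K += s * Y          # reinvest a constant share of output
    A *= 1 + g_A        # exogenous technological progress
    if year % 50 == 0:
        print(f"year {year}: output growth = {A * K / Y - 1:.0%}")
```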

#### 6.2.1 The ancient data points used to estimate long-run explosive models are highly unreliable

Objection: We have terrible data on GWP before ~1500, so the results of models trained on this ‘data’ are meaningless.

Response: Data uncertainties don’t significantly affect the predictions of the long-run explosive models. However, they do undermine the empirical support for these models, and the degree of trust we should have in their conclusions.

#### 6.2.1.1 Data uncertainties don’t significantly alter the predictions of long-run explosive models

Despite very large uncertainties in the long-run GWP data, it is clearly true that growth rates used to be much lower than they are today. This alone implies that, if you fit endogenous growth models to the data, you’ll predict super-exponential growth. Indeed, Roodman fit his model to several different data sets, and did a robustness test where he pushed all the data points to the tops and bottoms of their uncertainty ranges; in all cases the median predicted date of explosive growth was altered by < 5 years. This all suggests that data uncertainties, while significant, don’t drive significant variation in the predictions of long-run explosive models.

Using alternative data series, like GWP/capita and frontier GDP/capita, changes the expected year of explosive growth by a few decades, but these series still imply explosive growth before 2100.108

I did a sensitivity analysis, fitting Roodman’s univariate model to shortened GWP data sets starting in 10,000 BCE, 2000 BCE, 1 CE, 1000 CE, 1300 CE, 1600 CE, and 1800 CE. In every case, the fitted model expects explosive growth to happen eventually. (This is no surprise: as long as growth increases on average across the data set, long-run explosive models will predict explosive growth eventually.) The median predicted date for explosive growth is increasingly delayed for the shorter data sets;109 the model still assigns > 50% probability to explosive growth by 2100 if the data starts in 1300 CE or earlier. Sensitivity analysis on shortened data sets.

So the predictions of explosive growth can be significantly delayed by completely removing old data points; the obvious drawback is that by removing these old data points you lose information. Apart from this, the predictions of long-run explosive models do not seem to be sensitive to reasonable alterations in the data.

#### 6.2.1.2 Data uncertainties undermine the empirical support for long-run explosive models

The long-run explosive models I’ve seen explain very long-run growth using the increasing returns mechanism. This mechanism implies growth should increase smoothly over hundreds and thousands of years.110

The data seems to show growth increasing fairly smoothly across the entire period 10,000 BCE to 1950 CE; this is a good fit for the increasing returns mechanism. However, I think the uncertainty of pre-modern data is great enough that the true data may show growth in the period 5000 BCE to 1600 CE to have been roughly constant. This would undermine the empirical support for the long-run explosive models, even if it wouldn’t substantially change their predictions.

Doubts about the goodness of fit are reinforced by the fact that alternative data series, like GWP/capita and frontier GDP/capita, fit the increasing returns mechanism less well than the GWP series does.

As an alternative to the increasing returns mechanism, you might instead place weight on a theory where there’s a single slow step-change in growth rates that happens between 1500 and 1900 (Ben Garfinkel proposes such a view here). Though a ‘slow step-change’ view of long-run growth rates will have a lesser tendency to predict explosive growth by 2100, it would not rule it out. To rule it out, such a view would have to explain why step-change increases in the growth rate occurred in the past, but more could not occur in the future.

Despite these concerns, it still seems likely to me that the increasing returns mechanism plays an important role in explaining the long-run growth data. This suggests we should place weight on long-run explosive models, as long as population is accumulable.

#### 6.2.2 Recent GWP growth shows that super-exponential growth has come to an end

Objection: Recently, GWP growth has been much lower than long-run explosive models have predicted. This shows that these models are no longer useful for extrapolating GWP.

Response: Roodman (2020) does a careful analysis of how ‘surprised’ his model is by the recent data. His model is somewhat surprised at how slow GWP growth has been since 1970. But the data are not in very sharp conflict with the model and only provide a moderate reason to distrust the model going forward.

We can assess the size of the conflict between the model and the recent data in three ways: eyeballing the data, quantifying the conflict using Roodman’s model, and comparing the recent slowdown to historical slowdowns.

(Note, by ‘slowdown’ I mean ‘period where growth either remains at the same level or decreases’. This is a ‘slowdown’ compared to the possibility of super-exponential growth, even if growth remains constant.)

#### 6.2.2.1 Eyeballing how much the recent data conflicts with Roodman’s model

First, here’s the graph we saw earlier of GWP against time. Though the recent points deviate slightly from Roodman’s trend, the difference is not significant. It looks smaller than previous historical deviations after which the trend resumed again.

A representation that highlights the deviation from the expected trend more clearly is to plot GWP against its average growth in the following period:

The last five data points indicate the growth after 1970 is surprisingly low. But again they do not seem to be in very sharp conflict with the trend.

#### 6.2.2.2 Quantifying how much the recent data conflicts with Roodman’s model

It’s possible to quantify how surprised Roodman’s model is by a data point, given the previous data points (more). The results are that:

• 1980 GWP is between the 40th and 50th percentiles, so isn’t surprising.
• 1990, 2000, 2010, and 2019 GWP are between the 20th and 30th percentiles, so are surprising but not hugely surprising. If Roodman’s model incorporated serial correlation between random deviations from the underlying trend, the surprise would be smaller still.

#### 6.2.2.3 The recent slowdown is large compared to other slowdowns in GWP growth

Growth in the period 1970 – 2020 has been slower than previously. During this time the economy has increased in size by a factor of 5.4. We can compare this to previous slowdowns after which the long-run super-exponential trend reasserted itself. If the recent growth slowdown is similar in size or smaller, this weakly suggests that the super-exponential trend will reassert itself once again, by analogy with previous slowdowns.

There are a couple of other slowdowns in GWP growth in the historical data:

• Growth in the period 200 BCE – 1000 CE was consistently slower than in the previous thousand years. In this time the economy increased in size by a factor of 1.7.
• Growth in the period 1200 CE – 1400 CE was slower than the previous period. In this time the economy did not increase in size.

So it seems the recent slowdown is shorter than previous slowdowns in terms of calendar years but longer when measured by the fractional increase of GWP.111 This weakly suggests the slowdown is not just random, but rather the result of some systematic factor. The return to super-exponential growth after past slowdowns is not a strong indicator that we’ll return to super-exponential growth after the current one.

The next section aims to strengthen this evidence further, by focusing on the growth of frontier economies (e.g. US, UK, France), rather than GWP growth alone.

#### 6.2.2.4 So what?

If we think the demographic transition explains the recent slowdown, we may not be moved by this objection. I argued in the main report that we can think of highly substitutable AI as reversing the demographic transition, after which we would expect super-exponential growth to resume. The report’s basic thesis that sufficiently advanced AI could lead to explosive growth is consistent with the recent data.

Alternatively, we might have a more agnostic approach to the causes of long-run growth and the recent slowdown (i.e. the ignorance story). In this case, the recent data provides a stronger reason to reduce the probability we assign to explosive growth. However, it doesn’t provide a decisive reason: the recent data is not hugely improbable according to Roodman’s model.

#### 6.2.3.1 Summary of objection

The prolonged lack of super-exponential growth of GDP per capita in frontier countries is striking. US per capita income has grown steadily at 1.8% for 150 years (since 1870), and other frontier countries show similar trends. The only reason GWP data doesn’t show the same pattern is catch-up growth. The lack of super-exponential growth over such a long period is strong evidence against long-run explosive models.

Even the trend in frontier GDP/capita may be overly generous to long-run explosive models. Frontier GDP/capita has recently been boosted from a number of one-off changes: e.g. the reallocation of people of color from low wage professions to high wage professions, the entry of women into the workforce, and improved educational achievement. Hsieh et al. (2013) estimates that improvements in the allocation of talent may explain a significant part of U.S. economic growth over the last 60 years.112 If we adjusted for these factors, the trend in frontier GDP/capita would likely be even more at odds with the predictions of long-run explosive models.

This strengthens the objection of the previous section.

#### 6.2.3.2 Elaboration of objection

This objection is hard to spell out in a conceptually clean way because endogenous growth models like Roodman’s are only meant to be applied to the global economy as a whole, and so don’t necessarily make explicit predictions about frontier growth. The reason for this is that the growth of any part of the global economy will be influenced by the other parts, and so modeling only a part will necessarily omit dynamics relevant to its growth. For example, if you only model the US you ignore R&D efforts in other countries that are relevant to US growth.

Nonetheless, I do feel that there is something to this objection. GWP cannot grow super-exponentially for long without the frontier growing super-exponentially.

In the rest of this section I:

• Suggest the size of the ‘frontier growth slowdown’ is about twice as big as the already-discussed GWP slowdown.
• Suggest that the most natural application of Roodman’s univariate model to frontier growth allows the objection to go through.

(Again, I use ‘slowdown’ to refer to a period of merely exponential growth, which is ‘slower’ only relative to the alternative of super-exponential growth.)

#### 6.2.3.2.1 How much bigger is the frontier growth slowdown than the GWP slowdown?

I have briefly investigated the timescales over which frontier growth has been exponential, rather than super-exponential, by eyeballing GDP and GDP/capita data for the US, England, and France. My current opinion is that the frontier shows clear super-exponential growth if you look at data from 1700, and still shows super-exponential growth in data from 1800. However data from about 1900 shows very little sign of super-exponential growth and looks exponential. So the slowdown in frontier growth is indeed more marked than that for GWP growth. Rather than just 50 years of slowdown during which GWP increased by a factor of 5.4, there’s more like 120 years of slowdown during which GDP increased by about 10-15X.113

My current view is that considering frontier GDP/capita data increases the size of the deviation from the super-exponential trend by a factor of 2-3 compared to just using GWP data. This is because the deviation’s length in calendar time is 2-3 times bigger (120 years rather than 50 years) and the GDP increase associated with the deviation is 2-3 times bigger (GDP increases 10-15X rather than 5X). Recent frontier growth poses a bigger challenge to the explosive growth theory than recent GWP growth.

This is consistent with the results Roodman got when fitting his model to French per capita GDP. Every observation after 1870 was below the model’s predicted median, and most lay between the 20th and 35th percentiles. The model was consistently surprised at the slow pace of progress.

#### 6.2.3.2.2 The simplest way of extending Roodman’s model to frontier countries implies they should grow super-exponentially

Roodman’s model implies that GWP should grow super-exponentially but does not say how the extent to which this growth results from frontier vs catch-up growth should change over time.

The simplest answer seems to be that both frontier and catch-up growth is super-exponential. The same story that explains the possibility of super-exponential growth for the total world economy – namely increasing returns to endogenous factors including technology – could also be applied to those countries at the frontier. If frontier countries invested their resources in helping others catch up we might expect something different. But on the realistic assumption that they invest in their own growth, it seems to me like the story motivating Roodman’s model would predict super-exponential growth at the frontier.

The lack of frontier super-exponential growth is especially surprising given that frontier countries have been significantly increasing their proportional spend on R&D.114 Roodman’s model assumes that a constant fraction of resources are invested and predicts super-exponential growth. How much more surprising that we see only constant growth at the frontier when the fraction of resources spent on R&D is increasing! The expansion of the size of the frontier (e.g. to include Japan), increasing the resources spent on frontier R&D even further, strengthens this point.

Response: deny the frontier should experience smooth super-exponential growth

A natural response is to posit a more complex relationship between frontier and catch-up growth. You could suggest that while GWP as a whole grows at a fairly smooth super-exponential rate, progress at the frontier comes in spurts. The cause of GWP’s smooth increase alternates between spurts of progress at the frontier and catch-up growth. The cause of this uneven progress on the frontier might be an uneven technological landscape, where some advances unlock many others in quick succession but there are periods where progress temporarily slows.

I think that accepting this response should increase our skepticism about the precise predictions of Roodman’s model, moving us from the explosive-growth story towards the ignorance story. It would be a surprising coincidence if GWP follows a predictable super-exponential curve despite frontier growth being the result of a hard-to-anticipate and uneven technological landscape.115 So, for all we know, the next spurt of frontier progress may not happen for a long time, or perhaps ever.

#### 6.2.3.3 So what?

Again, this objection may not move you much if you explain the slowdown via the demographic transition. The recent data would not undermine the belief that super-exponential growth will occur if we get sufficiently substitutable AI.

If you are more agnostic, this will provide a stronger reason to doubt whether explosive growth will occur. The length of the slowdown suggests a structural break has occurred, and the super-exponential trend has finished (at least temporarily). Still, without an explanation for why growth increased in the past, we should not rule out more increases in the future. 120 years of exponential growth, after centuries of increasing growth rates, suggests agnosticism about whether growth will increase again in the next 80 years.

#### 6.2.4 Long-run explosive models don’t anchor predictions to current growth levels

Objection: The models predicting explosive growth within a few decades typically expect growth to already be very high. For example, the median prediction of Roodman’s model for 2020 growth is 7%. Its predictions aren’t anchored sufficiently closely to recent growth. I analyze this problem in more detail in an appendix.

Response: I developed a variant of Roodman’s model that is less theoretically principled but models a correlation between growth in adjacent periods. This ‘growth differences’ model anchors its predictions about future growth to the current GWP growth rate of 3%.

The model’s median predicted year for explosive growth is 2082 (Roodman: 2043), a delay of about 40 years; its 80% confidence interval is [2035, 2870] (Roodman: [2034, 2065]). This suggests that adjusting for this problem delays explosive growth but still leaves a significant probability of explosive growth by 2100. Explanation of the model I developed.

I find this model most useful as an ‘outside-view’ that projects GWP based solely off past data, without taking into account specific hypotheses like ‘the demographic transition ended the period of super-exponential growth’, or ‘we’d only expect to see super-exponential growth again once advanced AI is developed’. If we embrace specific inside-view stories like these, we’d want to make adjustments to the model’s predictions. (For the examples given, we’d want to further delay the predicted dates of explosive growth based on how far we are from AI that’s sufficiently advanced to boost the growth rate.)

How might we adjust the model’s predictions further based on our beliefs about AI timelines?

Suppose you think it will be (e.g.) three decades before we have AI systems that allow us to increase the rate of growth (systems before this point might have ‘level effects’ but not noticeably impact growth). You could make a further adjustment by assuming we’ll continue on our current growth trajectory for three decades, and then growth will change as shown in the graph. In other words, you’d delay your median predicted year for explosive growth by another 30 years to about 2110. However, you’ll still assign some probability to explosive growth occurring by the end of the century.

I plotted the 10th, 50th, and 90th percentiles over GWP from three methods:116

• Surveying economists about GWP/capita and combining their answers with UN population projections to forecast GWP (the ‘standard story’).
• Fitting David Roodman’s growth model to long-run historical GWP data (the ‘explosive growth story’).
• Fitting my variant on Roodman’s model to long-run GWP data (‘growth differences’).

I am currently inclined to trust the projections somewhere in between growth differences and Roodman’s model if we develop highly substitutable117 AI systems (though I don’t think any model is a reliable guide to growth in this scenario), and the projections of the standard story if we don’t.

See code producing these plots at the bottom of this notebook. (If the link doesn’t work, the colab file can be found in this folder.)

#### 6.2.5 Long-run explosive models don’t discount pre-modern data

Objection: Roodman’s model downweights ancient data points for their uncertainty, but it does not additionally downweight them on the basis that they are less relevant to our current growth regime. More recent data is more likely to be relevant because the underlying dynamics of growth may have changed.

Response: My ‘growth-differences’ model allows the user to specify the rate at which ancient data points are discounted.118 For my preferred discount rate,119 this delays explosive growth by another 15 years to ~2090; it still assigns a 10% chance of explosive growth by 2040. Adjusting for this problem delays explosive growth further but leaves a significant probability of explosive growth by 2100.120 Again, if you think AI won’t start to affect growth for several decades, you would need to delay your median projection further (see more).

I also perform a sensitivity analysis on the effects of removing pre-modern data points. I find that the prediction of explosive growth by 2100 is robust to removing data points before 1300, but not to removing data points before 1600 (see more).

#### 6.2.6 Long-run explosive models don’t seem to apply to the time before the agricultural revolution; why expect them to apply to a growth regime in the future?

Summary of objection: Roodman (2020) does the most sophisticated analysis of the fit of his model to data before 10,000 BCE. He finds that if he fits his model to data from 1 million years ago to the modern day, the estimated model is not a good fit to the data series. It confidently predicts that civilization will collapse within the first few hundred thousand years, with a 98% chance of eventual collapse. Given that Roodman’s model did not describe a previous era – that of hunter-gatherers – we should not trust its predictions about a future era of supposed explosive growth.

Response: I think this objection might justify agnosticism about explosive growth, but it doesn’t justify confidence that explosive growth will not occur.

Let’s distinguish between three attitudes towards explosive growth:

1. Confidence that explosive growth will occur (explosive growth story).
2. Ignorance about whether explosive growth will occur (ignorance story).
3. Confidence that explosive growth won’t occur (standard story).

I think that, at most, this objection might move you from Attitude 1 towards Attitude 2. It’s not an argument for Attitude 3. The objection provides a reason to doubt the predictions of Roodman’s model, but doesn’t provide any specific reason to rule out explosive growth.

I personally regard this objection as only a weak argument against Attitude 1. This is because a key part of technological progress, the driver of super-exponential growth, is the ability of new ideas to spread throughout society. But human societies with natural language only developed 50,000 – 150,000 years ago.121 So we wouldn’t expect Roodman’s model to be accurate before this point. As Roodman points out:

Through language, humans could share ideas more efficiently and flexibly than any organism before. Arguably, it was then that technology took on its modern, alchemical character as a force in economic development. Before, hominins had developed important technologies such as handaxes. But it is not obvious that those intellectual mutations spread or evolved any faster than the descendants of those who wrought them. After, innovations could diffuse through natural language, the first new medium of arbitrary expressiveness on Earth since DNA.

In addition, humans couldn’t accumulate capital until we became sedentary. This happened around the neolithic era, giving another reason to think growth dynamics would be different before 10,000 BCE.

## 7. Appendix B: Constant exponential growth is a knife-edge condition in many growth models

The growth literature has found it very difficult to find a satisfactory theoretical explanation for why long-term growth would be exponential, despite decades of effort. In many endogenous growth models, long-run growth is only exponential under knife-edge conditions. This means that constant exponential growth only occurs when some parameter is exactly equal to some value; the smallest disturbance in this parameter leads to a completely different long-run behavior, with growth either going to infinity or to 0. Further, it seems that these knife-edge conditions are problematic: there’s no particular reason to expect the parameter to have the precise value that leads to constant exponential growth.

I argue the best candidates for addressing this problem are semi-endogenous models. Here the ‘knife-edge condition’ is merely that the population grows exponentially.122 For this and other reasons discussed in this section, I place more weight on semi-endogenous models (~75%) than on any other models in explaining the recent trend of exponential growth.

The UN forecasts that population growth will slow over the 21st century. When you plug this assumption into semi-endogenous growth models, they predict that GDP/capita growth will slow. This raises my probability that 21st century growth will be sub-exponential. The difficulty of finding a non-knife-edge explanation of exponential growth also raises my credence that the pattern of exponential growth is transitional, rather than the beginning of a steady-state regime.123 Nonetheless, I still assign substantial probability (~20%) that there is some mechanism generating exponential growth that will continue to function until 2100, although I’m not sure what it would be.

The rest of this section is as follows:

• I explain my intuitive understanding of the claim that constant exponential growth is an unmotivated knife-edge condition (here).
• I review the knife-edge conditions in a number of endogenous growth models (here).
• This section also makes some other objections to certain models, explaining my preference for semi-endogenous models.
• I briefly review the sub-literature claiming that a very large class of models have knife-edge conditions for exponential growth (here).
• I discuss a recent model that claims to produce exponential growth without knife-edge conditions (here).

#### 7.1 An intuitive explanation for why exponential growth might be a knife-edge condition in endogenous growth models

Let’s focus on the endogenous factor of technology. Assume that we invest a constant fraction of output into technology R&D. This investment causes the level of technology to improve by a certain percentage each year. We’re interested in how this percentage changes over time, as technology advances. In other words, we’re interested in how the rate of technological progress changes over time, with this progress measured as a percentage.

As technology improves, there are (at least) two things that might affect the rate of future progress. Firstly, in the future there may be less low hanging fruit as we have made all the easy technological discoveries and only difficult ones remain. Call this the fishing out effect. Secondly, we can use the new technology in our future research, increasing the effectiveness of future R&D efforts (e.g. use of the internet). Call this the standing on shoulders effect.

These two effects point in opposite directions, but there is no reason to expect them to cancel out exactly. The fishing out effect relates to the landscape of technological discoveries, and how quickly the easy discoveries dry up; the standing on shoulders effect relates to the extent to which we can harness new technologies to improve the process of R&D. The two effects relate to very different things, so by default we should expect them not to cancel out exactly. And so we should expect the rate of technological progress either to speed up or to slow down, depending on which effect is more powerful; there’s no reason to think that the rate of progress should stay exactly constant over time. This would be like giving one tennis player a broken arm and their opponent a broken leg, and expecting the two effects to cancel out exactly.

More nuanced models add additional factors that influence the rate of technological progress (e.g. the ‘stepping on toes effect’). But these additional factors don’t make it any more plausible that everything should cancel out and growth should be exponential.

The conclusion of this line of thinking is that, theoretically speaking, we shouldn’t expect technology to grow exponentially.
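
To make this intuition concrete, here is a minimal numerical sketch (my own illustration, with made-up parameter values and function names) of the law of motion Ȧ = s·A^φ, where the exponent φ summarizes the net of the fishing out and standing on shoulders effects. Only φ = 1 keeps the growth rate constant; values slightly below or above 1 make growth fizzle or explode.

```python
import numpy as np

np.seterr(over="ignore")  # the phi > 1 run deliberately overflows to inf

def growth_rate_path(phi, s=0.02, A0=1.0, years=600, dt=0.1):
    """Euler-integrate dA/dt = s * A**phi and record the proportional
    growth rate (dA/dt)/A = s * A**(phi-1) once per year."""
    A = np.float64(A0)
    rates = []
    steps_per_year = int(1 / dt)
    for step in range(int(years / dt)):
        if step % steps_per_year == 0:
            rates.append(s * A ** (phi - 1))
        A += s * A ** phi * dt
    return rates

for phi in (0.9, 1.0, 1.1):
    r = growth_rate_path(phi)
    print(f"phi={phi}: growth rate {r[0]:.3f} -> {r[-1]:.3f} after 600 years")
```

With these arbitrary numbers, φ = 0.9 sends the growth rate towards zero, φ = 1 keeps it pinned at its initial value, and φ = 1.1 sends it towards infinity (the run overflows to inf), which is exactly the knife-edge behavior described above.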

A similar argument can be applied to output as a whole, rather than just technology. Consider a growth model where all inputs are endogenous. The intuition behind the argument is that some factors suggest growth should increase over time, other factors suggest growth should slow over time; further, there’s no particular reason to expect these factors to cancel out exactly. So we should expect growth to either slow down, or speed up over time.

More precisely, we’re interested in the percentage increase in the total output each year. We want to know how this percentage changes over time as total output increases. There are again (at least) two effects relevant to this question.

The first effect is that, as the endogenous inputs to production increase over time, they become harder to increase by a fixed percentage. This is true because i) a fixed percentage is an increasingly large absolute amount, ii) there may be diminishing marginal returns to efforts to improve the factor, and iii) because of other complex factors.124 If inputs are harder to increase by a fixed percentage, then output as a whole is also harder to increase by a fixed percentage. Let’s call this effect percentage improvements become harder; it roughly corresponds to the fishing out effect in the previous section.

The second effect is that, as the endogenous inputs increase, we have more resources to invest in increasing the inputs. This increased investment allows greater absolute increases to be made to the inputs, and so to output as a whole. Call this effect greater investment; it corresponds to the standing on shoulders effect from the previous section.

Again, these two effects point in opposite directions. The percentage improvements become harder effect suggests growth will slow over time, the greater investment effect suggests that growth will increase. Again, I know of no reason to think these effects should exactly cancel out.125 If they don’t cancel, growth won’t be exponential.

To be clear, I do not think that this intuitive argument is by itself sufficient to establish that exponential growth is a knife-edge condition and highly surprising. I include it because it generalizes the specific arguments I make below in the context of specific models.

#### 7.2 Knife-edges in popular endogenous growth models

Most endogenous growth models can be broadly divided into two camps: accumulation based models and idea-based models.126 In the former, the ultimate source of growth in GDP/capita is the accumulation of physical or human capital. In the latter, the ultimate source of growth is targeted R&D leading to technological progress; although there is capital accumulation, it isn’t the ultimate source of growth.127

I will discuss the knife-edge conditions in popular growth models of both types. I think the knife-edge conditions are more problematic in the idea-based models, although accumulation based models face further objections of their own.

Very little of the content here is original; knife-edge critiques of endogenous models are discussed in Cesaratto (2008), Jones (1999), and Jones (2005). The problem is often discussed with different terminology, referring to the difficulty of avoiding ‘scale effects’ or the ‘linearity critique’ of endogenous growth models. I expect all economists familiar with endogenous growth models will be aware that knife-edge assumptions are typically needed for constant exponential growth. I expect most of them won’t draw my conclusion: that the best account that avoids knife-edge conditions implies that 21st century growth will be sub-exponential.

One strong objection to accumulation based models that I don’t discuss in this report is their tension with growth accounting exercises, e.g. Fernald and Jones (2014). These empirical exercises decompose growth into its constituent parts, and typically find that TFP growth, rather than the accumulation of physical or human capital, accounts for the majority of growth. I think this gives us a good reason to prefer idea-based models.

#### 7.2.1 Accumulation based models

Perhaps the most standard mechanism for growth here is the accumulation of physical capital. This is the strategy of the AK model, and variants thereof. I’ll start by discussing the model of Frankel (1962) and the variant proposed by Arrow (1962). Then I’ll briefly comment on some other capital accumulation models.

#### 7.2.1.1 Frankel (1962)

The production function in Frankel (1962) starts out as:

$$Y=AK^α(BL)^{1−α}$$

where B is labor augmenting technology. Technological progress is endogenous and happens as a by-product of capital accumulation. The equation for B is:

$${B}= (\frac {K}{L})^γ$$

Frankel assumes γ = 1. In other words, labor augmenting technology equals capital per worker: twice as much capital per worker makes workers twice as productive. With this assumption production is simply:

$$Y=AK$$

Here and in all other models in this section, I assume the standard reinvestment equation for capital: K̇ = sY − δK. Combined with Y = AK, this gives K̇ = (sA − δ)K, so K, and hence Y, grows exponentially.

The knife-edge condition is γ = 1. To simplify the analysis, assume L is constant. If γ > 1, there are increasing returns to K and Y goes to infinity in finite time.128 If γ < 1, there are diminishing returns to K and growth tends to 0.

Is this knife-edge condition problematic? I think so. It claims that doubling the amount of capital per worker exactly doubles the productivity per worker. But why not think it would increase productivity by a factor of 1.9, or 2.1?

The problem becomes more acute when we realize that there are two distinct mechanisms by which capital accumulation increases labor productivity. The first is that each worker has more machinery to work with, increasing their productivity.129 The second mechanism is that capital accumulation leads to new technologies via the process of ‘learning by doing’. These improvements have spillover effects as new technologies can be adopted by all firms. But it is mysterious why these two very different mechanisms should combine such that γ = 1 exactly. If the spillover effects were ever so slightly bigger or smaller, or if the benefits of having more machinery were ever so slightly bigger or smaller, growth would go to 0 or infinity rather than being constant.
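
To illustrate (a minimal sketch of Frankel’s setup, with made-up parameter values and function names of my own), the simulation below integrates Y = A·K^α·(B·L)^(1−α) with B = (K/L)^γ and K̇ = sY − δK, and reports the growth rate of Y after a century for γ just below, at, and just above 1.

```python
import numpy as np

def frankel_final_growth(gamma, alpha=0.3, A=1.0, L=1.0, s=0.2,
                         delta=0.05, K0=1.0, years=100, dt=0.05):
    """Euler-integrate Frankel's model: Y = A*K**alpha*(B*L)**(1-alpha),
    B = (K/L)**gamma, dK/dt = s*Y - delta*K. Return Y's growth over the last year."""
    K = np.float64(K0)
    Y_by_year = []
    steps_per_year = int(1 / dt)
    for step in range(int(years / dt)):
        Y = A * K ** alpha * ((K / L) ** gamma * L) ** (1 - alpha)
        if step % steps_per_year == 0:
            Y_by_year.append(Y)
        K += (s * Y - delta * K) * dt
    return Y_by_year[-1] / Y_by_year[-2] - 1

for gamma in (0.95, 1.0, 1.05):
    print(f"gamma={gamma}: annual growth of Y after ~100 years = "
          f"{frankel_final_growth(gamma):.1%}")
```

With these made-up parameters, γ = 1 keeps growth roughly constant at about sA − δ per year, while γ = 0.95 has growth already falling on its way to zero and γ = 1.05 has it rising without bound.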

Robert Solow comments, on this topic, ‘This version of the endogenous-growth model is very unrobust. It can not survive without exactly constant returns to capital. But you would have to believe in the tooth fairy to expect that kind of luck.’130 In support of this comment, I argue in this technical appendix that even the constancy of 20th century growth wouldn’t convince us that long-run growth would be constant if we believed that Frankel’s growth model was literally correct.

There is another problem with Frankel’s AK model. In order to get constant returns to capital accumulation, but avoid increasing returns to capital and labor in combination, the model removes the effect of labor on output entirely. The seemingly absurd implication is that adding more workers won’t increase output. A defense might be that the model is intended for a simplified setting where labor is constant. If so, then the model doesn’t seem to be appropriate for explaining the recent period of growth, during which there has been significant population growth.

One last thing to note about this AK model is that if there is any capital augmenting technological progress (e.g. an increase in A), this will increase growth.131

#### 7.2.1.2 Arrow (1962)

Arrow (1962) develops a similar AK model.132 His definition of labor augmenting technology depends on the total capital accumulated rather than the capital accumulated per person.

$$B=K^γ$$

with γ < 1.133 This leads to:

$$Y=AK^α(BL)^{1−α}=AK^μL^{1−α}$$

with μ = α + γ(1 – α) < 1.

This model does not, in my view, have a problematic knife-edge. However, it does imply that growth will be sub-exponential over the 21st century.

The growth rate of y = Y/L turns out to be:

$$g_y=g_{L} \frac {γ}{(1−γ)}$$

If the labor force doesn’t grow, then neither will GDP/capita. This prediction is not actually falsified by observation, as the population has grown continuously since the industrial revolution. In fact, I think that exponential population growth is the most plausible root explanation for the historically observed pattern of exponential growth.
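
As a worked illustration with hypothetical numbers (my own, not Arrow’s): with γ = 0.3 and labor force growth of 1% per year,

$$g_y=g_{L} \frac {γ}{(1−γ)}=0.01 × \frac {0.3}{0.7} ≈ 0.43\% \text{ per year}$$

and as g_L falls towards zero, so does g_y.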

This model is structurally very similar to the semi-endogenous model developed by Jones that I discuss later. In both models, the ultimate driver of exponential income growth is exponential growth in labor. Both models imply that growth over the 21st century will be sub-exponential, as population growth is expected to slow.

(A quick aside: if capital were perfectly substitutable with labor – the AI robot scenario – then this model predicts explosive growth. In this scenario, capital can play the same role as labor in production and so, if AI robots are cheaper than human labor, the model will ultimately approximate: Y = AK^(1 + γ(1−α)). There are increasing returns to capital accumulation and so super-exponential growth. This is just to demonstrate that some accumulation models do imply that this scenario would lead to explosive growth.)

#### 7.2.1.3 Other capital accumulation stories

Jones and Manuelli (1990) develop a model in which the returns to capital fall, but rather than falling to 0 as in most models, they fall to a positive constant and then stay at that constant. This means that capital accumulation is sufficient for sustained growth. Growth from capital accumulation will be sub-exponential while the returns to capital diminish towards the constant, and exponential afterwards.

For this model to explain the recent period of exponential growth, then, it must claim that returns to capital have long ago diminished to their lowest possible value, and are now constant. Intuitively, this claim doesn’t seem plausible: returns to capital would diminish further if we equipped every worker with the highest quality equipment possible. Putting that aside, though, the model in essence behaves in the same way as the AK model in the regime where returns to capital are constant. So the same problems we saw above will apply.

Indeed, the knife-edge analogous to the one considered above applies. In the limit where returns to capital are constant we have:

$$\frac {dY}{dK}=K^ϕ$$

with φ = 0. If φ > 0, growth from capital accumulation is super-exponential; if φ < 0, growth goes to 0. We can ask why φ = 0. The value of φ is again plausibly the product of two mechanisms: additional capital can be used directly to produce more output; accumulating capital involves some ‘learning by doing’ and produces new technologies that can be copied by others. I can see no reason for these two mechanisms to lead to exactly constant returns. Ultimately, I think Jones and Manuelli (1990) faces the same objections as the AK model; its main advantage is that it formally acknowledges diminishing returns to capital (though not during the regime where exponential growth is occurring).

Another way capital accumulation can lead to sustained growth is by using a CES production function where the elasticity of substitution between capital and labor is above 1. In this case, as with Jones and Manuelli, the returns to capital diminish initially and then approach some constant. While the returns are diminishing, growth from capital accumulation is sub-exponential; in the limit where these returns are constant, growth from capital accumulation is exponential. In the limit the model faces the same ‘knife-edge objection’ as Jones and Manuelli (1990): why would the direct and spillover effects of capital accumulation net out at exactly constant returns?134

There is another problem for the CES production function approach. In the limit where growth is exponential, the capital share is 1. The capital share has been around 0.3 for the last 50 years (although it has recently increased somewhat), so this model wouldn’t offer a good explanation of the recent period of exponential growth.

#### 7.2.1.4 Human capital accumulation

Lucas (1988) suggests the ultimate driver of growth is not physical but human capital. The model is as follows:

$$Y=AK^α(lhL)^{1−α}$$

$$\dot h=ϕh(1−l)$$

where h is human capital per person, l is the proportion of time spent working, 1 – l is the proportion of time spent increasing h, φ is a constant, and A is a constant.

The knife-edge here is that ḣ = constant × h^φ with φ = 1 exactly. If φ < 1, there would be diminishing returns to human capital accumulation and growth would fizzle out; if φ > 1 growth would go to infinity in finite time.

Is this knife-edge problematic? Again, I think so. There are two possible interpretations of h;135 I think the condition is problematic in both cases.

The first interpretation is that h is the knowledge and skills of an individual agent; 1 – l is the proportion of their time they spend studying.136 Here, the knife-edge φ = 1 means that if you know twice as much, you can learn and teach exactly twice as quickly. But why not think it allows me to learn only 1.9 times as quickly, or 2.1 times as quickly? Why is my learning speed exactly proportional to my knowledge? As with physical capital, there are both direct and spillover benefits of increasing h. The direct benefit is that I leverage my knowledge and skills to learn more effectively in the future. The spillover effect is that others may copy my discoveries and knowledge; this can help their future learning. It is again problematic that these two distinct effects combine to give φ = 1 exactly.

There’s another problem with this first interpretation: an individual’s knowledge and skills are constrained by their finite mind and lifespan. They can’t grow exponentially without limit, but must ultimately hit diminishing returns.

The second interpretation is that h represents all the accumulated technical and scientific knowledge of humanity; 1 – l is the proportion of people who are scientists.137 φ = 0 would mean that each absolute increase in knowledge was equally difficult. φ = 1 means that if humanity knows twice as much, an absolute increase in our knowledge becomes exactly twice as easy to achieve. This is a very particular degree of increasing returns.138 There are (at least) two relevant effects. If we know more, then perhaps we’ve made all the easy discoveries and new ideas will be harder to find (‘fishing out’). Or perhaps our knowledge will make our future learning more effective (‘standing on shoulders’). I see no reason to think these forces should net out so that φ = 1 exactly.

The second interpretation faces another severe problem: the rate of knowledge discovery depends on the fraction of people who are scientists, but not on their absolute number. If we alter this so that ḣ increases with L, then (still assuming φ = 1) an exponentially growing population would lead to an exponentially increasing growth rate.

(A quick aside: if capital were perfectly substitutable with labor – the AI robot scenario – then this model would display constant returns to accumulable inputs L and K. If h – which would then be interpreted as ‘AI robot capital’ rather than ‘human capital’ – continues to increase, then output will grow super-exponentially. This is again to demonstrate that some accumulation models do imply that this scenario would lead to explosive growth. However, if the model was adjusted to include a fixed factor so that there were slightly diminishing returns to capital and labor, then AI robots would not lead to explosive growth. Instead they would lead to a one-off step-change in growth rates, assuming that h continued to grow exponentially.)

#### 7.2.2 Idea-based models

I’ve argued that some central physical capital accumulation and human capital accumulation models don’t provide compelling explanations of the observed pattern of exponential growth, partly because they make problematic knife-edge assumptions.

One general drawback of accumulation-based models is that they don’t directly engage with what seems to be an important part of the rise in living standards over the last 100 years: discovery of new ideas through targeted R&D. Private and public bodies spend trillions of dollars each year on developing and implementing new technologies and designs that are non-rival and can eventually be adopted by others.

Idea-based models represent this process explicitly, and see it as the ultimate source of growth. Whereas accumulation models emphasize that growth involves increasing the number of physical machines and gadgets per person (perhaps with technological progress as a side-effect), idea-based models emphasize that it involves purposely developing new (non-rival) designs for machines, gadgets, and other technologies.

This section is heavily based on Jones (1999). I simply pull out the relevant points.

Jones groups idea-based models into three camps based on important structural similarities between them:

1. R / GH / AH
2. Y / P / AH / DT
3. J / K / S
• These are from Jones (1995), Kortum (1997) and Segerstrom (1998). These are known as semi-endogenous growth models.
• The knife-edge condition is that there’s exactly exponential growth in the number of workers.

I think the knife-edge conditions for exponential growth for R / GH / AH and Y / P / AH / DT models are just as problematic, if not more problematic, than those for accumulation based models discussed above.

For semi-endogenous models (J / K / S), the knife-edge condition is much less problematic. Indeed we know empirically population growth has been roughly exponential over the last 100 years.139 However, this will not continue until 2100. The UN projects that population growth will slow; J / K / S semi-endogenous models imply GDP/capita growth will slow as a result.

#### 7.2.2.1 R / GH / AH models

Output is given by:

$$Y=A^σK^α{L_Y}^{1−α}$$

LY is the number of workers in goods production. There are constant returns to K and LY, and increasing returns to K, LY and A.

New ideas are produced via:

$$\dot A=δA^ϕL_A$$

for some constant δ. LA is the number of workers in knowledge production. A constant fraction of people do research: LA = fL, LA + LY = L.

The knife-edge assumption is φ = 1. If φ ≠ 1, then growth over time either goes to 0 or infinity, as in the above examples. To repeat my comments on Lucas (1988): there are (at least) two relevant mechanisms affecting φ. If A is larger, then perhaps we’ve made all the easy discoveries and new ideas will be harder to find (‘fishing out’). This suggests a lower value of φ. Conversely, perhaps we can leverage our knowledge to make our future learning more effective (‘standing on shoulders’). I see no reason to think these forces should net out so that φ = 1 exactly.

#### 7.2.2.2 Y / P / AH / DT models

$$Y=NZ^σK^α{L_Y}^{1−α}$$

where N is the number of product lines and Z is the average level of technology per product line.

The number of products increases with the size of the total population:

$$N=L^β$$

The rate of technological progress depends on the number of researchers per product line:

$$\dot Z= \frac {δZ^ϕL_A}{N}=δZ^ϕ{L_A}L^{−β}$$

It turns out that exponential growth relies on two knife-edge conditions in this model: β = 1 and φ = 1.

If φ ≠ 1, then growth over time either goes to 0 or infinity, as above. And again, the assumption that φ = 1 involves a very specific degree of increasing returns to knowledge accumulation despite plausible mechanisms pointing in different directions (‘fishing out’ and ‘standing on shoulders’).

If β ≠ 1, the number of researchers per firm changes over time, and this changes the growth rate.

#### 7.2.2.3 J / K / S models

We can represent these models as:

$$Y=A^σK^α{L_Y}^{1−α}$$

$$\dot A=δ{L_A}^{λ}A^ϕ$$

$$\dot L=nL$$

with n > 0, φ < 1 and λ < 1. As before, we assume that a constant fraction of people do research: LA = fL, LA + LY = L.

The exponential growth in L drives exponential growth in A: φ < 1 implies each new % increase in A requires more research effort than the last, but the exponentially growing labor force is able to meet this requirement. Exponential growth in A then drives exponential growth in Y and K, and thus of GDP/capita.
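
A quick sketch of where the exponential rate comes from (my own restatement of the standard steady-state result): write g_A ≡ Ȧ/A = δ{L_A}^λA^(ϕ−1). Along a path where g_A is constant, the log-derivative of the right-hand side must be zero, and since L_A = fL grows at rate n this gives

$$λn+(ϕ−1)g_A=0 \quad\Rightarrow\quad g_A= \frac {λn}{1−ϕ}$$

So exponentially growing labor pins down a constant, positive growth rate of A (and hence of GDP/capita), provided ϕ < 1.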

Often L is made exogenous, but Jones (1997) makes it endogenous, using fertility equations such that population growth tends to a positive constant in the long-run.

The knife-edge condition here is the exponential growth of labor: L̇ = nL^φ with φ = 1 exactly.

Jones (1997) justifies this by appealing to biology: ‘it is a biological fact of nature that people reproduce in proportion to their number’. Indeed, population growth was positive throughout the 20th century for the world as a whole or for the US.140 So it does seem that the model matches the rough pattern of 20th century growth.

Population growth fell over the 20th century.141  If there was no lag between research effort and productivity improvements, perhaps this theory implies we should have seen a more noticeable slowdown in frontier GDP/capita growth as a result. However, some lag is realistic, and there does seem to have been such a growth slowdown since 2000. In addition, numerous factors may have offset slowing population growth: increases in the fraction of people doing R&D, more countries on the economic frontier (and so a higher fraction of scientists doing R&D pushing forward that frontier), increased job access for women and people of color (reduced misallocation), increased educational attainment, and possibly random fluctuations in the economic returns to R&D (e.g. the IT boom).

Growth accounting exercises suggest that these other factors are significant. Fernald and Jones (2014) suggest that the growing fraction of people doing R&D accounts for 58% of the growth since 1950, and education improvements account for 20%. Hsieh et al. (2013) estimates that improvement in talent allocation can account for more than 20% of income increases since 1950.

Given these other factors, the juxtaposition of slowing population growth and steady income growth during the 20th century is only weak evidence against semi-endogenous growth theories. (Indeed, high quality empirical evidence on growth theories is very hard to come by.)

Overall, it seems that semi-endogenous growth theory does a good job of explaining the general pattern of 20th century growth and that it’s hard to adjudicate beyond this point due to the effects of numerous other important factors.142

What does semi-endogenous growth theory imply about 21st century growth? The UN population projections – which have a fairly good track record – imply that population growth will slow significantly over the 21st century. In addition, the historical growth of the fraction of people doing R&D cannot be maintained indefinitely, as it is bounded below 1. Both these trends, the slowing of population growth and the slowing growth of the fraction of researchers, imply that the growth of the number of researchers will slow. When you plug this into semi-endogenous growth theory, it predicts that the GDP/capita growth rate will also slow.

Where does this prediction come from? Semi-endogenous models imply each % increase in GDP/capita requires more research than the last. If the number of researchers is constant, each % increase in GDP/capita will take longer to achieve and growth will slow. If the number of researchers does grow, but at an ever slower rate, the model still predicts that GDP/capita growth will slow.
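
The sketch below (my own illustration, with made-up parameter values and an illustrative UN-style slowdown in population growth) integrates the knowledge production function Ȧ = δL^λA^ϕ while the growth rate of the research population falls from 1% per year towards zero; the growth rate of A falls with it.

```python
import numpy as np

def semi_endogenous_growth(phi=-2.0, lam=1.0, delta=0.02, years=160, dt=0.1):
    """Euler-integrate A' = delta * L**lam * A**phi while the (researcher)
    population L grows at a rate that declines linearly from 1% p.a. to 0.
    Returns the proportional growth rate of A, recorded once per year."""
    A, L = np.float64(1.0), np.float64(1.0)
    growth_by_year = []
    steps_per_year = int(1 / dt)
    for step in range(int(years / dt)):
        t = step * dt
        gL = 0.01 * max(0.0, 1 - t / years)      # slowing population growth
        gA = delta * L ** lam * A ** (phi - 1)   # proportional growth of A
        if step % steps_per_year == 0:
            growth_by_year.append(gA)
        A += gA * A * dt
        L += gL * L * dt
    return growth_by_year

g = semi_endogenous_growth()
print(f"growth of A: year 0 = {g[0]:.2%}, year 80 = {g[80]:.2%}, year 159 = {g[-1]:.2%}")
```

Growth in A (and so in GDP/capita) declines throughout the run: with φ < 1, each further % increase in A requires more research effort, and an ever-more-slowly-growing research population cannot keep up.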

Jones draws just this implication himself in Jones (2020); Fernald and Jones (2014) discuss how slowing growth in educational achievement and the fraction of workers doing R&D, as well as population, might slow future GDP/capita growth. Kruse-Andersen (2017) projects growth out to 2100 with a semi-endogenous model and predicts average GDP/capita growth of 0.45%, without even taking into account slowing population growth.

So J / K / S theories offer plausible explanations of 20th century exponential growth and ultimately suggest that 21st century growth will be sub-exponential.143

#### 7.2.2.3.1 Additional knife-edges in J / K / S models?

J / K / S models make use of the ‘knife-edge’ claim that the number of researchers has grown exponentially. I argued that this is not problematic for explaining the past as the empirical evidence shows that the assumption is approximately true.

But it could be argued that the power-law structure of J / K / S models is an additional knife-edge. Consider the knowledge production function:

$$\dot A=δ{L_A}^{λ}A^ϕ$$

The model assumes that φ is constant over time. If φ rose as A increased, then exponential growth in researchers would lead to super-exponential growth. If φ fell as A increased, then exponential growth in researchers would lead to sub-exponential growth.

To explain sustained exponential growth, J / K / S must assume that φ is constant over time, or at least asymptotes towards some value.

In my mind, this knife-edge is considerably less problematic than those of other models considered.

Firstly, a small deviation from the assumption does not cause growth to tend to 0 or infinity. If φ changes slightly over time, the rate of exponential growth will vary but it will not tend to 0 or infinity. For this to happen, φ would have to increase enough to exceed 1 (growth then tends to infinity) or decrease without bound (growth then tends to 0). But both these trajectories for φ are extreme, and so there is a vast region of possibilities where growth remains positive but bounded. In other words, a less idealized model might claim that φ varies over time but typically stays within some region (e.g. -3 < φ < 1). This broad assumption avoids extreme growth outcomes.
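
As a rough check on this claim (a minimal sketch with arbitrary parameter values and an arbitrary random walk for φ, all assumptions of my own), the simulation below lets φ wander inside (−3, 1) while researchers grow exponentially; growth in A fluctuates but stays positive and bounded rather than collapsing to 0 or exploding.

```python
import numpy as np

rng = np.random.default_rng(0)

def drifting_phi_growth(years=500, dt=0.1, delta=0.02, gL=0.01):
    """Integrate A' = delta * L * A**phi while phi follows a bounded random
    walk inside (-3, 0.9) and L grows exponentially at rate gL.
    Returns the proportional growth rate of A, recorded once per year."""
    A, L, phi = np.float64(1.0), np.float64(1.0), -1.0
    rates = []
    steps_per_year = int(1 / dt)
    for step in range(int(years / dt)):
        if step % steps_per_year == 0:
            rates.append(delta * L * A ** (phi - 1))
            phi = float(np.clip(phi + rng.normal(0, 0.05), -3.0, 0.9))
        A += delta * L * A ** phi * dt
        L += gL * L * dt
    return rates

r = drifting_phi_growth()
print(f"growth of A stays positive and bounded: min = {min(r):.2%}, max = {max(r):.2%}")
```

For growth to explode or die, φ would have to climb above 1 or fall without bound, which the bounded walk rules out; this is the sense in which the J / K / S ‘knife-edge’ is forgiving.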

Secondly, all the endogenous models considered in this section use some sort of power-law structure like the J / K / S model, and so all are guilty of a ‘knife-edge’ assumption equivalent to assuming that φ is constant over time. However, the other models in this section additionally assume that φ takes a particular value. And I’ve argued that the particular value chosen is without good justification, and that changing it ever so slightly would cause growth to go to 0 or infinity.

#### 7.3 An economic sub-literature claims constant exponential growth is a knife-edge condition in a wide class of growth models

Growiec (2007) proves144 that:

Steady-state growth… necessarily requires some knife-edge condition which is not satisfied by typical parameter values. Hence, balanced growth paths are fragile and sensitive to the smallest disturbances in parameter values. Adding higher order differential/difference equations to a model does not change the knife-edge character of steady-state growth.

It generalizes the proof of Christiaans (2004), which applies to a more restricted setting.

My own view is that these proofs suggest that knife-edge problems are generic and hard to avoid, but do not establish that the knife-edge conditions of all models are problematic. Growiec has agreed with me on this point in private discussions, and in fact helped me understand why.

The reason is that not all knife-edges are problematic. Here are a few examples:

• It’s plausible that there are constant returns to labor, capital, and land taken together, holding technology constant. This is supported by a thought experiment: double the number of factories, the equipment inside them, and the workers in them; this should double output, as you can make twice as much of each item. If this were the knife-edge required for exponential growth, it would be less problematic than the knife-edges considered above (which, roughly speaking, require constant returns to capital and technology holding labor constant).
• Galor and Weil (2000) use a negative feedback loop to explain exponential growth. The more people there are, the more R&D effort there is and the faster the economy grows. In addition, when growth is faster people have fewer kids, instead focusing on education. This leads to the following dynamic: higher growth → lower fertility → lower growth. And conversely: lower growth → higher fertility → higher growth. This negative feedback loop stabilizes growth. It doesn’t involve any problematic knife-edge conditions, even though the theory satisfies the axioms of Growiec (2007). I don’t find this particular story convincing, as I trust the UN forecast that fertility will indeed fall over the century. Nonetheless, it is an existence proof of a theory without a problematic knife-edge condition.
• There may be an alternative framework in which the ‘knife-edge’ case occurs for a thick set of parameter values. Indeed I discuss an attempt to do this for Y / P / AH / DT models in the next section, though I know of no other explicit attempts to do this.
• The knife-edge may not be problematic at all if it involves the introduction of a completely new, unwarranted term to the equations.145 Some of the knife-edges discussed above involved introducing a new exponent φ that was implicitly set to 1 in the original model. How problematic the knife-edge is depends on whether the new class of theories introduced is a natural extension of the original. In other words, are other values of φ plausible, or is φ = 1 a privileged case that we can expect to hold exactly? I argued above, on a case by case basis, that other values are plausible. But this is a matter of judgement; more of an art than a science.146

#### 7.4 Might market dynamics eliminate the need for a knife-edge condition?

In the ambitious 2020 paper Robust Endogenous Growth, Peretto outlines a fully endogenous growth model that (he claims) achieves constant growth in equilibrium without knife-edge conditions. I consider the paper to be a significant technical contribution, and a very impressive attempt to meet the knife-edge challenge. However, I doubt that it is ultimately successful.

The mechanism for achieving stable growth is somewhat complex – indeed the model as a whole is extremely complex (though well-explained). Very briefly, the economy is split into N firms, and the average quality of technology at a firm is denoted by Z. N increases when individuals decide to invest in creating new firms, Z increases when individuals decide to invest in improving their firm’s technological level. These decisions are all made to maximize individual profit.

There are increasing returns to investment in Z. This means that if N were held fixed and a constant share of output were invested in increasing Z then growth would explode (going to infinity in finite time). In this sense, the system has explosive potential.

However, this explosive potential is curbed by the creation of new firms. Once new firms are created, subsequent investment in Z is diluted, spread out over a greater number of firms, and Z grows more slowly. Creating new firms raises output in the short-term but actually reduces the growth of the economy in the long run.147

There are diminishing returns to N, so the creation of new firms does not lead to explosive growth. We can think of the diminishing returns to N as ‘soaking up’ the excess produced by the increasing returns to Z.

I believe that if the growth rate of N was slightly faster or slower then long-run growth would diverge (either be explosive or tend to 0). If so, there should be a robust explanation for why N grows at exactly the rate that it does.

So the key question from the perspective of the knife-edge critique is:

Why does N grow just fast enough to curb the explosive growth potential of Z, but not fast enough to make long-run growth sub-exponential (tending to 0 in the long run)?

Despite studying the paper fairly closely, and it being well explained, I don’t have a fully satisfactory answer to this question. I discuss my best answer in this appendix.

Does the model fulfill its promise of avoiding knife-edge conditions? A recent review article answers with an emphatic ‘yes’, and I couldn’t see any papers disputing this result. However, the paper was only published in 2020, so there has not been much time for scrutiny. Although there seem to be no knife-edge conditions in the production function, it is possible that they are located elsewhere, e.g. in the equations governing firms’ profits. Indeed, in private correspondence Growiec has indicated that he believes there must be a knife-edge condition somewhere that Peretto does not explicitly discuss and may not even be aware of.

My own guess is that a knife-edge is present in the expression for the fixed cost a firm must pay to produce goods. This fixed cost is assumed to be proportional to Z. I believe that if it were instead proportional to Z^φ with φ ≠ 1, then growth would either tend to infinity or to 0. If so, φ = 1 would be a knife-edge condition. Indeed, Peretto confirmed in private correspondence that if the fixed cost were proportional to Z^0.9, the model would not produce exponential growth, and he thought the same was likely true if it were proportional to Z^1.1. Growiec also thought this seemed like a plausible candidate for such a knife-edge condition. However, no-one has worked through the maths to confirm this hypothesis with a high degree of confidence. Further, this ‘knife-edge’ may not be problematic: φ = 1 may be the only assumption that prevents fixed costs from tending to 0% or 100% of the total costs of production.

Putting the knife-edge issue aside, the model seems to have two implausible implications:

1. Problem 1. Though the model avoids knife-edge conditions, it has a perplexing implication. In particular, like all Schumpeterian growth models,148 it implies that if no new products were introduced – e.g. because this was made illegal – and we invested a constant fraction of output in improving technology then there would be explosive growth and output would approach infinity in finite time. This means that there is a huge market failure: private incentives to create new companies massively reduce long-run social welfare.
2. Problem 2. In addition, it is not clear that market fragmentation happens as much as the model implies. A small number of organizations have large market shares of industries like mass media, pharmaceuticals, meat packing, search engines, chip production, AI research, and social networks.149 Indeed, in some areas market concentration has been increasing,150 and market concentration is one of the stylized facts of the digital era.151

Overall, this impressive paper seems to offer a fully endogenous growth model in which constant growth is not knife-edged. Though I doubt it is ultimately successful, it does identify a mechanism (individual incentives) which can cause an apparent knife-edge to hold in practice. The paper slightly raises my expectation that long-run growth is exponential.

#### 7.5 Conclusion

It seems that many, and perhaps all, endogenous growth models display constant exponential growth only under problematic knife-edge conditions that we have little reason to suppose hold exactly. The main exception is the semi-endogenous J / K / S models, but these imply that 21st century growth will be sub-exponential, given the projected slowing of population growth.

There are a few important takeaways from the perspective of this report:

• Theoretical considerations, combined with the empirical prediction that population growth will slow, imply that 21st century growth will not be exponential, but rather sub-exponential.
• The semi-endogenous models that I argue give better explanations for 20th century growth also imply that full automation of goods and knowledge production would lead to explosive growth. In particular, when you add to these models the assumption that capital can substitute for labor, they predict explosive growth. (See the endogenous models discussed here.)
• It’s surprisingly hard to find a robust theoretical explanation of the empirical trend of exponential growth that implies it will continue until 2100. This suggests that exponential growth may be transitory, rather than a steady state. This in turn should raise our probability that future growth is sub- or super-exponential.

There are three caveats to these conclusions.

Firstly, a very recent endogenous growth model seems to allow for constant growth that does not depend on knife-edge conditions. Although I’m not convinced by the model, it highlights possible mechanisms that could justify a seemingly problematic knife-edge condition in practice.

Secondly, I have not done a review of all growth models. Perhaps an existing endogenous growth model avoids problematic knife-edge conditions and delivers exponential growth. I would be surprised if this is the case as there is a sub-literature on this topic that I’ve read many papers from (linked during this section), and they don’t mention any such model. For example, this review article on knife-edge problems doesn’t mention any such model, and argues that only Peretto’s 2020 paper solves the knife-edge problem.

Thirdly, perhaps there is a mechanism producing exponential growth that growth theorists aren’t aware of. The process of economic growth is extremely complex, and it’s hard to develop and test growth theories. If there is such a mechanism, it may well continue to produce exponential growth until 2100.

Based on these caveats, I still assign ~25% probability to ~2% exponential growth in frontier GDP/capita continuing until 2100, even if there’s sub-exponential growth in population.

## 8. Appendix C: Conditions for super-exponential growth

This section lays out the equations for various growth models, and the conditions under which super-exponential growth occurs. I don’t give derivations or explain the results. Its purpose is to support some key claims made in the main report.

There are two high-level sections, each of which supports a key claim in the main report:

• Long-run explosive models:
• Key claim: Long-run explosive models assume that capital, labor and technology are all accumulable. Even if they include a fixed factor like land, there are increasing returns to accumulable inputs. This leads to super-exponential growth unless the diminishing returns to technology R&D are very steep. For a wide range of plausible parameter values, these models predict super-exponential growth.
• Standard growth models adjusted to study the effects of AI:
• Key claim: The basic story is: capital substitutes more effectively for labor → capital becomes more important → larger returns to accumulable inputs → faster growth. In essence, the feedback loop ‘more output → more capital → more output → …’ becomes more powerful and drives faster growth.

I also use this section to evidence my claims about the scenario in which AI substitutes perfectly for human labor (the AI robot scenario):

Indeed, plugging this [AI robot] scenario into a range of growth models, you find that super-exponential growth occurs for plausible parameter values, driven by the increased returns to accumulable inputs.

This third claim is evidenced at the bottom of both the high-level sections, here and here.

Lastly, I use this section to evidence one further claim:

This suggests that the demographic transition, not diminishing returns, explains the end of super-exponential growth.

I evidence this final claim here.

#### 8.1 Long-run explosive models

Long-run explosive models are endogenous growth models fit to long-run GWP that predict explosive growth will occur in a few decades.

In the main report I claim:

Long-run explosive models assume that capital, labor and technology are all accumulable. Even if they include a fixed factor like land, there are increasing returns to accumulable inputs. This leads to super-exponential growth unless the diminishing returns to technology R&D are very steep. For a wide range of plausible parameter values, these models predict super-exponential growth.

I support these claims by analysing some long-run explosive models.

#### 8.1.1 Roodman (2020)

I analyze a simplified version of the model.152

The equations for the model:

$$Y=AK^αL^βW^{1−α−β}$$

$$\dot K={s_K}Y−{δ_K}K$$

$$\dot L={s_L}Y−{δ_L}L$$

$$\dot A={s_A}A^{ϕA}Y−{δ_A}A$$

A is technology, K is capital, L is labor; all three of these inputs are accumulable. W is the constant stock of land (fixed factor), φA controls the diminishing return to technology R&D, and δi controls the depreciation of the inputs.153

There are increasing returns to accumulable inputs. If you double A, K and L then Y more than doubles. (In Cobb Douglas models like this, there are increasing returns to some inputs just when the sum of the exponents of those inputs exceeds 1. In this case 1 + α + β > 1.)

A sufficient condition for super-exponential growth (deduced here) is:154

$$α+β> \frac {−ϕA}{1−ϕA}$$

This inequality reflects the claim ‘there’s super-exponential growth if the increasing returns to accumulable factors [α + β] is strong enough to overcome diminishing returns to technological R&D’.

If α + β = 0.9 (the fixed factor has exponent 0.1) then the condition on φA is φA > -9. Even the cautionary data of Bloom et al. (2020) suggests φA = -3.155 So there is super-exponential growth for a wide range of plausible parameter values.
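
A small helper (my own, using the parameter values from the text) makes the condition easy to check:

```python
def roodman_superexponential(alpha_plus_beta, phi_A):
    """Sufficient condition for super-exponential growth in the simplified
    Roodman model: alpha + beta > -phi_A / (1 - phi_A)."""
    threshold = -phi_A / (1 - phi_A)
    return alpha_plus_beta > threshold, threshold

print(roodman_superexponential(0.9, -3))    # (True, 0.75): super-exponential
print(roodman_superexponential(0.9, -10))   # (False, ~0.91): returns too steep
```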

#### 8.1.2 Kremer (1993)

I analyze the version of the model in Section 2:156

$$Y=Ap^{α}W^{1−α}$$

$$\dot A=δA^{ϕ}p^{λ}$$

A is technology, p is population, W is the fixed factor land. δ is constant, φ and λ control the diminishing return to technology R&D.

Kremer assumes GDP/capita is fixed at some Malthusian level ȳ:

$$p= \frac {Y}{ \bar y}$$

So larger Y → larger p: population is accumulable. Further, larger Y → larger p → larger Ȧ: technology is also accumulable. There are increasing returns to accumulable factors: 1 + α > 1.

A sufficient condition for super-exponential growth (deduced here):

$$α> \frac {−λ}{1−ϕ}+1$$

Again it depends on whether increasing returns to accumulable factors can overcome diminishing returns to technology R&D.

Bloom et al. (2020) derive φ = -2.1, on the assumption that λ = 1. This estimate of φ is conservative compared to others. The condition then reduces to α > 2/3. This is plausible given that 1 − α is the exponent on the fixed factor land.157 (To look at it another way, if we added capital to the model – Y = Ap^αK^βW^(1−α−β) – the condition would become something like α + β > 2/3.)
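
The Kremer condition (and the Jones (2001) condition below, which has the same form with an extra σ) can be checked the same way; the α values here are hypothetical, chosen only to illustrate the threshold:

```python
def kremer_jones_superexponential(alpha, phi, lam=1.0, sigma=1.0):
    """Sufficient condition alpha > -lam*sigma/(1 - phi) + 1.
    Kremer (1993) is the case sigma = 1; Jones (2001) allows sigma != 1."""
    threshold = -lam * sigma / (1 - phi) + 1
    return alpha > threshold, threshold

# With Bloom et al.'s phi = -2.1 the threshold is ~0.68, i.e. roughly 2/3.
print(kremer_jones_superexponential(alpha=0.8, phi=-2.1))   # (True, ~0.68)
print(kremer_jones_superexponential(alpha=0.5, phi=-2.1))   # (False, ~0.68)
```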

#### 8.1.3 Lee (1988)

$$Y=Ap^{α}W^{1−α}$$

$$\frac {\dot A}{A}=δlog(p), A_0 \, given$$

$$\frac {\dot p}{p}=[log ( \frac {Y}{p})−log(\bar y)]×constant, p_0 \, given$$

Constants have the same meanings as in Kremer (1993). Both population and technology are accumulable, and there are increasing returns to them in combination (1 + α > 1). The system grows super-exponentially.158 There is no parameter describing diminishing returns to R&D efforts, so there is no inequality condition.

#### 8.1.4 Jones (2001)

$$Y=A^{σ}{L_Y}^{α}W^{1−α}$$

$$\dot A=δA^ϕ{L_A}^λ$$

LY is the amount of labor spent on producing output, LA is the amount of labor spent on research.159 Other symbols are as in Kremer (1993).

Changes in total labor L depend on GDP/capita, Y/L. The exact relationship is complex, but the proportional growth rate L̇/L is an upside-down-U-shaped function of income Y/L. (Initially L̇/L increases with income, then it decreases.) In the initial period, L is bottlenecked by output: higher Y → higher income → higher L̇/L.

A is also accumulable: higher Y → higher L → higher Ȧ.

The system cannot be solved analytically, but the system grows super-exponentially if the following condition holds (as explained here):160

$$α> \frac {−λσ}{1−ϕ}+1$$

This is very similar to the condition in Kremer. Again, we have super-exponential growth as long as the increasing returns to accumulable factors (α, σ) are sufficiently powerful to overcome the diminishing returns to R&D.

Bloom et al. (2020) derive φ = -2.1, on the assumption that λ = 1 and σ = 1. The condition then reduces to α > 2/3. This is plausible given that 1 – α is the exponent on the fixed factor land.

#### 8.1.5 How does the case of perfect substitution (‘AI robots’) relate to these models?

AI is naturally thought of as a form of capital, and most of the above models do not contain capital. However, I suggest above that we can also think of AI as making the labor accumulable (the ‘AI robot’ scenario). With this assumption, all the above models predict super-exponential growth under a range of plausible parameter values.

#### 8.1.6 Can diminishing returns to innovative effort explain the end of super-exponential growth?

Perhaps the diminishing returns to innovative effort have become steeper over time. Jones (2001) estimates φ = 0.5 from population and GDP/capita data covering the last 10,000 years.161 Bloom et al. (2020) estimate φ = -2 from 20th century data on US R&D efforts and TFP growth. Could increasingly steep diminishing returns to innovative effort explain the end of super-exponential growth?

Summary

The models considered above suggest that the answer is ‘no’. When labor is accumulable, they predict super-exponential growth even with the conservative estimate of φ from Bloom et al. (2020). By contrast, when labor is not accumulable (it grows exponentially regardless of output) they predict exponential growth for a wide range of φ values. In other words, changing φ from 0.5 to -2 doesn’t change whether growth is super-exponential; for any φ in this range (and indeed a larger range), growth is super-exponential exactly when labor is accumulable.

In these models, the key factor determining whether growth is super-exponential is not the value of φ, but whether labor is accumulable. While diminishing returns to innovative effort may be part of the story, it does not seem to be the key factor.

Analysis

We’ve seen above that when labor is accumulable, these models comfortably predict super-exponential growth even with the conservative estimate of φ = -2 from Bloom et al. (2020); they also predict super-exponential growth for higher values of φ. Growth is super-exponential under a wide range of values for φ.

By contrast, when labor is not accumulable, but instead grows exponentially regardless of output, these models predict exponential growth for a wide range of φ values.

• Jones (2001) and Kremer (1993) Part 3 make exactly this assumption. They specify fertility dynamics leading to exponential population growth, and GDP/capita growth is exponential as long as φ < 1. Growth is exponential for a wide range of φ.
• We can also see this in the case of Roodman (2020). When labor grows exogenously, there’s exponential growth if:
$$α< \frac {−ϕ_A}{1−ϕ_A}$$ where α is the exponent on capital. The capital share suggests α = 0.4. This implies there’s exponential growth as long as φA < -0.67. (This threshold is much higher than the estimate φA = -3 derived from Bloom et al. (2020) data.) Again, for a wide range of φA values, growth is exponential when labor isn’t accumulable. You can get a similar result for the endogenous growth model inspired by Aghion et al. (2017) discussed below.

• Roodman (2020) estimates φA = 0.2 based on data going back to 10,000 BCE. This implies super-exponential growth, even with exogenous labor.
• However, the absence of super-exponential growth over the last 120 years seems like strong evidence against such high values of φA being accurate in the modern regime. Indeed, if you restrict the data set to start in 1000 AD, Roodman’s methodology implies φA = -1.3. With this value we again predict exponential growth when labor is exogenous.
• It is possible Roodman’s estimate unintentionally includes the effect of one-off changes like improved institutions for R&D and business innovation, rather than just estimating the diminishing returns to R&D.

#### 8.2 Standard growth models adjusted to study the effects of AI

This section looks at standard growth models adjusted to study the possible growth effects of AI. These models treat AI as a form of capital. Some have their roots in the automation literature.

In the main report I claim that in many such models:

The basic story is: capital substitutes more effectively for labor → capital becomes more important → larger returns to accumulable inputs → faster growth. In essence, the feedback loop ‘more output → more capital → more output → …’ becomes more powerful and drives faster growth.

Here I look at a series of models. First I consider endogenous growth models, then exogenous ones, then a task-based model. Within each class, I consider a few different possible models.

#### 8.2.1 Endogenous growth models

Explosive growth with partial automation, Cobb-Douglas

First consider a Cobb-Douglas model where both goods production and knowledge production are produced by a mixture of capital and labor:

$$Y=A^ηK^α{L_Y}^{γ}W^{1−α−γ}$$

$$\dot A =A^{ϕ}K^{β}{L_A}^λW^{1−β−λ}$$

$$\dot K=sY−δK$$

A is technology and K is capital – both of these factors are accumulable. LA and LY are the human labor assigned to goods and knowledge production respectively – they are either constant or growing exponentially (it doesn’t affect the result either way). W is a fixed factor that can be interpreted as land or natural resources (e.g. a constant annual supply of energy from the sun).

The model is from Aghion et al. (2017), but I have added the fixed factor of land to make the model more conservative.

It is essentially a simple extension of the standard semi-endogenous model from Jones (1995), recognizing the roles of capital and natural resources as well as labor.

There is super-exponential growth, with growth rising without bound, if:

$$\frac {ηβ}{1−α}>1−ϕ$$

(This claim is proved in this technical appendix.)

Intuitively, the condition holds if the increasing returns to accumulable factors (represented by α, β , η) are stronger than the diminishing returns to technology R&D (represented by 1 – φ).

How far is this condition from being satisfied? Bloom et al. (2020) estimates φ = -2 on the assumption that η = 1 (which can be seen as a choice about the definition of A).162 This estimate of φ is more conservative than other estimates. The condition becomes:

$$\frac {β}{1−α}>3$$

Recent data puts the capital share at 40%, suggesting α = β = 0.4:

$$\frac {0.4}{0.6}>3$$

The condition is not satisfied. It would be satisfied, however, if the capital share rose above 0.75 in both goods and knowledge production.163 At current trends, this is unlikely to happen in the next couple of decades,164 but could happen by the end of the century. (This condition can be thought of as an empirical test for whether explosive growth is near, like those discussed in Nordhaus (2021). It lowers my probability that TAI will happen in the next 20 years, but not far beyond that.)
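
The 0.75 threshold follows directly from the condition above: writing the common capital share as s (so α = β = s),

$$\frac {s}{1−s}>3 \;\Leftrightarrow\; s>3(1−s) \;\Leftrightarrow\; s>\frac {3}{4}$$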

(Note: Arrow (1962) is another Cobb-Douglas endogenous growth model which implies advanced AI can drive explosive growth – see here.)

Explosive growth with full automation, CES production function.

Cobb-Douglas models assume that the elasticity of substitution = 1. Constant Elasticity of Substitution (CES) production functions provide a more general setting in which the elasticity of substitution can take on any value. The expression K^α L^(1−α) is replaced by:

$$F_σ(K,L)=(αK^ρ+(1−α)L^ρ)^{\frac {1}{ρ}}, \quad \text{with } ρ=\frac {σ−1}{σ}$$

We can use this to generalize the above model as follows:

$$Y=A^ηF_{σY}(K,L)^αW^{1−α}$$

$$\dot A=A^ϕF_{σA}(K,L)^βW^{1−β}$$

$$\dot K=sY−δK$$

where σY and σA are the elasticities of substitution between capital and labor in goods and knowledge production respectively. When σY = σA = 1, this reduces to the Cobb-Douglas system above. α and β now represent the returns to doubling both labor and capital. It is standard to assume α = β = 1 but I continue to include W so that the model is conservative.

(Again this model is a generalization of the endogenous growth model in Aghion et al. (2017). A similar model is analyzed very carefully in Trammell and Korinek (2021) Section 5.2.)

In this setting, σY and σA are the crucial determinants of whether there is explosive growth. The tipping point is when these parameters rise above 1. This has an intuitive explanation. When σ < 1, Fσ(K, L) is bottlenecked by its smallest argument. If L is held fixed, there is limit to how large Fσ(K, L) can be, no matter how large K becomes. But when σ > 1, there is no such bottleneck: capital accumulation alone can cause Fσ(K, L) to rise without limit, even with fixed L.
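
To see the bottleneck concretely: holding L fixed and letting K grow, the CES aggregator converges to a finite limit when ρ < 0 (i.e. σ < 1), since K^ρ → 0:

$$\lim_{K\to\infty}(αK^ρ+(1−α)L^ρ)^{1/ρ}=(1−α)^{1/ρ}L$$

When ρ > 0 (i.e. σ > 1), the same expression grows without bound as K grows.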

The conditions for sustained super-exponential growth depend on whether σY and σA are above or below 1. I discuss four possibilities.

When σY < 1, σA < 1, there is not super-exponential growth unless φ > 1, as shown here.

When σY > 1, σA < 1 the condition is φ > 1 or α ≥ 1, as shown here. In other words, if there are constant returns to labor and capital in combination, a standard assumption, then increasing σY above 1 leads to super-exponential growth. (Note: even if α < 1, there may be an increase in growth when σY rises above 1. I discuss this dynamic more in the task-based model below.)

When σY < 1, σA > 1, a sufficient condition is (as deduced here):

$$ηβ>1−ϕ$$

Super-exponential growth occurs if increasing returns are sufficient to overpower diminishing returns to technology R&D. Aghion et al. (2017) analyze the standard case where η = 1 and β = 1. The condition becomes:

$$ϕ>0$$

(I discuss the related ‘search limits’ objection to explosive growth in a previous section.)

When σY > 1, σA > 1, a sufficient condition is (as deduced here):

$$\frac {ηβ}{1−α}>1−ϕ$$

Remember, α and β now represent the returns to doubling both labor and capital, so values close to 1 are reasonable. Let’s take α = β = 0.9. Bloom et al. (2020) estimate φ = -2 on the assumption that η = 1; let’s use these values. The condition is satisfied:

$$9>3$$
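
To see where these numbers come from, spell out the substitution with η = 1, α = β = 0.9 and φ = -2:

$$\frac {ηβ}{1−α}=\frac {1\times 0.9}{1−0.9}=9 \;>\; 1−ϕ=1−(−2)=3$$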

(The latter two conditions can be derived from the Cobb-Douglas condition using the following substitutions:

$$F_{σ<1}(K,L)→L$$

$$F_{σ>1}(K,L)→K$$

These substitutions can also be used to derive super-exponential growth conditions when σA = 1, σY ≠ 1, or when σA ≠ 1, σY = 1.)

The takeaway is that if AI increases the substitutability between labor and capital in either goods or knowledge production, this could lead to super-exponential growth. Reasonable parameter values suggest that raising substitutability in both would lead to super-exponential growth, but raising it in just one may not be sufficient.

Trammell and Korinek (2021) Section 5.1. discusses an endogenous ‘learning by doing’ model where a similar mechanism can lead to super-exponential growth.

#### 8.2.2 Exogenous growth models

No fixed factor

Nordhaus (2021) considers the following model:

$$Y=F_σ(AK,L)$$

$$\dot K=sY−δK$$

$$A=A_0e^{g_At}$$

The key differences with the endogenous growth model considered above are:

• No ideas production function: technology is exogenous.
• No fixed factor W in the goods production. We add this later.
• Technology only augments capital. This doesn’t affect the result.

If σ > 1 then the capital share rises to unity and the model approximates the following:

$$Y=AK$$

$$\dot K=sAK−δK$$

$$A=A_0e^{g_At}$$

(This approximation, as well as the case σ < 1, is discussed in detail in this technical appendix.)

Now the growth rate of capital itself grows exponentially:

$$g_K=sA−δ=sA_0e^{g_At}−δ≃sA_0e^{g_At}$$

The growth rate of output follows suit:

$$g_Y=g_K+g_A=sA_0e^{g_At}−δ+g_A≃sA_0e^{g_At}$$

Growth is super-exponential. (Note: although growth increases without bound, output does not go to infinity in finite time.) Again the pattern of explanation is: capital becomes more substitutable with labor → capital becomes more important → growth increases.
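
A tiny simulation illustrates the acceleration. The parameter values below are arbitrary placeholders chosen only to show the mechanism, not estimates of anything:

```python
# Simulate the approximate model Y = AK with exogenous, exponentially growing A.
# All parameter values are illustrative placeholders, not calibrated estimates.
s, delta = 0.25, 0.05    # investment and depreciation rates
A, g_A = 1.0, 0.02       # initial technology level and its exponential growth rate
K = 1.0
prev_Y = A * K           # last year's output

for year in range(1, 101):
    K = K + s * prev_Y - delta * K   # capital accumulation: K' = K + sY - dK
    A = A * (1 + g_A)                # exogenous technological progress
    Y = A * K                        # output in the sigma > 1 limit (Y = AK)
    if year % 20 == 0:
        print(f"year {year}: annual output growth = {Y / prev_Y - 1:.1%}")
    prev_Y = Y
```

The printed growth rate itself rises over time, mirroring the algebra above.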

Even if technological progress halts altogether, growth is still:

$$g_Y=g_K=sA_f−δ$$

where Af is the final level of technology. This growth could be very fast.

How robust is this result to our initial assumptions?

• We would have the same result if the model had been Y = AFσ(K, L) rather than Y = Fσ(AK, L). If technology had instead been labor-augmenting, Y = Fσ(K, AL), we would not have unbounded growth.165
• You get the same result in the human-capital accumulation model of Lucas (1988) – see here.
• The result really depends on constant returns to K and L, combined with some form of capital augmenting technological progress.
• The next section relaxes the assumption of constant returns to K and L.

With a fixed factor

Let’s consider a more conservative case, where there are diminishing returns to labor and capital in combination due to some fixed factor and where full automation doesn’t occur. This model is inspired by Hanson (2001):166

$$Y=(AK)^αL^βW^{1−α−β}$$

The equations for A and K are as above. We assume L is constant.

The steady state growth rate is (proof here):

$$g_Y= \frac {αg_A}{1−α}$$
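
The intuition: with L and W fixed, output growth is g_Y = α(g_A + g_K), and on the steady-state path capital grows at the same rate as output (g_K = g_Y), so

$$g_Y=α(g_A+g_Y)\;\Rightarrow\;g_Y=\frac {αg_A}{1−α}$$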

If there is an increase in the capital share due to AI, growth will increase.

Suppose AI increases the capital share from α to α + fβ. (In a task-based model this corresponds to automating fraction f of tasks.) Production becomes:

$$Y=(AK)^{α+fβ}L^{(1−f)β}W^{1−α−β}$$

Growth increases to (proof here):

$$g_Y= \frac {(α+fβ)g_A}{1−α−fβ}$$

Again, the basic story is that the importance of (accumulable) capital increases, and growth increases as a result.167

If α + β is close to 1, and f = 1 (full automation) the new growth rate could be very high. If α + β = 0.9 then:

$$g_Y=9g_A$$
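
To get a feel for how fast growth rises with the automated fraction f, here is a small numerical sketch of the formula above. The split α = 0.4, β = 0.5 and the baseline g_A = 2% are illustrative assumptions, not estimates:

```python
# Growth rate implied by the fixed-factor model for various automated fractions f.
# alpha + beta = 0.9 as in the example in the text; g_A is a placeholder value.
alpha, beta, g_A = 0.4, 0.5, 0.02

for f in [0.0, 0.5, 0.9, 1.0]:
    g_Y = (alpha + f * beta) * g_A / (1 - alpha - f * beta)
    print(f"f = {f:.1f}: g_Y = {g_Y:.1%}")
```

With these placeholder numbers, f = 1 gives g_Y = 18%, i.e. 9 × g_A, matching the expression above.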

Hanson uses a more realistic model of AI automation. He separates out standard capital from computer capital, and assumes the productivity of computer capital doubles every two years, in line with Moore’s law. He finds that fully automating labor with computer capital can cause growth to rise from 4.3% a year to 45%.

Trammell and Korinek (2021) Section 3.4 discusses other exogenous growth models where a similar mechanism causes growth to increase.

#### 8.2.3 A task-based model

So far all the models have treated the economy as a homogenous mass, and talked about how well AI substitutes for human labor in general. Really though, there are many distinct tasks in the economy, and AI might substitute better in some tasks than others. Aghion et al. (2017) develops a model along these lines. In the model tasks are gross complements. Technically, this means that the elasticity of substitution between tasks is below one. Intuitively, it means that each task is essential: total output is bottlenecked by the task we perform least well.

I will not describe the mathematics of the model (interested readers can read the paper), but rather its implications for growth.

Firstly, it no longer makes sense to talk about the substitutability of capital and labor in general. Rather the substitutability varies between tasks. This is sensible.

Secondly, we can permanently increase growth in the model by automating a constant fraction of non-automated tasks each year. Automating a task requires the elasticity of substitution to exceed 1 for that task. Presumably we are already automating some tasks each year and this is contributing to growth. But if advanced AI unleashes a process by which the rate of task automation itself increases, this would increase growth.168 The quicker the pace of automation, the higher the growth rate. If we automate an increasing fraction of tasks each year we can maintain super-exponential growth.169 However, this prediction assumes we can seamlessly reallocate human labor to the remaining tasks. If this isn’t possible (which seems likely!), then the actual boost to growth would be lower than that predicted by the model.

This path to higher growth is consistent with the basic story discussed in the main report: AI increases the substitutability of capital → capital is increasingly important (it performs an increasingly large fraction of tasks) → super-exponential growth.

Thirdly, if some fixed set of essential tasks remain unautomated, they will eventually bottleneck growth. Growth will fall back down to the background growth rate (that doesn’t depend on automation). I discuss whether this undermines the prospect of explosive growth here.

#### 8.2.4 How does the case of perfect substitution (‘AI robots’) relate to these models?

The case of perfect substitution corresponds to σ = ∞. So it corresponds to σ > 1 in the CES models. In the Cobb-Douglas models it corresponds to the share of capital rising to what was previously the joint share of capital and labor. This case leads to faster growth in all the above models with plausible parameter values, and to super-exponential growth in all the models except the conservative exogenous model.

#### 8.3 What level of AI would be sufficient for explosive growth?

Given all of the above growth models, what’s our best guess about the level of AI that would likely be sufficient for explosive growth? Here I ignore the possibility that growth is bottlenecked by a factor ignored in these models, e.g. regulation. A better statement of the question is: if any level of AI would drive explosive growth, what level would be sufficient?

Answering this question inevitably involves a large amount of speculation. I will list the main possibilities suggested by the models above, and comment on how plausible I find each. It goes without saying that these predictions are all highly speculative; they may be ‘the best we have to go on’ but they’re not very ‘good’ in an absolute sense.

Here are three main answers to the question: ‘What level of AI is sufficient for explosive growth?’:

1. AI that allows us to pass a ‘tipping point’ in the capital share. The Cobb-Douglas models typically suggest that as the capital share in goods production and knowledge production rises, growth will be exponential until a ‘tipping point’ is passed. (We imagine holding the diminishing returns to R&D fixed.) After this point, growth is super-exponential and there will be explosive growth within a few decades.

I put limited weight on this view as the ‘tipping points’ are not reproduced in the CES setting, which generalizes Cobb-Douglas. Nonetheless, Cobb-Douglas provides a fairly accurate description of the last 100 years of growth and shouldn’t be dismissed.

2. AI that raises the elasticity of substitution σ between capital and labor above 1. When σ < 1 there is a limit to how large output can be, no matter how much capital is accumulated. Intuitively, in this regime capital is only useful when there’s labor to combine it with. But when σ > 1, capital accumulation alone can cause output to rise without limit, even with a fixed labor supply. Intuitively, in this regime capital doesn’t have to be combined with labor to be useful (although labor may still be very helpful). When this condition is satisfied in goods or knowledge production, explosive growth is plausible. When it’s satisfied in both, explosive growth looks likely to happen.

I put more weight on this view. However, these models have their limits. They assume that the degree of substitutability between labor and capital is homogenous across the economy, rather than depending on the task being performed.

3. AI that allows us to automate tasks very quickly. (This could either be because an AI system itself replaces humans in many tasks, or because the AI quickly finds ways to automate un-automated tasks.) In the task-based model of Aghion et al. (2017), automating a task provides a temporary boost to growth (a ‘level effect’). If we automate a constant fraction of un-automated tasks each year, this provides a constant boost to growth. If we automate a large enough fraction of non-automated tasks sufficiently quickly, growth could be boosted all the way to 30%.170 A special case of this story is of course full-automation.

I put the most weight on this third view. Nonetheless, it has some drawbacks.

• It doesn’t address the process by which tasks are automated and how this might feed back into the growth process.
• It doesn’t seem to be well-positioned to consider the possible introduction of novel tasks. In their model, introducing a new task can only ever decrease output.
• Like any model, it makes unrealistic assumptions. Most striking is the assumption that human workers are seamlessly reallocated from automated tasks to un-automated tasks. Friction in this process could slow down growth if we haven’t achieved full automation.
• It emphasizes the possibility of growth being bottlenecked by tasks that are hard to automate but essential. But it may be possible to restructure workflows to remove tasks that cannot be automated. This should reduce the weight we place on the model.

One common theme that I’m inclined to accept is that explosive growth would not require perfectly substitutable AI. Some weaker condition is likely sufficient if explosive growth is possible at all. Overall, my view is that explosive growth would require AI that substantially accelerates the automation of a very wide range of tasks in production, R&D, and the implementation of new technologies.

## 9. Appendix D: Ignorance story

According to the ignorance story, we’re simply not in a position to know what growth will look like over the long-term. Both the standard story (predicting roughly exponential growth) and the explosive growth stories are suspect, and we shouldn’t be confident in either. Rather we should place some weight in both, and also some weight in the possibility that the pattern of long-run growth will be different to the predictions of either story.

The ignorance story is primarily motivated by distrusting the standard story and the explosive growth story. This leaves us in a position where we don’t have a good explanation for the historical pattern of growth. We don’t know why growth has increased so much over the last 10,000 years, so we don’t know if growth will increase again. And we don’t know why frontier per-capita growth has been exponential for the last 150 years, so we don’t know how long this trend will continue for.

We shouldn’t confidently expect explosive growth – this would require us to trust the explosive growth story. But nor can we confidently rule it out – we’d either have to rule out sufficient AI progress happening by the end of the century, or rule out all of the growth models that predict explosive growth under the assumption that capital substitutes for labor. I discuss some of these here, and Trammell and Korinek (2021) discusses many more.

#### 9.1 The step-change story of growth

This report focuses on the possibility that GWP grew super-exponentially from 10,000 BCE to 1950, with some random fluctuations. The increasing returns mechanism, important in some other prominent theories of long-run growth, provides a plausible explanation for historical increases in growth.

However, the pre-modern GWP data is poor quality and it is possible that GWP followed a different trajectory. More precisely, GWP may have grown at a slow exponential rate from 10,000 BCE to 1500, and then there may have been a one-off transition to a faster rate of exponential growth. If this transition is allowed to last many centuries, from 1500 to 1900, this ‘step change’ story is consistent with the data.

Let a ‘step-change’ model be any that doesn’t use the mechanism of increasing returns to explain very long-run growth, but instead focuses on a one-off structural transition around the industrial revolution.171

Step-change models are typically complex, using many parameters to describe the different regimes and the transition between them. This isn’t necessarily a drawback: perhaps we should not expect economic history to be simple. Further, the step-change model is more consistent with the academic consensus that the industrial revolution was a pivotal period, breaking from previous trends.

#### 9.2 The step-change story of growth lends itself to the ignorance story

What should you think about explosive growth, if you accept the step-change story?

Hanson (2000) is an example of the step-change story. Hanson models historical GWP as a sequence of exponential growth modes. The Neolithic revolution in 10,000 BCE was the first step-change, increasing growth from hunter-gatherer levels to agricultural society levels. Then the industrial revolution in 1700, the second step-change, increased growth from agricultural levels to modern levels. (In some of Hanson’s models, there are two step-changes around the industrial revolution.)

If we were in the final growth mode, Hanson’s model would predict a constant rate of exponential growth going forward.172 However, Hanson uses the pattern of past step-changes to make predictions about the next one. He tentatively predicts that the next step-change will occur by 2100 and lead to GWP doubling every two weeks or less (growth of ≫ 100%).173

But we should not be confident in our ability to predict the timing of future step-changes in growth from past examples. Plausibly there is no pattern in such structural breaks, and it seems unlikely any pattern could be discerned from the limited examples we have seen. Someone embracing Hanson’s view of long-run GWP should see his predictions about future step-changes as highly uncertain. They may be correct, but may not be. In other words, they should accept the ignorance story of long-run GWP.

Could you accept the step-change theory and rule out explosive growth? You would need to believe that no more step changes will occur, despite some having occurred in the past. What could justify having confidence in this view? A natural answer is ‘I just cannot see how there could be another significant increase in growth’. However, this answer has two problems. Firstly, it may not be possible to anticipate what the step-changes will be before they happen. People in 1600 may not have been able to imagine the industrial processes that allowed growth to increase so significantly, but they’d have been wrong to rule out step-changes on this basis. Secondly, mechanisms for a faster growth regime have been suggested. Hanson (2016) describes a digital economy that doubles every month and various economic models suggest that significant automation could lead to super-exponential growth (more).

#### 9.3 How ignorant are we?

I think the ignorance story is a reasonable view, and put some weight on it.

Ultimately though, I put more weight on a specific view of long-run growth. This is the view offered by models of very long-run growth like Jones (2001): increasing returns (to accumulable inputs) led to super-exponential growth of population and technology from ancient times until about 1900. Then, as a result of the demographic transition, population grew exponentially, driving exponential growth of technology and GDP/capita.

Of course, this view omits many details and specific factors affecting growth. But I think it highlights some crucial dynamics driving long-run growth.

This view implies that 21st century growth will be sub-exponential by default: population growth is expected to fall, and so GDP/capita growth should also fall. However, if we develop AI that is highly substitutable with labor, then models of this sort suggest that increasing returns to accumulable inputs will once again lead to super-exponential growth (more).

## 10. Appendix E: Standard story

This is not one story, but a collection of the methods used by contemporary economists to make long-run projections of GWP, along with the justifications for these methodologies.

In this section I:

• Briefly describe three methods that economists use to project GWP, with a focus on why they judge explosive growth to be highly unlikely (here).
• Show a probability distribution over future GWP that, from my very brief survey, is representative of the views of contemporary economists (here).
• Summarize the strengths and potential limitations of this collection of methods (here).

Note: this section focuses solely on the papers I found projecting GWP out to 2100. It does not cover the endogenous growth literature which contains various explanations of the recent period of exponential growth. I discuss these explanations in Appendix B.

#### 10.1 Methods used to project GWP

I have only done an extremely brief review of the literature on long-term GWP extrapolations. I have come across three methods for extrapolating GWP:

1. Low frequency forecasts – use econometric methods to extrapolate trends in GDP per capita, usually starting 1900 or later (more).
2. Growth models – calculate future growth from projected inputs of labor, capital and total factor productivity (more).
3. Expert elicitation – experts report their subjective probabilities of various levels of growth (more).

I’m primarily concerned with what these views say about the prospect of explosive growth. In summary, all three methods assign very low probabilities to explosive growth by 2100. My understanding is that the primary reason for this is that they use relatively modern data, typically from after 1900, and this data shows no evidence of accelerating growth – during this time the rate of frontier GDP per capita growth has remained remarkably constant (source, graphs).

#### 10.1.1 Low-frequency forecasts

#### 10.1.1.1 How does it work?

Low-frequency forecasting174 is an econometric method designed to filter out short-horizon fluctuations caused by things like business cycles and pick up on longer-term trends.

I’ve seen two applications of low-frequency forecasting to project GWP until 2100175. The first176 simply takes a single data series, historical GWP per capita since 1900, and projects it forward in time. The second177 fits a more complex model to multiple data series, the historical GDP per capita of various countries. It can model complex relationships between these series, for example the tendency for certain groups of countries to cluster together and for low-income countries to approach frontier countries over time. Both models essentially project low-frequency trends in GDP per capita forward in time, without much reference to inside-view considerations.178

Econometric models of this kind have the benefit of providing explicit probability distributions. E.g. these projections of US and Chinese GDP/capita from Muller (2019).

#### 10.1.1.2 Relation to the possibility of explosive growth

The structure of the model leads it to assign very low probabilities to the growth rate increasing significantly. So it assigns very low probabilities to explosive growth. In particular, the model assumes that the long-run growth rate oscillates around some constant.

More precisely, the models I’ve studied assume that per capita GWP growth179 is given by:

$$g_t=μ+u_t$$

μ is a constant and ut is a (possibly random) component whose expected long-run average is 0. gt either follows a random walk centered on μ180, or oscillates around μ deterministically. Either way, μ is the long-run average growth rate. Growth in successive periods is correlated and can differ from μ for some time, but in the long run average growth will definitely tend towards μ. These models assume that the long-run growth rate is constant; they assume that long-run growth is exponential.

The only way that these models represent the possibility of explosive growth is through the hypothesis that the long-run growth rate μ is very large but, by a large coincidence, the random component ut has always canceled this out and caused us to observe low growth. The resultant probability of explosive growth is extremely small. In both the papers the estimate of average GWP growth until 2100 was about 2% with a standard deviation of 1%. Explosive growth would be > 25 standard deviations from the mean!
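
Making the arithmetic explicit:

$$\frac {30\%−2\%}{1\%}=28 \;\text{standard deviations}$$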

Models with this structure essentially rule out the possibility of an increasing growth rate a priori.181 This could be a valid modeling decision given that post-1900 GWP data, and certainly the frontier GDP data, shows no pattern of increasing per capita growth, and it is in general reasonable for a model’s assumptions to foreclose possibilities that have no support in the data. The problem, as we shall discuss later, is that pre-1900 data does show a pattern of super-exponential growth. Either way, it is fair to say that the low-frequency models are not designed to assess the probability of explosive growth, but rather to model the probability of hypotheses that are plausible given post-1900 data.

Could we use the low-frequency methodology to get a more accurate idea of the probability of explosive growth? It should in principle be possible to fit a low-frequency model that, like Roodman’s, contains a parameter that controls whether long-run growth is sub- or super-exponential.182 The possibility of explosive growth would then be represented by our uncertainty over the value of this parameter (as in Roodman’s model). I suspect that this model, trained on post-1900 data, would conclude that growth was very probably sub-exponential, but assign some small probability to it being slightly super-exponential. Explosive growth would eventually follow if growth were super-exponential. So I suspect that this methodology would conclude that the probability of explosive growth was small, but not as small as in the low-frequency models I have seen.

#### 10.1.2 Growth models

#### 10.1.2.1 How do they work?

Growth models describe how inputs like labor, capital and total factor productivity (TFP) combine together to make output (GDP). They also describe how these inputs change over time.

Here I’ll just describe how an extremely simple growth model could be used to generate GWP projections. Then I’ll list some ways in which it could be made more realistic.

Output Y in a year is given by the following Cobb-Douglas equation:

$$Y=AK^αL^β$$

where

• A is TFP.
• K is the capital, a measure of all the equipment, buildings and other assets.
• L is labor, a measure of the person-hours worked during the timestep.
• α and β give the degree of diminishing returns to capital and labor; it’s often assumed that α + β = 1, meaning that doubling the number of workers, buildings and equipment would double the amount of output.

The inputs change over time as follows:

• A grows at a constant exponential rate – the average rate observed in the post-1900 data.
• L in each year is given by UN projections of population growth.
• The change in K between successive years is ΔK = sY – δK, where s is the constant rate of capital investment and δ is the constant rate of capital depreciation. The value of K in year n can thus be calculated from the values of K and Y in year n – 1.

You generate GWP projections as follows:

• Identify Y with GWP.
• Get starting values of Y, A, K and L from data.
• Project A and L for future years as described above.
• Project K and Y for future years as follows:
1. Predict next year’s K using the current values of K and Y.
2. Predict next year’s Y using your projections for A, K, and L next year. Now you have K and Y for next year.
3. Repeat the above two steps for later and later years.
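
As a rough illustration of this procedure, here is a minimal sketch of the projection loop in code. Every parameter and starting value below is an illustrative placeholder rather than a calibrated figure, and a real exercise would project each country separately and feed in actual UN population projections:

```python
# Minimal sketch of the simple Cobb-Douglas projection described above.
# All parameter and starting values are illustrative placeholders.
alpha, beta = 0.4, 0.6        # output elasticities (alpha + beta = 1)
s, delta = 0.25, 0.05         # capital investment and depreciation rates
g_A = 0.01                    # assumed constant TFP growth
pop_growth = 0.005            # stand-in for UN population projections

A, K, L = 1.0, 300.0, 50.0    # illustrative starting values
Y = A * K**alpha * L**beta    # starting output ("GWP")
start_Y = Y

for year in range(80):                # project ~80 years ahead
    K = K + s * Y - delta * K         # capital: K' = K + sY - dK
    A = A * (1 + g_A)                 # exogenous TFP growth
    L = L * (1 + pop_growth)          # projected labor force
    Y = A * K**alpha * L**beta        # Cobb-Douglas production

print(f"Projected GWP multiple over 80 years: {Y / start_Y:.1f}x")
```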

The above model is very basic; there are many ways of making it more sophisticated. Perhaps the most common is to project each country’s growth separately and model catch-up effects.183 You could also use a different production function from Cobb-Douglas, introduce additional input factors like human capital and natural resources184, use sophisticated theory and econometrics to inform the values for the factors185 and constants186 at each timestep, control for outlier events like the financial crisis, and model additional factors like changing exchange rates. These choices can significantly affect the predictions, and may embody significant disagreements between economists. Nonetheless, many long-run extrapolations of GWP that I’ve seen use a growth model that is, at its core, similar to my simple example.

My impression is that, of the methods discussed here, growth models are regarded as the most respected. They can incorporate wide-ranging relevant data sources and theoretical insights.

One down-side of these models is that the ones I’ve seen only provide point estimates of GWP in each year, not probability distributions. Uncertainty is typically represented by considering multiple scenarios with different input assumptions, and looking at how the projections differ between the scenarios. For example, scenarios might differ about the rate at which the TFP of lower-income countries approaches the global frontier.187 The point estimates from such models typically find that average per capita GWP growth will be in the range 1 – 3%.

#### 10.1.2.2 Relation to the possibility of explosive growth

Most of the long-run growth models I’ve seen set frontier TFP exogenously, stipulating that it grows at a constant rate similar to its recent historical average.188 While individual countries can temporarily grow somewhat faster than this due to catch-up growth, the long-run GDP growth of all countries is capped by this exogenous frontier TFP growth (source).

The structure of most of these models, in particular their assumption of constant frontier TFP growth, rules out explosive growth a priori.189 This is supported by the relative constancy of frontier TFP growth since 1900, but is undermined by earlier data points.

A few models do allow TFP to vary in principle, but still do not predict explosive growth because they only use post-1900 data. For example, Foure (2012) allows frontier TFP growth to depend on the amount of tertiary education and finds only moderate and bounded increases of TFP growth with tertiary education.190

The more fundamental reason these models don’t predict explosive growth is not their structure but their exclusive use of post-1900 data, which shows remarkable constancy in growth in frontier countries. This data typically motivates a choice of model that rules out explosive growth and ensures that more flexible models won’t predict explosive growth either.

#### 10.1.3 Expert elicitation

#### 10.1.3.1 How does it work?

GWP forecasts are made by a collection of experts and then aggregated. These experts can draw upon the formal methods discussed above and also incorporate further sources of information and the possibility of trend-breaking events. This seems particularly appropriate to the present study, as explosive growth would break trends going back to 1900.

I focus exclusively on Christensen (2018), the most systematic application of this methodology to long-run GWP forecasts I have seen.

In this study, experts were chosen by ‘a process of nomination by a panel of peers’ and the resultant experts varied in both ‘field and methodological orientation’.191 Experts gave their median and other percentile estimates (10th, 25th, 50th, 75th, 90th percentiles) of the average annual per-capita growth of GWP until 2100. For each percentile, the trimmed mean192 was calculated and then these means were used as the corresponding percentile of the aggregated distribution.

As well as providing aggregated quantile estimates, Christensen (2018) fits these estimates to a normal distribution. The mean per capita growth rate is 2.06% with a standard deviation of 1.12%. This provides a full probability distribution over GWP per capita for each year.

#### 10.1.3.2 Relation to the possibility of explosive growth

If any expert believed there was > 10% chance of explosive growth, this would have shown up on the survey results in their 90th percentile estimate. However, Figure 7 of their appendix shows that no expert’s 90th percentile exceeds 6%. Strictly speaking, this is compatible with the possibility that some experts think there is a ~9% probability of explosive growth this century, but practically speaking this seems unlikely. The experts’ quantiles, both individually and in aggregate, were a good fit for a normal distribution (see Figure 7) which would assign ≪ 1% probability to explosive growth.

Nonetheless, there are some reasons to think that extremely high and extremely low growth are somewhat more likely than these surveys suggest:

• There is a large literature on biases in probabilistic reasoning in expert judgement. It suggests that people’s 10 – 90% confidence intervals are typically much too narrow, containing the true value much less than 80% of the time. Further, people tend to anchor their uncertainty estimates to an initial point estimate. These effects are especially pronounced for highly uncertain questions. The survey tried to adjust for these effects193, but the same literature suggests that these biases are very hard to eliminate.
• The experts self-reported their level of expertise as 6 out of 10, where 5 indicates having studied the topic but not being an expert and 10 indicates being a leading expert.194 The authors ‘take this as suggestive that experts do not express a high level of confidence in their ability to forecast long-run growth outcomes’. It also seems to suggest that there is no clear body of experts who specialize in answering this question and have thought deeply about it. This increases the chance that there are legitimate ways of approaching the problem that the experts have not fully considered.

#### 10.2 Probability distribution over GWP

I want an all-things-considered probability distribution over GWP that is representative of the different views and methodologies of the standard story. This is so I can compare it with distributions from the other big-picture stories, and (at a later time) compare it to the economic growth that we think would result from TAI. If you’re not interested in this, skip to the next section.

I’ve decided to use the probability distribution constructed from above-discussed expert elicitation in Christensen (2018). It has a mean of 2.06% and a standard deviation of 1.12%. I chose it for a few reasons:

• The experts can use the results of the other two methods I’ve discussed (econometric modeling and growth models) to inform their projections.
• Experts can take into account the possibility of trend-breaking events and other factors that are hard to incorporate into a formal model.
• The experts in Christensen (2018) were selected to represent a wide-range of fields and methodologies.
• The central aim of Christensen’s paper was to get accurate estimates of our uncertainty, and its methodology and survey structure was designed to achieve this goal.
• The expert elicitation distribution is consistent with point estimates from growth models.195 This is important because I believe these growth models incorporate the most data and theoretical insight and are consequently held in the highest regard.

One possible drawback of this choice is that the distribution may overestimate uncertainty about future growth, assigning more probability to growth > 3% than is representative:

• The 90th percentile of the distribution is higher than any point estimates I’ve seen.196
• The 10 – 90th percentile range is wider than the equivalent range from econometric methods.
• This may be because the expert elicitation methodology can incorporate more sources of uncertainty than the other models.

The expert elicitation probability distribution is over GWP per capita. To get a distribution over GWP I used the UN’s median population projections (which have been accurate to date).197

#### 10.3 Strengths and limitations

Advocates of the standard story use a range of statistical techniques and theoretical models to extrapolate GWP that are able to incorporate wide-ranging relevant data sources. If we were confident that the 21st century would resemble the 20th, these methods would plausibly be adequate for forecasting GWP until 2100.

However, I do believe that the methodologies of the standard story are ill-equipped to estimate the probability of a regime-change leading to explosive growth. This is due to a few features:

• The papers I’ve seen exclusively use post-1900 data, and often only post-1950 data.198 While reasonable for short-term growth forecasts, this becomes more questionable when you forecast over longer horizons. The post-1900 data is silent on the question of whether 21st century growth will follow a similar pattern to 20th century growth and of what it might look like if it does not.
• Its models typically foreclose the possibility of explosive growth by assuming that the long-run frontier growth rate is constant. This assumption is supported by the post-1900 data but not, we shall see, by endogenous growth theory or by data sets that go back further in time. As a result of this assumption, its models do not assess the probability that 21st century GWP growth is super-exponential, a critical question when assessing the plausibility of explosive growth.
• An important caveat is that expert elicitation does seem well placed to anticipate a regime-change, but experts assign < 10% to explosive growth and probably < 1%. I find this the most compelling evidence against explosive growth from the standard story. It is hard to fully assess the strength of this evidence without knowing the reasons for experts’ projections. If they have relied heavily on the other methods I’ve discussed, their projections will suffer from drawbacks discussed in the last two bullet points.

These limitations are not particularly surprising. The methods I’ve surveyed in this section were originally developed for the purposes of making forecasts over a few decades, and we saw above that even the most expert people in this area do not consider themselves to have deep expertise.

## 11. Appendix F: Significant probability of explosive growth by 2100 seems robust to modeling serial correlation and discounting early data points

The model in Roodman (2020) assigns 10% probability to explosive growth happening by 2033, 50% by 2044, and 90% by 2063. However, there are reasons to think that Roodman’s model may predict explosive growth too soon, and its confidence intervals may be too narrow.

An appendix discusses two such reasons:

• The growth rates in nearby periods are correlated, but Roodman’s model implies that they are independent.
• Recent data is more relevant to predicting 21st century growth than ancient data points, but Roodman’s model doesn’t take this into account.

(Note: there are other reasons to think explosive growth will happen later than Roodman predicts. In particular, population is no longer accumulable, where accumulable means more output → more people. This section does not adjust Roodman’s model for this objection, but only for the two reasons listed.)

How much would accounting for these two factors change the predictions of Roodman’s model? Would they delay explosive growth by a decade, a century, or even longer? To get a rough sense of the quantitative size of these adjustments, I built a simple model for projecting GWP forward in time. I call it the growth multiplier model. (At other places in the report I call it the ‘growth differences’ model.)

The growth multiplier model retains some key features of Roodman’s univariate endogenous growth model. In particular, it retains the property of Roodman’s model that leads it to predict sub- or super-exponential growth, depending on the data it is fit to. The justification for these features is the same as that for Roodman’s model: long-run GDP data displays super-exponential growth and endogenous growth models predict such growth.

At the same time, the growth multiplier model aims to address some of the drawbacks of Roodman’s model. Most significantly, it incorporates serial correlation between growth at nearby periods into its core. In addition, the user can flexibly specify how much extra weight to give to more recent data points. The model also incorporates randomness in a simple and transparent way. The cost of these advantages is that the model is considerably less theoretically principled than the endogenous growth models.

With my preferred parameters, the model assigns a 50% chance of explosive growth by 2093 and a 70% chance by 2200.199 There is still a 10% chance of explosive growth by 2036, but also a 15% chance that explosion never happens200. While I don’t take these precise numbers seriously at all, I do find the general lesson instructive: when we adjust for serial correlation and the increased relevance of more recent data points we find that i) the median date by which we expect explosion is delayed by several decades, ii) there’s a non-negligible chance that explosive growth will not have occurred within the next century, and iii) there is a non-negligible chance that explosive growth will occur by 2050. In my sensitivity analysis, I find that these three results are resilient to wide-ranging inputs.

The rest of this section explains how the model works (here), discusses how it represents serial correlation (here), compares its predictions to the other big-picture stories about GWP (here), does a sensitivity analysis on how its predictions change for different inputs (here), and discusses its strengths and weaknesses (here).

The code behind the growth multiplier model, Roodman’s model, and this expert survey is here. (If the link doesn’t work, the colab file can be found in this folder.)

#### 11.1 How does the growth multiplier model work?

Put simply, the model asks the question ‘How will the growth rate change by the time GWP has doubled?’, and answers it by saying ‘Let’s look at how it’s changed historically when GWP has doubled, and sample randomly from these historically observed changes’. Historically, when GWP has doubled the growth rate has increased by about 40% on average, and so the model’s median prediction is that the growth rate will increase by another 40% in the future each time GWP doubles.

The model divides time into periods and assumes that the growth rate within each period is constant. The length of each period is the time for GWP to increase by a factor r – this choice is inspired by the properties of Roodman’s univariate model.

So we divide the historical GWP data into periods of this kind and calculate the average growth rate within each period. Then we calculate the change in average growth rate between successive periods. Again inspired by Roodman’s univariate model, we measure this change as the ratio between successive growth rates: new_growth_rate / old_growth_rate.201 Call these ratios growth multipliers. The growth multiplier of a period tells you how much the average growth rate increases (or decreases) in the following period. For example, if 1800-1850 had 2% growth and 1850-1900 had 3% growth, then the growth multiplier for the period 1800-1850 would be 1.5.

Here’s an example with dummy data, in which r = 2.

To extrapolate GWP forward in time, we must calculate the growth rate g of the period starting in 2025. We do this in two steps, then compute the period’s length:

• Randomly sample a value for the previous period’s growth multiplier. In this example, gm is the growth multiplier of the period finishing in 2025. gm is randomly sampled from the list [2, 2, 1.5, 0.5].202 All items on the list need not be equally likely; we can specify a discount rate to favor the sampling of more recent growth multipliers. This discount rate crudely models the extra weight given to more recent data points.
• Multiply together the growth rate and growth multiplier from the previous period. In this example, g = 1.5 × gm.
• Calculate the duration of the next period from its growth rate. In this example, we calculate YYYY from g.203 Notice that we already know the GWP at the end of the next period (in this example \$25,600b) as we defined periods as the time taken for GWP to increase by a factor of r.

We’ve now calculated the growth rate and end date of the next period. We can repeat this process indefinitely to extrapolate GWP for further periods.
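
To make this concrete, here is a minimal sketch of the procedure in code. The historical period growth rates passed in are placeholder inputs; the actual implementation, run on the real long-run GWP series, is in the linked notebook:

```python
import math
import random

def project_growth_multipliers(period_growth_rates, r=1.6, discount=0.9,
                               start_year=2025, n_periods=50, seed=0):
    """Sketch of the growth multiplier model: sample historical growth
    multipliers (with a recency discount) and apply them forward.
    `period_growth_rates` lists average growth in successive historical
    periods, each covering a factor-r increase in GWP (placeholder input)."""
    rng = random.Random(seed)
    rates = period_growth_rates
    # Growth multiplier of each period: next period's growth / this period's growth.
    multipliers = [rates[i + 1] / rates[i] for i in range(len(rates) - 1)]
    # Older multipliers get geometrically smaller sampling weights.
    weights = [discount ** (len(multipliers) - 1 - i) for i in range(len(multipliers))]

    year, growth = start_year, rates[-1]
    trajectory = [(year, growth)]
    for _ in range(n_periods):
        gm = rng.choices(multipliers, weights=weights)[0]   # sampled multiplier
        growth *= gm                                        # new_growth = old_growth * gm
        # Period length: time for GWP to grow by a factor of r at this growth rate.
        duration = math.log(r) / math.log(1 + growth)
        year += duration
        trajectory.append((year, growth))
    return trajectory
```

Running this function many times with different seeds gives a distribution over trajectories, from which percentiles for the first year of > 30% growth can be read off.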

The two seemingly arbitrary assumptions of this model – defining each period as the time for GWP to increase by a factor of r, and calculating the next growth rate by multiplying the previous growth rate by some growth multiplier – are both justified by comparison to Roodman’s univariate model. The former assumption in particular corresponds to a core element of Roodman’s model that drives its prediction of super-exponential growth. I discuss this in greater detail in this appendix.

#### 11.2 How does the growth multiplier model represent serial correlation?

In Roodman’s model, the median predicted growth for 2020-40 is higher than the observed growth in 2000-20 for two reasons:

1. The model believes, based on historical data, that when GWP increases growth tends to increase.
2. Growth in 2000-20 was below the model’s median prediction; it treats this as a random and temporary fluctuation, uncorrelated with that of 2020-40; it expects growth to return to the median in 2020-40.

It is Factor 2 that causes the model to go astray, failing to capture the serial correlation between growth in the two periods. Factor 2 alone raises the model’s median prediction for 2019 growth to 7.1%.204

The growth multiplier model addresses this problem by predicting growth increases solely on the basis of Factor 1; Factor 2 has no role. Unlike Roodman’s model, it does not track a ‘median’ growth rate as distinct from the actual growth rate; rather, it interprets the current growth rate (whatever it is) as ‘the new normal’ and predicts future growth by adjusting this ‘new normal’ for increases in GWP (Factor 1).

As a result, the growth multiplier model builds in serial correlation between the growth in different periods. If the current growth rate is ‘surprisingly low’ (from the perspective of Roodman’s model) then this will directly affect the next period’s growth rate via the formula new_growth_rate = old_growth_rate × growth_multiplier.205 In this formula, the role of ‘× growth_multiplier’ is to adjust the growth rate for the increase in GWP (Factor 1). The role of old_growth_rate is to link the next period’s growth directly to that of the previous period, encoding serial correlation. A single period of low growth affects all subsequent periods of growth in this way. Further, this effect does not diminish over time, as the growth of period i + n is proportional to the growth of period i for all n.

There are possible models that display degrees of serial correlation intermediate between Roodman’s model and the growth multiplier model. I think such models would be more realistic than either extreme, but I have not attempted to construct one. I discuss this possibility more in this appendix. So while I regard Roodman’s predictions as overly aggressive, I regard those of the growth multiplier model as adjusting too much for serial correlation and in this sense being overly conservative. We should expect some return to the longer-run trend.

#### 11.3 What are the model’s predictions for my preferred parameters?

The following table describes the two inputs to the growth difference model and my preferred values for them:

| Input | Meaning | Preferred value | Considerations that informed my choice |
| --- | --- | --- | --- |
| r | r controls the lengths of the periods that the model divides GWP into. A smaller value for r means we look at how growth has changed over shorter periods of time, and extrapolate smaller changes into the future. Its value is fairly arbitrary; the division into discrete periods is done to make the model analytically tractable. My sensitivity analysis suggests the results are not very sensitive to the value of r – predicted dates for explosive growth change by < 10 years. | 1.6 | If r is too small, the GWP data is too coarse-grained to contain successive data points where GWP only differs by a factor of r. For example, GWP increases by a factor of 1.5 between some successive ancient data points. If r is too large, the assumption that growth is constant within each period is less plausible, and we lose information about how growth changes over shorter periods. For example, if r > 1.6 we lose the information that growth was slower from 2010-19 than from 2000 to 2010. |
| Discount rate | How much we discount older data points. A discount of 0.9 means that when GWP was half as big we discount observations by a factor of 0.9, when GWP was 1/4 the size the discount is 0.9², when it was 1/8 the size the discount is 0.9³, and so on. | 0.9 | This discount means that, compared to a 2000 observation, the 1940 observation has 73% of the weight, the 1820 observation has 53% of the weight, and the 3000 BCE observation has 23% of the weight. |
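
Unpacking the definition, the quoted weights correspond to GWP having doubled roughly 3 times since 1940, roughly 6 times since 1820, and roughly 14 times since 3000 BCE, since

$$0.9^{3}\approx 0.73,\qquad 0.9^{6}\approx 0.53,\qquad 0.9^{14}\approx 0.23$$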

With these inputs the model’s percentile estimates of the first year of explosive growth (sustained > 30% growth) are as follows:

These probabilistic GWP projections can be shown alongside those of Roodman’s model and the standard story.

See code producing this plot at the bottom of this notebook. (If the link doesn’t work, the colab file can be found in this folder.)

I believe the probabilities from the growth multiplier model are closer than Roodman’s to what it’s reasonable to believe, from an outside-view perspective, conditional on the basic ideas of the explosive growth story being correct.206

If we trust the standard story’s view that growth will continue at roughly its current level (1 – 3%) over the next decade or so, then we should decrease the probability of explosive growth by 2100 relative to these plots.

#### 11.4 Sensitivity analysis: how do the growth difference model’s predictions change for different inputs?

I investigated how changing both inputs affects the model’s projections. Full details are in this appendix, but I summarize the key takeaways in this section. For reference, Roodman’s percentile predictions about the first year of explosive growth are as follows:

| Percentile | Explosive growth date |
| --- | --- |
| 10 | 2034 |
| 30 | 2039 |
| 50 | 2043 |
| 70 | 2050 |
| 90 | 2065 |

When I used my preferred inputs, the growth multiplier model differs from Roodman’s in two ways:

• It models serial correlation. This is implicit in the model’s structure.
• It places a larger discount on older data points. This is via my choice of discount rate.

We’ll now investigate the effect of each factor in turn, including how sensitive these are to the choice of r.

#### 11.4.1 Serial correlation alone could delay explosive growth by 30-50 years

We can isolate the impact of the first factor by choosing not to discount older data points (discount rate = 1). In this case, still using r = 1.6, the percentiles of the growth multiplier model are as follows:

A further sensitivity analysis on r shows that using different values of r between 1.05 and 3 could change the median date by ± 10 years in either direction, change the 10th percentile by ± 5 years in either direction, and change the 90th percentile by ± 100s of years.

#### 11.4.2 A reasonable discount can delay explosive growth by 20 years

The following table shows information about different discount rates. It shows how severely each discount downweights older data points, and how many years it delays the median predicted date of explosive growth.

(Weights of older data points are relative to a 2000 data point, which has weight 100%.)

| Discount rate | Weight of 1940 data point | Weight of 1820 data point | Weight of 3000 BCE data point | Delay to median date of explosive growth (years), r = 1.6 | Delay to median date of explosive growth (years), r = 2 |
| --- | --- | --- | --- | --- | --- |
| 0.95 | 86% | 74% | 49% | 4 | 1 |
| 0.9 | 73% | 53% | 23% | 10 | 4 |
| 0.85 | 61% | 38% | 10% | 21 | 10 |
| 0.8 | 51% | 26% | 4% | 46 | 19 |
| 0.75 | 34% | 12% | 0.6% | 89 | 29 |
| 0.7 | 22% | 5% | 0.1% | 190 | 34 |

I consider discount rates of 0.8 or lower to be unreasonable. They place overwhelming importance on the last 50 years of data when forecasting GWP over much longer periods of time than this. For long-range forecasts like those in this report, I favor 0.9 or 0.95. For reasonable discounts, explosive growth is delayed by up to 20 years.

The effect on the 10th percentile is much smaller (< 10 years), and the effect on the 70th and 90th percentiles is much larger. See this appendix for more details.

Even with very steep discounts, long term growth is still super-exponential. The recent data, even when significantly upweighted, don’t show a strong enough trend of slowing GWP growth to overwhelm the longer term trend of super-exponential growth.

Predictions made with smaller values of r are slightly more affected by introducing a discount rate. I believe this is because, with smaller values of r, the model is fine-grained enough to detect the slowdown of GWP growth in the last ~10 years, and a discount heightens the effect of this slowdown on the predictions. See more details about the interaction between r and the discount rate in this appendix.

#### 11.5 Strengths and limitations of the growth multiplier model

The growth multiplier model is really just an adjustment to Roodman’s model. Its key strength is that it addresses limitations of Roodman’s model while keeping the core elements that drive its prediction of super-exponential growth.

Its prediction of explosive growth invites many criticisms which I address elsewhere. Beyond these, its key limitation is that its modeling choices, considered in isolation, seem arbitrary and unprincipled. They are only justified via comparison to the increasing returns of endogenous growth models. A further limitation is that its description of the evolution of GWP is both inelegant and in certain ways unrealistic.207 Lastly, a somewhat arbitrary choice must be made about the value of r, and the results are sensitive to this choice, shifting by up to a couple of decades.

## 12. Appendix G: How I decide my overall probability of explosive growth by 2100

The process involves vague concepts and difficult judgement calls; others may not find it useful for deciding their own probabilities. I do not intend for the reasoning to be water-tight, but rather a pragmatic guide to forming probabilities.

Here are my current tentative probabilities for the annual growth of GWP/capita g over the rest of this century:

• Explosive growth, g > 30%: There’s a period, lasting > 10 years and beginning before 2100, in which g > 30%: ~30%.
• Significant growth increase, 5% < g < 30%: There’s no explosive growth but there’s a period, lasting > 20 years and beginning before 2100, in which g > 5%: ~8%.
• Exponential growth, 1.5% < g < 5%: There’s no significant growth increase and average growth stays within its recent range of values: ~25%.
• Sub-exponential growth, g < 1.5%: We never have a significant growth increase, and average annual growth is near the bottom or below its recent range: ~40%.208

I’ve rounded probabilities to 1 significant figure, or to the nearest 5%, to avoid any pretence at precision. As a result, the probabilities do not add up to 100%.

Note, the specific probabilities are not at all robust. On a different day my probability of explosive growth by 2100 might be as low as 15% or as high as 60%. What is robust is that I assign non-negligible probability (>10%) to explosive growth, exponential growth, and sub-exponential growth.

The diagram below summarizes the process I used to determine my probabilities. I use the toy scenario of ‘AI robots’ discussed in the main report to help me develop my probabilities. Each AI robot can replace one human worker, and do the work more cheaply than a human worker. I use this scenario because it is concrete and easy to represent in economic models: AI robots allow capital to substitute perfectly for labour in goods production and knowledge production.

The following sections go through the diagram, explaining my decisions at each node. I recommend readers keep the diagram open in a tab to help them follow the logic. At several points, I feel I’ve been somewhat conservative about the probability of explosive growth; I indicate these as I go.

#### 12.1 Will we develop AI robots (or AIs with a similar impact on growth) in time for explosive growth to occur by 2100?

I split this into two sub-questions:

1. What level of AI is sufficient for explosive growth (assuming AI robots would drive explosive growth)?
2. Will we develop this level of AI in time for explosive growth to occur by 2100?

#### 12.1.1 What level of AI is sufficient for explosive growth (assuming AI robots would drive explosive growth)?

What’s the lowest level of AI that would be sufficient for explosive growth, assuming AI robots would be sufficient?

My view on this question is mostly informed by studying the growth models that imply AI robots would drive explosive growth. I analyze models one by one here, and draw my conclusions here. My (rough) conclusion is that ‘explosive growth would require AI that substantially accelerates the automation of a very wide range of tasks in production, R&D, and the implementation of new technologies.’ This would require very rapid progress in both disembodied AI and in robotics.

Consider a ‘virtual worker’ – AI that can do any task a top quality human worker could do working remotely (it could be one AI system, or multiple working together). I believe, for reasons not discussed in this report, that a virtual worker would probably enable us to quickly develop the level of robotics required for explosive growth.

I use a ‘virtual worker’ as my extremely rough-and-ready answer to ‘what’s the lowest level of AI that would drive explosive growth?’. Of course, it is possible that a virtual worker wouldn’t be sufficient, and also possible that a lower level of AI would be sufficient for explosive growth.

#### 12.1.2 Will we develop a ‘virtual worker’ in time for explosive growth to occur by 2100?

There are two sub-questions here.

1. By when must we develop a virtual worker for there to be explosive growth by 2100?
2. How likely are we to develop a virtual worker by this time?

I have not investigated the first sub-question in depth. In the growth models I’ve studied for this report, it seems that even in the ‘AI robot’ scenario it could take a few decades for growth to increase to 30%.209 So I provisionally treat 2080 as the answer to the first sub-question. For reasons not discussed in this report, I believe this is conservative and that developing a virtual worker would drive explosive growth within years rather than decades.

The second sub-question is then ‘How likely are we to develop a virtual worker by 2080?’.

My view on this is informed by evidence external to this report:

• Expert forecasts about when high-level machine intelligence will be developed.210
• If this were my only source of evidence, I would assign ~45% by 2080.211
• A framework by my colleague Ajeya Cotra analyzing when the computation required to develop TAI will be affordable.
• Her high-end estimate assigns ~90% probability by 2080.
• Her best-guess estimate assigns ~70% probability by 2080.
• Her low-end estimate assigns ~40% probability by 2080.
• My own report on what prior we should have about when Artificial General Intelligence is developed.212
• My high-end estimate assigns ~30% probability by 2080.
• My best-guess estimate assigns ~15% probability by 2080.
• My low-end estimate assigns ~4% probability by 2080.

Personally, I put most weight on Ajeya’s framework (0.7), and roughly equal weight on the other two sources of evidence (~0.15 each). Conditional on Ajeya’s framework, I am closer to her low-end estimate than her best guess, at around 50% probability by 2080.213 Overall, I’m currently at around ~45% that we will develop a virtual worker by 2080.214
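A quick check of the arithmetic, assuming the three sources of evidence are combined as a simple weighted average (my reading of the paragraph above, not a calculation spelled out in the text):

```python
# Weights on the three sources of evidence, and the probability of a
# 'virtual worker' by 2080 conditional on each, as described above.
sources = {
    "Ajeya Cotra's compute framework": (0.70, 0.50),
    "expert forecasts of HLMI":        (0.15, 0.45),
    "prior-based AGI timelines":       (0.15, 0.15),
}

p_virtual_worker_2080 = sum(weight * prob for weight, prob in sources.values())
print(round(p_virtual_worker_2080, 2))  # ~0.44, consistent with the ~45% figure above
```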

This explains my reasoning about the top-level node of the diagram. The next section looks at the nodes on the left-hand side of the diagram, assuming we do develop a ‘virtual worker’; the section after that looks at the right-hand side of the diagram.

#### 12.2.1 Would AI robots drive explosive growth, absent any unintended bottlenecks?

Another way to understand this question is: Do AI robots have a strong tendency to drive explosive growth?

My opinion here is influenced by the history of economic growth and the choice between different growth models:

• There are broadly speaking two classes of theories: accumulation models and idea-based models. In accumulation models, the ultimate source of growth in GDP/capita is the accumulation of physical or human capital. In idea-based models, the ultimate source of growth is targeted R&D leading to technological progress.
• Idea-based models imply that AI robots would lead to explosive growth, when you use realistic parameter values.
• These models have increasing returns to inputs as a central feature, but do not predict super-exponential growth as labour is not accumulable. With AI robots there are increasing returns to accumulable inputs which can drive super-exponential growth.
• I analyze many idea-based models in Appendix C,215 subbing in the AI robot scenario. I find that increasing returns to accumulable inputs drive super-exponential growth when you use realistic parameter values.216
• Idea-based models offer a simple and plausible account of very long-run growth, according to which increasing returns to accumulable inputs has caused growth to increase over time.
• They are compatible with the importance of one-off structural transitions occurring around the industrial revolution.
• Appendix B argues that some idea-based theories (semi-endogenous growth models) offer the best explanation of the recent period of exponential growth.
• For accumulation-based models, the link between AI and growth is less clear but it’s still plausible that AI robots would drive explosive growth conditional on these models.
• Many of these models imply that the AI robot scenario would lead to explosive growth.
• For example, the learning by doing model of Arrow (1962) (more) or the human capital accumulation model of Lucas (1988) (more).217
• It’s possible to dismiss this prediction as an unintended artifact of the model, as the primary mechanism generating sustained growth in these models (capital accumulation) has no strong intuitive link with AI. This is in contrast to idea-based models, where there is an obvious intuitive way in which human-level AI would speed up technological progress.
• Some accumulation theories don’t imply that the AI robot scenario would cause explosive growth.
• For example, see Frankel (1962) (more), or simply a CES production function with the elasticity of substitution between labour and capital greater than 1 (more).
• I suggest these models face serious problems.
• Appendix B argues that accumulation theories require problematic knife-edge conditions for exponential growth.
• Growth accounting exercises, e.g. Fernald and Jones (2014), find that TFP growth accounts for the majority of growth rather than the accumulation of physical or human capital. This gives us reason to prefer idea-based models.
• Overall, I put ~80% weight on idea-based theories.
• Exogenous growth models can be understood as expressing uncertainty about the ultimate driver of growth. Even in a conservative exogenous growth model, where a fixed factor imposes diminishing returns on labour and capital in combination, capital substituting for labour in goods production can cause a significant one-time increase in growth (although this may not be sufficient for > 30% annual growth).

So, overall, would AI robots this century drive explosive growth, assuming there are no unanticipated bottlenecks? My starting point is the 80% weight I put on idea-based models, based on their explanation of very long-run growth and the recent period of constant growth. I bump this up to 90% as various exogenous models and accumulation-based models also imply that AI robots would drive explosive growth. Lastly, I cut this back to 80% based on the possibility that we can’t trust the predictions of these models in the new regime where capital can entirely replace human labour.

Most of the 20% where AI robots don’t have a tendency to drive explosive growth corresponds to none of our theories being well suited for describing this situation, rather than to any particular alternative model.

So I put ~80% on AI robots driving explosive growth, absent unanticipated bottlenecks.

#### 12.2.2 Will there be unanticipated bottlenecks?

I have done very little research on this question. Above, I briefly listed some possible bottlenecks along with reasons to think none of them are likely to prevent explosive growth. I put ~25% on a bottleneck of this kind preventing explosive growth.

This means my pr(explosive growth | AI robots this century) = 0.8 × 0.75 = ~60%. If I had chosen this probability directly, rather than decomposing it as above, I’d have picked a higher number, more like 75%. So the ‘60%’ may be too low.

#### 12.2.3 If there is an unanticipated bottleneck, when will it apply?

This corresponds to the node ‘Does the bottleneck apply before g>5%?’.

Suppose we develop AI that has a strong tendency to drive explosive growth, but it doesn’t due to some bottleneck. How fast is the economy growing when the bottleneck kicks in?

Large countries have grown much faster than 5% before,218 suggesting the bottleneck probably kicks in when g > 5%. In addition, there’s a smaller gap between the current frontier growth (~2%) and 5% than between 5% and 30%.

On the other hand, it’s possible that the unknown bottleneck is already slowing down frontier growth, suggesting it would limit growth to below 5%.

Somewhat arbitrarily, I assign 80% to the bottleneck kicking in when g > 5%, and 20% to it kicking in when g < 5%.

#### 12.2.4 If we develop a ‘virtual worker’ but it has no tendency to drive explosive growth, will growth slow down?

This corresponds to the left-hand node ‘Will growth slow down?’.

My first pass is to fall back on the scenario where we don’t make impressive advances in AI at all (I discuss this scenario below). This implies ~65% to sub-exponential growth and ~35% to exponential growth.219 Instead, I give 50% to each, because highly advanced AI might help us to sustain exponential growth even if it has no tendency to produce explosive growth.

#### 12.3.1 Is there explosive growth anyway?

If we are skeptical of the explanations of why growth increased in the past, and why it has recently grown exponentially, we may be open to growth increasing significantly without this increase being driven by AI. Growth has increased in the past; perhaps it will increase again.

Even if we can’t imagine what could cause such an increase, this is not decisive evidence against there being some unknown cause. After all, hypothetical economists in 1600 would have been unlikely to imagine that the events surrounding the industrial revolution would increase growth so significantly. Perhaps we are just as much in the dark as they would have been.

Further, brain emulation technology could have similar effects on growth to advanced AI, allowing us to run human minds on a computer and thus making population accumulable. Perhaps radical biotechnology could also boost the stock of human capital and thus the rate of technological progress.

I currently assign 2% to this possibility, though this feels more unstable than my other probabilities. It’s low because I put quite a lot of weight on the specific growth theories which imply that past super-exponential growth was fueled by super-exponential growth in the human population (or the research population), and so wouldn’t be possible again without advanced AI or some technology that expanded the number or capability of minds in an analogous way. I’m conservatively assigning low probabilities to these other technologies. I think values as high as 5-10% could be reasonable here.

#### 12.3.2 If there isn’t explosive growth anyway, does growth slow down?

This corresponds to the right-hand node ‘Will growth slow down?’.

I put ~75% weight on semi-endogenous growth theories, which is my first-pass estimate for the probability of sub-exponential growth in this scenario.

You could try to account for further considerations. Even if semi-endogenous growth theory is correct, g could still exceed 1.5% if the fraction of people working in R&D increases fast enough, or if other factors boost growth. On the other hand, even if semi-endogenous growth theory is wrong, growth could slow for some reason other than slowing population growth (e.g. resource limitations). I assume these considerations are a wash.

I do make one more adjustment for the effect of AI. Even if we don’t develop AIs with comparable growth effects to AI robots, AI might still increase the pace of economic growth. Aghion et al. (2017) focus on scenarios in which AI automation boosts the exponential growth rate. I assign 10% to this possibility, and so give 65% to sub-exponential growth in this scenario.
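As a rough reconstruction of how the branch probabilities above combine into the headline figures, here is a short Python sketch. The tree structure follows my reading of the sections above rather than the report’s own calculation (the diagram is not reproduced here); in particular, it assumes the 2% ‘explosive growth anyway’ branch applies only when no virtual worker is developed.

```python
# Branch probabilities taken from the sections above.
p_virtual_worker     = 0.45  # 'virtual worker' developed in time (~2080)
p_tendency           = 0.80  # AI robots would tend to drive explosive growth
p_no_bottleneck      = 0.75  # no unanticipated bottleneck prevents it
p_bottleneck_above_5 = 0.80  # if a bottleneck binds, it binds only once g > 5%
p_explosive_anyway   = 0.02  # explosive growth without advanced AI

p_explosive = (p_virtual_worker * p_tendency * p_no_bottleneck
               + (1 - p_virtual_worker) * p_explosive_anyway)
p_significant_increase = (p_virtual_worker * p_tendency * (1 - p_no_bottleneck)
                          * p_bottleneck_above_5)

print(round(p_explosive, 2))            # ~0.28, rounded to ~30% in the summary
print(round(p_significant_increase, 2)) # ~0.07, in the ballpark of the ~8% in the summary
```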

## 13. Appendix H: Reviews of the report

We had numerous people with relevant expertise review earlier drafts of the report. Here we link to the reviews of those who gave us permission to do so. Note: the report has been updated significantly since some of these reviews were written.

## 14. Technical appendices

#### 14.1 Glossary

GDP

• Total stuff produced within a region, with each thing weighted by its price.

GWP

• Total amount of stuff produced in the whole world, with each thing weighted by its price.

GDP per capita

• GDP of a region divided by the region’s total population.
• So GWP/capita is GWP divided by the world population.

Frontier GDP

• GDP of developed countries on the frontier of technological development. These countries have the highest levels of technology and largest GDP/capita.
• Often operationalized as OECD countries, or just the USA.

Physical capital

• Machinery, computers, buildings, intellectual property, branding – any durable asset that helps you produce output.
• I often refer to this as merely ‘capital’.
• Doesn’t include land or natural resources.

Human capital

• Human skills, knowledge and experience, viewed in terms of their tendency to make workers more productive.

Total factor productivity (TFP) growth

• Increase in output that can’t be explained by increases in inputs like labor and capital.
• If TFP doubles, but all inputs remain the same, output doubles.
• TFP increases correspond to better ways of combining inputs to produce output, including technological progress, improvements in workflows, and any other unmeasured effects.
• In the report I often don’t distinguish between TFP growth and technological progress.

Exponential growth

• Example 1: the number of cells doubling every hour.
• Example 2: the number of people infected by Covid doubling every month.
• Example 3: GWP doubling every 20 years (as it does in some projections).
• Definition 1: when ‘doubling time’ stays constant.
• Definition 2: when a quantity increases by a constant fraction each time period.
• y_{t+1} = y_t(1 + g), where g is the constant growth rate.
• US GDP / capita has grown exponentially with g = 1.8% for the last 150 years. The doubling time is ~40 years.

Super-exponential growth

• When the growth rate of a quantity increases without bound (e.g. 1% one year, 2% the next year, 3% the next year…).
• One example would be y_{t+1} = y_t(1 + k·y_t).
• The time taken for the quantity to double falls over time.
• Examples:
• In ancient times it took 1000s of years for GWP to double, but today GWP doubles much faster. GWP doubled between 2000 and 2019.
• Some solutions to endogenous growth models imply GWP will increase super-exponentially.
• When univariate endogenous growth models are fit to historical GWP data from 10,000 BCE, they typically imply growth is super-exponential and that GWP will go to infinity in finite time.

Sub-exponential growth

• When the growth rate of a quantity decreases over time (e.g. 1% one year, 0.5% the next year, 0.2% the next year…).
• One example would be y_{t+1} = y_t(1 + k/y_t).
• Another example is simply linear growth, y_{t+1} = y_t + k. (A short numerical comparison of doubling times under the three growth regimes follows this glossary entry.)
• The time taken for the quantity to double increases over time.
• Examples:
• The world’s population has doubled since 1973, but UN projections imply it will not double again this century.
• Some solutions to endogenous growth models imply GWP will increase sub-exponentially. In these models growth ultimately plateaus.
• When univariate endogenous growth models are fit to historical GWP data from 1950, they typically imply growth is sub-exponential and that GWP will plateau.
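The difference between the three growth regimes is easiest to see in doubling times. The short Python sketch below iterates the illustrative recurrences given above, with arbitrary parameter values of my own choosing, and prints how many periods each successive doubling takes.

```python
def doubling_times(step, y0=1.0, n_doublings=5):
    """How many periods each successive doubling of y takes under a given growth rule."""
    times, y, t, target = [], y0, 0, 2 * y0
    for _ in range(n_doublings):
        start = t
        while y < target:
            y = step(y)
            t += 1
        times.append(t - start)
        target *= 2
    return times

print(doubling_times(lambda y: y * (1 + 0.02)))      # exponential: roughly constant doubling time
print(doubling_times(lambda y: y * (1 + 0.02 * y)))  # super-exponential: doublings speed up
print(doubling_times(lambda y: y * (1 + 0.02 / y)))  # sub-exponential: doublings slow down
```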

Constant returns to scale

• If the inputs to production all double, the output doubles.
• For example, suppose output is created by labor and capital. Mathematically, we write this as Y = F(L, K). Constant returns to scale means that F(2L, 2K) = 2Y.

Increasing returns to scale

• If the inputs to production all double, the output more than doubles.
• For example, suppose output is created by labor, capital and technology. Mathematically, we write this as Y = F(L, K, A). Increasing returns to scale means that F(2L, 2K, 2A) > 2Y.
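A quick numerical illustration of the distinction, using an arbitrary Cobb-Douglas production function rather than any model from the report: doubling labour and capital alone doubles output, while also doubling the technology level more than doubles it, because ideas are non-rival.

```python
def output(L, K, A=1.0, alpha=0.6):
    """Cobb-Douglas production: Y = A * L**alpha * K**(1 - alpha)."""
    return A * L**alpha * K**(1 - alpha)

Y = output(10, 10)
print(round(output(20, 20) / Y, 2))         # 2.0: constant returns to (L, K) alone
print(round(output(20, 20, A=2.0) / Y, 2))  # 4.0: increasing returns once technology A is also doubled
```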

Exogenous growth model

• Growth model where the ultimate driver of growth lies outside of the model.
• E.g. in the Solow-Swan model growth is ultimately driven by the growth of inputs that are assumed to grow exponentially. The growth of these inputs is the ultimate source of growth, but it isn’t explained by the model.
• Technological progress is not explained by exogenous growth models.

Endogenous growth model

• Growth model that explains the ultimate driver of growth.
• E.g. Jones (2001) describes dynamics governing the increase in population and of technology, and the growth of these inputs is the ultimate source of growth.
• Typically endogenous growth models explain the growth of technology.

#### 14.1.1 Classifications of growth models

I introduce some of my own terminology to describe different types of growth models.

Long-run explosive models predict explosive growth by extrapolating the super-exponential trend in very long-run growth. I argue they should only be trusted if population is accumulable (in the sense that more output → more people).

Idea-based models explain very long-run super-exponential growth by increasing returns to accumulable inputs, including non-rival technology. They include long-run explosive models and models that have a demographic transition dynamic such as Jones (2001) and Galor and Weil (2000).

Step-change models. These models of very long-run growth emphasize a structural transition occurring around the industrial revolution that increases growth. They stand in contrast to models, like long-run explosive models, that emphasize the increasing return mechanism and predict growth to increase more smoothly over hundreds and thousands of years.

Explosive growth models predict that perfect substitution between labor and capital would lead to explosive growth.

#### 14.2 Models of very long-run growth that involve increasing returns

The purpose of the literature on very long-run growth is to understand both the long period of slow growth before the industrial revolution and the subsequent take-off from stagnation and increase in growth.

I focus on two models of very long-run growth: Jones (2001) and Galor and Weil (2000). Both are characterized by increasing returns to accumulable inputs until a demographic transition occurs. Both models predict super-exponential growth before the demographic transition, and exponential growth after it.

For both models I:

• Discuss the mechanisms by which they initially produce super-exponential growth, comparing them to the mechanisms of long-run explosive models.
• Explain how these models later produce exponential growth.
• Analyze the mechanisms by which these models preclude explosive growth, and suggest that highly substitutable AI could prevent these mechanisms from applying.