Are Moonshots Giant Leaps of Faith?
Many unanswered questions remain about the value of rapid increases in federal support for specific areas of R&D, but some useful lessons are emerging.
We need only compare our standards of living with those of a few generations ago, when vaccines or air travel were not widely accessible, to obtain a sense of how much the advancement of science and technology has been a boon for society. And when we realize how much of that advancement has been sponsored by the government, we develop an intuitive support for public funding of research. Although the history of federal research is primarily characterized by incremental increases, a few waves of enthusiasm have generated large surges of support for certain projects. A critical question is whether these large commitments of public resources have generated proportionally large societal benefits. Are we better off with them? Are they even necessary?
Presidents have thrown their support behind major science projects because of their promised society-wide benefits or their perceived political advantages. The most recent is President Obama’s initiative to end cancer, which echoes President Nixon’s first war on cancer in many respects. President Bush proposed a return to the moon, and President Clinton placed a winning bet on the development of nanotechnologies and a risky one on doubling the budget of the National Institutes of Health (NIH). Indeed, budget jumps are regularly proposed by policy entrepreneurs who advocate for a leap forward in scientific knowledge, the speedy development of a promising technology, or, less frequently, the building of administrative capacity in a research agency to meet some future social challenge. Fitting all three of these aims is the most emblematic of them all: the Apollo program. It is due to the remarkable success of this program, politically as well as technologically, that we refer to this sort of policy proposal as moonshots.
If we resist the temptation to assume that more is always better—and therefore that much more is much better—what do we really know about the effects of surges in research budgets? In other words, is every moonshot a giant leap forward for mankind? This simple and obvious question has received surprisingly little attention, and I offer below some thoughts and considerations for policy analysts and policymakers who may wish to tackle it.
Budget punctuations can be appraised at three levels: societal effects, knowledge production, and impact on the research bureaucracy. I propose some evaluative criteria for each of these categories and offer some preliminary policy recommendations.
Societal effects
Do moonshots pay off for society? An answer entails three things: one, a rigorous imagination of the universe without the moonshot (counterfactuals); two, a measure of the distance between that alternative world and ours for every key aspect that we can meaningfully connect to progress or betterment for society (outcome measures); and three, a clear understanding of how much publicly funded research contributes to those measured facets of progress (causal links). Answering our question is no small challenge because even if we could produce good counterfactuals, our current outcome measures are very limited in scope, and our best speculations of causal links are highly uncertain.
A significant effort in the scholarship of innovation has been devoted to developing good indicators of the broader impact of research—that is, good outcome measures. Most of that effort has focused on the impact on science itself—such as the number of publications and patents and their dissemination, and the generality of the findings—but some attention has been given to societal effects. Among indicators of social impact, economic ones have dominated the discussion for decades, even as recent research has shown the noneconomic value of scientific and technological advancements to be quite significant. Still, aggregate measures of income and employment continue to attract most of the attention from policymakers who authorize research and development (R&D) leaps. We should ask, then, if they are adequate to assess moonshots.
The contribution of scientific and technical knowledge to those economic aggregates is usually discerned by conceptualizing technical knowledge as a factor of production—such as labor, capital equipment, or land—and estimating its impact on the economy using a model of the production function, which is a stylized representation of economic activity. Alternatively, technological knowledge is conceptualized as something like managerial skill or leadership, a feature of production that enhances the productivity of all productive factors. That approach also requires a model of the economy and, like the knowledge-as-input approach, is highly sensitive to the way we design the model. Models are useful devices to explain complex systems, but the abstractions necessary for their construction make them objects of constant scientific debate, if not dispute. In other words, our current knowledge of the economic impact of new technical knowledge is highly contested and uncertain.
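To make the contrast concrete, here is a minimal sketch of the two conceptualizations in generic textbook notation (the symbols are illustrative, not drawn from any particular study discussed here). In the knowledge-as-input view, the stock of technical knowledge enters the production function alongside capital and labor; in the alternative view, knowledge scales the productivity of all inputs:

$$Y = F(K, L, R) \qquad \text{versus} \qquad Y = A \cdot F(K, L),$$

where $Y$ is output, $K$ capital, $L$ labor, $R$ the knowledge stock treated as a factor of production, and $A$ total factor productivity. Every estimate of the economic impact of research depends on which form is chosen and on how $F$ is specified, which is precisely why the results remain contested.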
The ubiquitous indicator of job creation is equally problematic. Should we count only the jobs directly created, or also the jobs supporting or adjacent to the R&D project? What about the jobs created by the companies spun off from the project? Even if the total effect of research on job creation is traceable, the net effect would be far more difficult to estimate: how many destroyed jobs can be directly attributed to a single technology? The total net-new-jobs figure, if tenable, must then be adjusted by a factor of job quality. Old jobs and new jobs are not the same; there are wage differentials and changes in job security that must be accounted for. In the service sector, for instance, some technologies have replaced decently paid and stable clerical jobs, such as those of accountants or travel agents, with computers and armies of temps.
Figures such as the number of new drugs approved by the Food and Drug Administration are often proposed as alternative outcome measures to economic aggregates. But drug approvals are a mixed indicator of the social impact of research because the mere existence of new drugs does not necessarily advance the public interest. The drugs could be so expensive that they would be affordable to only a tiny fraction of the people who need them, or their high cost could create inflationary pressure in the whole health care system, pushing insurance premiums up and thus hurting those who can barely afford health insurance.
Spillover effects and externalities are also commonly suggested as economic outcomes of research. These are effects on actors beyond those directly involved in research. For instance, public funding of research increases the quantity and quality of national research, and this increase has the spillover effect of enhancing science education. The most important spillover of publicly funded research is when it is taken up by private-sector innovation. Spillovers are a true effect of research and by some accounts not an insignificant one—estimates of the economy-wide excess return to firms range from 15% to 40%. However, spillovers are hard to measure at the project level because of how diffused they are across the innovation system. In other words, estimating the spillovers from Apollo, the doubling of the NIH budget, or either “war on cancer” could very quickly become an intractable problem.
Compounding the problem of incomplete measures of societal outcomes from moonshot research is how little we know about the societal impact of overall R&D spending. In other words, our theorized causal links are highly speculative. The two major schools in the economics of innovation have alternative approaches to causal explanation. The neoclassical tradition uses models of economic aggregates, and most estimates of the impact of R&D on the economy build from the work of Nobel laureate Robert Solow, who used a residual measure of output to approximate the effect of technological change. Solow’s residual is the contribution of factors not included in the model or, as economic historian Philip Mirowski puts it, a measure of our ignorance of what really drives economic growth. The evolutionary economic tradition, in turn, explains innovation as an adaptive process by individual firms and industries and therefore discriminates the return of R&D investments across different economic sectors. For example, forestry and computers are not likely to have the same return from R&D.
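In growth-accounting terms, the residual is simply what is left of output growth after the measured inputs are netted out. A minimal sketch, assuming a Cobb-Douglas technology with capital share $\alpha$:

$$\frac{\dot{A}}{A} = \frac{\dot{Y}}{Y} - \alpha \frac{\dot{K}}{K} - (1 - \alpha) \frac{\dot{L}}{L}.$$

Everything the model does not capture, measurement error included, lands in the residual term, which is why it can fairly be described as a measure of our ignorance.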
Economy-wide estimates of R&D returns are, by construction, not useful to assess returns at the level of projects or federal agencies, but can industry-specific estimates be useful? Perhaps the most cited estimates by industry are from a 1977 study by Edwin Mansfield and his colleagues, which found a median social return of 56% and a median private return of 25% in a relatively small sample of industries. Those estimates are highly sensitive to model assumptions—such as how fast the competition would have produced the innovation instead of imitating it, or whether the innovations improved or displaced existing products—and Mansfield himself warned that these “results should be treated with considerable caution.” It remains an open question whether this approach could be adapted to evaluate the economic impact of moonshots.
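What is clear is how much such estimates can swing on a single assumption. Below is a toy calculation in the spirit of a Mansfield-style social return; all figures, including the R&D cost and the benefit stream, are hypothetical, so this sketches the sensitivity rather than reconstructing the 1977 methodology:

```python
# Toy sensitivity check in the spirit of Mansfield-style social-return
# estimates. All figures and the "imitation lag" are hypothetical; this
# sketches why such estimates are model-sensitive.

def npv(rate, cashflows):
    """Net present value of a cashflow list, period 0 first."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=-0.99, hi=10.0, tol=1e-6):
    """Internal rate of return by bisection (assumes a single sign change)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

RD_COST = 100.0               # up-front R&D outlay (arbitrary units)
ANNUAL_SOCIAL_BENEFIT = 40.0  # yearly surplus while the innovation is unmatched

# Key assumption: social benefits stop once competitors would have produced
# the innovation anyway -- the imitation lag drives the whole estimate.
for lag_years in (3, 5, 10):
    flows = [-RD_COST] + [ANNUAL_SOCIAL_BENEFIT] * lag_years
    print(f"imitation lag {lag_years:>2} yr -> social return {irr(flows):6.1%}")
```

Under these invented numbers, stretching the assumed imitation lag from three years to ten raises the estimated social return from under 10% to roughly 38%. Little wonder that Mansfield urged caution.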
What’s more, even if we could arrive at a consensus on a single method of estimating returns from R&D investments, as we have for quantifying economic growth, we would still face the really difficult questions: Is that productivity different for public and private R&D? Is the yield constant in time or highly sensitive to volatile political and economic variables? Does the yield vary for different time horizons in which effects compound? And perhaps the most pressing question for budget leaps: How widely does the yield vary for each possible allocation of the federal R&D portfolio?
Still another challenge to understanding the societal impact of moonshots is to find the proper method for the construction of counterfactuals. Evolutionary economists note that economy-wide returns may be misleading, but are industry-specific estimates a good starting point to assess the economic impact of large R&D projects? Are these events so unique that each must be studied separately? The original moonshot was and still is the symbolic height of US technological prowess, and during the Cold War it was a major victory over the Soviet Union. Ironically, it was not a victory of the free market; rather, it was one of central planning and government sponsorship. But that is beside the point. It was a show of strength on the international stage as well as a needed victory in domestic politics at a time when social tensions had called into question the national character. The space program was not merely a symbolic victory; it promoted significant development in the defense industry that later found application in civilian technologies of widespread use. The production of a counterfactual in this case would be a daunting task for the historian’s imagination. But even leaving this black swan aside, we cannot assume that moonshots are all homogeneous in their effects. Their industry-specific impact is only one facet of their uniqueness; they are also politically and administratively unique. The federal agencies that sponsor them serve different missions and respond to different political dynamics. Moonshots, it appears, are unique historical events rather than a class of phenomena; consequently, individual case studies are more likely to yield plausible counterfactuals for each event.
Do moonshots pay off for science?
The question of whether budget leaps lead to scientific leaps is subsidiary to the larger question about the pace and direction of scientific advancement under government sponsorship.
The effect of government funding on the pace of scientific advancement is often imagined as a production question, where a proportionality of outputs to inputs is assumed. More precisely, the question becomes how much additional output is obtained for every additional tax dollar invested in research. Given that the coin of the realm is peer-reviewed publications, counts of published papers are the most common output measure. More sophisticated measures adjust publication quantities by some factor of quality, such as forward citations. The problem is that those productivity ratios are at best suggestive of scientific advancement and at worst misleading. Consider the problem of disciplinary fads. A paper that is hyped at publication could generate great citation excitement before it reveals itself to be a scientific cul-de-sac. A parallel problem resides in the political economy of publications: editors of high-impact-factor journals give preference to eye-catching research, thus inflating the premium for sexy topics and inadvertently skewing the allocation of talent away from more pedestrian but productive research programs. Measures of quality-adjusted quantities also suffer from serious methodological limitations, such as the problem of truncation. When the density of forward citations is a measure of quality, we must truncate the number of years considered after publication. Thus, two papers of equal quality that display different maturation timelines will appear to be of unequal quality.
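The truncation problem is easy to demonstrate. In the sketch below, two hypothetical papers accrue the same 100 lifetime citations with different maturation timelines, and a fixed citation window ranks them very differently (all counts are invented for illustration):

```python
# A minimal illustration of the truncation problem. The citation counts are
# fabricated: both papers accrue 100 lifetime citations, but one matures
# quickly and the other slowly.

fast_mover = [30, 30, 20, 10, 5, 3, 2]    # citations per year after publication
slow_burner = [2, 3, 5, 10, 20, 30, 30]   # same lifetime total, later peak

WINDOW = 3  # truncation window, in years after publication

for name, per_year in [("fast mover", fast_mover), ("slow burner", slow_burner)]:
    windowed = sum(per_year[:WINDOW])
    lifetime = sum(per_year)
    print(f"{name}: {windowed}/{lifetime} citations visible in {WINDOW}-year window")

# The fast mover shows 80 of its 100 citations inside the window, the slow
# burner only 10 -- equal papers, apparently unequal quality.
```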
An elite group of bibliometricians has adopted a rather humble position in this respect under the banner of the Leiden Manifesto. The group acknowledges the limitations of indicators of science productivity, particularly when they are used to guide science management decisions, and recommends the use of quantitative indicators only in combination with other forms of expert judgment, even if those are more subjective assessments of quality. This epistemic modesty is of course a sign of wisdom in the community of quantitative analysts, but it poses a moral hazard problem because the final judgment on the productivity of a field of research is reserved to the experts of that field themselves.
The solution is to anchor the measurement of the productivity of science outside the boundaries of science itself. The measuring rod must be some instrumental use of science. The proof of the pudding is in the eating, and we indeed eat, consume, and use technology. Technology is a good measure of the advancement of knowledge because technology encapsulates cause-effect relations that matter to people other than scientists. Technology is, of course, not the only instrumental use of science. Science cultivates the habit of rational thought among students. In addition, people defer to scientific authority (and the bureaucracy that implements it) on matters as crucial to their lives as basic sanitation, the safety of food or medicines, weather forecasting, and nutrition. Decision makers in the public and private sectors also defer to scientific expertise on matters of import such as the unemployment rate, the speed of epidemic outbreaks, or estimates of natural oil and gas reserves. Although the pedagogical and cultural value of science is no less important than its partnership with technology, changes in technology are easier to measure.
Patents are better indicators of instrumental uses of research than are publications. Therefore, patenting by scientific researchers should offer an adequate first approximation of the productivity of research. A few caveats are nevertheless in order. First, only a fraction of the universe of ready-for-use research is patentable; therefore, patenting activity should be considered the lower bound of any estimate of research productivity. Second, some portion of research-based patents is spurious. This could be due to universities’ eagerness to signal productivity to their political patrons, to researchers responding to their employers’ incentives for promotion, or to industrial patenting in which firms take title to patents not for their commercial potential but as bargaining chips in litigation with their competitors. Third, even when ready-for-use research is patentable, those findings may not be patented for lack of commercial interest; effective technologies may not be marketable because of the modest purchasing power of those who would demand them. To the extent that the noise of spurious patenting can be filtered out, patenting activity could be a useful signal of the productivity of the research spurred by a moonshot.
If the level of public support is not sustained for at least a decade, the many young researchers hired in the year of the leap are likely to face a very tight labor market when they try to launch their careers.
The impact of research funding leaps is felt not only in the productivity of research but in the process of its production itself. The production of technical knowledge is labor intensive and requires well-functioning organizations to train, employ, and support that labor force. A significant part of the impact of budget leaps on research is precisely their effect on the organization of science. The training of scientists is a long process, taking in principle four years of undergraduate work and five years of graduate school; in practice, the average time spent in graduate school to earn a PhD is more than seven years, and several more years of postdoctoral training are the norm before individuals are able to conduct independent research. A budget leap for research in universities and national laboratories translates into a sudden expansion of existing research groups, which must hire more doctoral students and postdoctoral researchers. If the level of public support is not sustained for at least a decade, the many young researchers hired in the year of the leap are likely to face a very tight labor market when they try to launch their careers. This is precisely what occurred with the NIH doubling. Young researchers found themselves stuck in low-paying postdoctoral positions for many years, and the majority were not able to find tenure-track research positions.
Another hangover effect of budget leaps is the drop in the rate of grant approval and the overburdening of the peer-review system. The population of researchers grows with the leap in funding, and this larger group then has to compete for relatively flat or declining post-leap funding. In the aftermath of the NIH doubling, the success rate for grant applications fell from 30% to 12%. These effects on the economy of science pose a danger to the advancement of any discipline: a risk-averse population is likely to play it safe by keeping research proposals within conventional parameters, pursuing questions that are relatively easy to answer and avoiding the more ambitious questions where failure is more likely. Budget leaps could be self-defeating if in the long run they result in more research along well-trodden paths and less progress along new avenues of exploration.
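The arithmetic of that squeeze is stark. A minimal sketch, with an award count and applicant pools that are hypothetical and chosen only to reproduce the 30%-to-12% drop noted above:

```python
# Back-of-the-envelope mechanics of the post-leap squeeze. The award and
# applicant counts are illustrative, not NIH statistics.

awards_per_year = 9_000      # funded grants, roughly flat after the leap
applicants_before = 30_000   # pre-leap applicant pool
applicants_after = 75_000    # post-leap pool, swollen by leap-era hiring

print(f"pre-leap success rate:  {awards_per_year / applicants_before:.0%}")
print(f"post-leap success rate: {awards_per_year / applicants_after:.0%}")
```

Nothing about the quality of the science changes between the two lines; only the denominator moves.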
Administrative capacity
The struggle for the annual budget increase is more than an instrumental ritual for federal departments and agencies. Budget gains also signal the relative political power of governmental offices at any given political moment. But are large budget increases good for the health of research agencies?
One way to answer this question is by examining the long-term health of the agency budget itself. Stationary growth occurs when an agency’s budget grows, on average, no faster than total discretionary spending. We can test whether leaps forward in the budget help agencies perform better in the long run than they would have by maintaining stationary growth. If a given budget jump places the agency on a different trend line from which it can continue growing at a stationary pace, that would mean a significant gain for the agency. But what happens if the leap exhausts the political capital of an agency, and its budget freezes following the leap? This has been the experience of NIH, where the budget has been virtually stagnant in the years following the doubling that occurred from 1998 to 2003.
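The test is straightforward to sketch: fit the pre-leap budget trend on a log scale, extrapolate it as a stationary-growth counterfactual, and ask whether post-leap budgets sit above or below that path. The budget series below is synthetic (steady growth, a doubling, then a freeze); a real analysis would use deflated agency actuals:

```python
import numpy as np

# Hypothetical constant-dollar budgets (billions): ~3% stationary growth
# through 1997, a doubling over 1998-2003, then a freeze.
years = np.arange(1994, 2010)
budget = np.array([10.0, 10.3, 10.6, 10.9, 11.3, 12.8, 14.6, 16.7,
                   19.1, 21.8, 22.0, 22.0, 21.9, 21.8, 21.7, 21.6])

# Fit the stationary-growth trend on the pre-leap years only.
pre_leap = years < 1998
slope, intercept = np.polyfit(years[pre_leap], np.log(budget[pre_leap]), 1)
counterfactual = np.exp(slope * years + intercept)

for year, actual, trend in zip(years, budget, counterfactual):
    print(f"{year}: actual {actual:5.1f}  trend {trend:5.1f}  gap {actual - trend:+5.1f}")
```

On these made-up numbers, the frozen post-leap budget still sits above the old trend line for years; whether that holds for a real agency is exactly the empirical question.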
A historical analysis of the budget time series that I conducted with my colleague Ben Clark revealed that federal R&D has advanced more by gradual increments than by budget leaps, both at the total budget level and at the agency level. What is more, the budget jolts that have taken place peter out over time, and agencies return to their long-term stationary growth trend. Just as at NIH, this happened at the National Aeronautics and Space Administration (NASA) at the end of the Apollo program. The decline at NASA was even steeper, and it appears that its current budget is lower than it would have been with stationary growth. Nevertheless, it would be hard to argue in retrospect that the US government should not have undertaken the feat of placing a man on the moon, for the reasons suggested earlier.
Budget requests for capacity building in federal agencies are the least likely to succeed and ironically are the most likely to yield large social impacts, because strengthening an agency’s technical ability to deliver on its mission also strengthens the hand of its political backers.
If the effects of a budget leap dissipate in time, agencies may still be justified in pursuing them if they can transform the short-term financial gains into long-term sustainability, both technically and politically. In effect, a cash injection could be used to restructure and reenergize an agency so that it secures future political favor in the most legitimate way: by acquiring better capabilities to manage new challenges and better serve its mission. The doubling of NIH allowed it to build the infrastructure to sustain research in the new frontier of biomedicine: genetics. Those investments helped NIH in no small measure to prepare for new resource-intensive missions such as precision medicine and the new cancer initiative.
Budget requests for capacity building in federal agencies are the least likely to succeed and ironically are the most likely to yield large social impacts, because strengthening an agency’s technical ability to deliver on its mission also strengthens the hand of its political backers. They are least likely to succeed because they do not have the appeal of exciting technological or scientific breakthroughs.
Likewise, a technological moonshot is likely to have a longer life and consequently greater impact if the technology in question is a platform on which several other technological applications can be built—what economists call a “general purpose technology.” Of course, it is hard to anticipate what will be the next microchip. The sort of diversification that hedges the political bet is attained by seeking a technological class or a cluster of technologies that have a multiplicity of applications and uses. This heterogeneity within a technological project has the additional benefit of attracting the participation of additional agencies that will then share the responsibility for delivering tangible results from the budget leap. A good example of this internal diversification strategy is the National Nanotechnology Initiative. Nano is so many things that it is more accurate to refer to it in the plural, and it is this plurality that enabled multiple federal agencies to incorporate it in their research portfolio without stepping on each other’s toes. Another example of internal diversification is the Obama administration’s Clean Energy Savings for All Americans Initiative; federal funding for research is not the central piece, but it covers a range of research programs and technology development efforts that extend beyond the scope of the Department of Energy.
A technological moonshot is likely to have a longer life and consequently greater impact if the technology in question is a platform on which several other technological applications can be built.
The question about the health of the bureaucracy seems cynical when we recall that the true motivation of public investments in research is to deliver public goods, such as achieving a technological feat or significantly expanding our knowledge of nature. But we should reject the apparent cynicism because these organizations deliver public goods without which life in contemporary societies would hardly be recognizable: a well-functioning bureaucracy is a public good itself. In an ideal world, the health of the public administration is aligned with the fulfillment of its mission; however, such an aspiration is not always realized, and the bureaucracy must perform a calculus of subsistence in which its political health and sustainability carry some weight.
Modest promises are better promises
I have argued that assessing the effects of large jumps in R&D funding faces serious hurdles. There is a lack of well-specified outcome measures of the societal impact of publicly funded research. There is also a dearth of well-established causal links tying research to specific societal outcomes. Taking stock of our current knowledge, we find ourselves in possession of no more than an informed intuition that may be enough to mobilize political support for governmental R&D subsidies but is certainly not sufficient to support big bets on R&D projects. What can be derived from our current knowledge is no more than modest prescriptions or rules of thumb. For instance:
- Investments in administrative capacity, particularly those aimed at better enabling agencies to deliver on their mission and meet future challenges, will likely be sustained longer and thus have the greatest social impact.
- It is better to invest in the development of general purpose technologies than in more narrowly targeted technological projects, not only because of the potential for broader impacts but also because they are politically stronger propositions.
- Consequently, investments aimed at specific goals, such as the cure of a disease, are more politically vulnerable and should be designed to deliver their largest impact between electoral cycles.
- Agencies could hedge the political risk of seeking a moonshot by pooling their political capital with other agencies and distributing the task. Ironically, the politics of the R&D budget instead pits agencies against one another.
From the perspective of the health of scientific research itself, R&D leaps should be justified on measures of the instrumental value of science, such as technological achievements. Not only are the practical uses of science easy for taxpayers and legislators to recognize; they are also a legitimate justification of any burst in funding. Any evaluation of funding must speak to the various ways in which science enters into partnership with technology, not only patentable intellectual property and commercial successes, but the full array of means by which knowledge production meets people’s needs.
It seems clear, then, that we cannot easily extrapolate our justifications for the R&D subsidy to moonshots, and consequently we cannot tell whether large bets on R&D really translate into net social benefits. Until the history of major moonshots provides evidence, rather than intuition, of their success, policy entrepreneurs will be in the awkward position of advocating moonshots as leaps of faith.