Monique Verdin, "Headwaters : Tamaracks + Time : Lake Itasca" (2019), digital assemblage. Photograph taken in 2019; United States War Department map of the route passed over by an expedition into the Indian country in 1832 to the source of the Mississippi River.

Embracing Intelligible Failure

In “How I Learned to Stop Worrying and Love Intelligible Failure” (Issues, Fall 2023), Adam Russell asks the important and provocative questions: With the growth of “ARPA-everything,” what makes the model succeed, and when and why doesn’t it? What is the secret of success for a new ARPA? Is it the mission? Is it the money? Is it the people? Is it the sponsorship? Or is it just dumb luck and then a virtuous cycle of building on early success?

I have had the privilege of a six-year term at the Defense Advanced Research Projects Agency (DARPA), the forerunner of these new efforts, along with a couple of years helping to launch the Department of Homeland Security’s HSARPA and then 15 years at the Bill & Melinda Gates Foundation running, and partnering with, innovation programs focused on international development. In the ARPA world, I have joined ongoing success, contributed to failure, and then helped launch successful new ARPA-like organizations in the international development domain.

During my time at the Gates Foundation, we frequently asked and explored with partners the question, What does it take for an organization to be truly good at identifying and nurturing new innovation? To answer, it is necessary to separate the process of finding, funding, and managing new innovations through proof-of-concept from the equally challenging task of taking a partially proven new concept or product through development and implementation to achieve impact at scale. I tend to believe that Russell’s “aliens” (described in his Prediction 6 about “Alienabling”) are required for the early innovation management tasks, but I also believe they are seldom well suited to the tasks of development and scaling. Experts are good at avoiding mistakes; it is a different challenge to take a risk that is likely to fail in your own field of expertise, where you “should have known better” and where failure might be seen as a direct reflection of your skills.

What does it take for an organization to be truly good at identifying and nurturing new innovation?

To add my own predictions to the author’s, here are some other things it takes for an organization to be good at innovation. Some are obvious, such as sufficient human capital and financial resources, along with operational flexibility. Others are more nuanced, including:

  • An appetite for risk and a tolerance for failure.
  • Patience: a willingness to bet on long timelines (and possibly the ability to celebrate success that was not intended and that you do not directly benefit from).
  • Involvement in a network that provides a deeper understanding of which problems need solving and are worth solving, along with an understanding of the landscape of potential solutions.
  • Recognition as a trusted brand that attracts new talent, is valued as a partner in creating unusual new collaborations, and is known for careful handling of confidential information.
  • Engaged and effective problem-solving in managing projects, and especially nimble oversight in managing the managers at an ARPA (whether congressional and administrative oversight in government or donor and board oversight in philanthropy).
  • Parent-organization engagement downstream in “making markets,” or adding a “prize element” for success (and to accelerate impact).

To a large degree, these organizational attributes align well with many of Russell’s predictions. But I will make one more prediction that is perhaps less welcome. A bit like the happy and unhappy families that open Anna Karenina, there are many ways for a new ARPA to fail, but “happy ARPAs” likely share—and need—all of the attributes listed above.

Principal, Bermuda Associates

Adam Russell is correct: studying the operations of groups built on the Advanced Research Projects Agency model, applying the lessons learned, and enshrining intelligible failure paradigms could absolutely improve outcomes and ensure that ARPAs stay on track. But not all of the author’s predictions require study to know that they need to be addressed directly. For example, efforts by entrenched external interests to steer ARPA agencies can corrode culture and, ultimately, impact. We encountered this when my colleague Geoff Ling and I proposed the creation of the health-focused ARPA-H. Disease advocacy groups and many universities refused to support creation of the agency unless language was inserted to steer it toward their interests. Indeed, rather than keeping its hands off, the Biden administration has been actively pushing ARPA-H to invest heavily in cancer projects. Congress is likely to fall into the same trap.

But there is a larger point as well: if you take a fifty-thousand-foot view of the research enterprise, you can easily see that the principle Russell is espousing—that we should study how ARPAs operate—should also be more aggressively applied to all agencies funding research and development.

Efforts by entrenched external interests to steer ARPA agencies can corrode culture and, ultimately, impact.

There is another element of risk that was out of scope for Russell’s article, and that rarely gets discussed: commercialization. DARPA, developed to serve the Department of Defense, and IARPA, developed to serve the government’s intelligence agencies, have built-in federal customers—yet they still encounter commercialization challenges. Newer ARPAs such as ARPA-H and the energy-focused ARPA-E are in a more difficult position because they do not necessarily have a means to ensure that the technologies they support can make it to market. Again, this is true for all R&D agencies and is the elephant in the room for most technology developers and funders.

While there have been more recent efforts to boost translation and commercialization of technologies developed with federal funding—through, for example, the National Science Foundation’s Directorate for Technology, Innovation, and Partnerships—there is a real need to measure and de-risk commercialization across the R&D enterprise in a more concerted and outcomes-focused manner. Frankly, one of the wisest investments the government could make with its R&D dollars would be dedicating some of them to commercialization support for small- and mid-cap companies developing products that would benefit society but are still too risky to attract private capital.

The government is well positioned to shoulder risk through the entire innovation cycle, from R&D through commercialization. Otherwise, nascent technological advances are liable to die before making it across the infamous “valley of death.” Federal support would ensure that the innovation enterprise is not subject to the state of the economy or the whims of private capital. The challenge is that R&D agencies are not staffed with people who understand business risk, and thus initiatives such as the Small Business Innovation Research program are often managed by people with no private-sector experience and are so cumbersome and limiting that many companies simply do not bother applying for funding. There are myriad reasons why this is the case, but it is well worth establishing an entity designed to understand and take calculated commercialization risk … intelligibly.

President

Science Advisors

As Adam Russell insightfully suggests, the success of the Advanced Research Projects Agency model hinges not only on technical prowess but also on a less tangible element: the ability to fail. No technological challenge worth taking on is guaranteed to succeed. As Russell points out, having too high a success rate should indicate that the particular agency is not orienting itself toward ambitious “ARPA-hard problems.”

But failing is inherently fraught when spending taxpayer dollars. Politicians have been quick to publicly kneecap science funding agencies for high-profile failures. It is notable that two of the most successful agencies in this mold have come from the national security community: the original Defense Advanced Research Projects Agency (DARPA) and the Intelligence Advanced Research Projects Activity (IARPA). The Pentagon is famously tight-lipped about its failures, which provides some shelter from the political winds for an ambitious, risk-taking research and development enterprise. Fewer critics will readily pounce on a “shrimp on a treadmill” story when four-star generals say it is an important area of research for national security.

Having too high a success rate should indicate that the particular agency is not orienting itself toward ambitious “ARPA-hard problems.”

There are reasons to be concerned about the political sustainability of frequent failure in ARPAs, especially as they move from a vehicle for defense-adjacent research into “normal” R&D areas such as health care, energy, agriculture, and infrastructure. Traditional federal funders already live in fear of selecting the next “Solyndra.” And although Tesla was a success story from the same federal loan portfolio, the US political system has a way of making the failures loom larger than the successes. I’ve personally heard federal funders cite the political maelstrom that followed the failure of the Solyndra solar panel company as a reason to be more conservative in their grantmaking and program selection. And it is difficult to put the breakthroughs we neglected to fund on posterboard—missed opportunities don’t motivate political crusades.

As a society and a political system, we need to develop a better set of antibodies to the opportunism that leaps on each failure and thereby smothers success. We need the political will to fail. Finding stories of success will help, yes, but at a deeper level we need to valorize stories of intelligible failure. One idea might be to launch a prestigious award for program managers who took a high-upside bet that nonetheless failed, and give them a public platform to discuss why the opportunity was worth taking a shot on and what they learned from the process.

None of this is to say that federal science and technology funders should be immune from critique. But that criticism should be grounded in precisely the kind of empiricism and desire for iterative improvement that Russell’s article embodies. In the effort to avoid critique, we sometimes risk turning the ARPA model into a cargo cult, copied and pasted wholesale without thoughtful consideration of the appropriateness of each piece. It was a refreshing change of pace, then, to see that Russell, when starting up the health-oriented ARPA-H, added several new questions, centered on technological diffusion and misuse, to the famous Heilmeier Catechism that a proposed ARPA project must satisfy to be funded. Giving the ARPA model the room to change, grow, and fail is perhaps the most important lesson of all.

Cofounder and co-CEO

Institute for Progress

A key obsession for many scientists and policymakers is how to fund more “high-risk” research—the kind for which the Defense Advanced Research Projects Agency (DARPA) is justifiably famous. There are no fewer than four lines of high-risk research awards at the National Institutes of Health, for example, and many agencies have launched their own version of an ARPA for [fill-in-the-blank].

Despite all of this interest in high-risk research, it is puzzling that “there is no consensus on what constitutes risk in science nor how it should be measured,” to quote Pierre Azoulay, an MIT professor who studies innovation and entrepreneurship. Similarly, the economics scholars Chiara Franzoni and Paula Stephan have reported in a paper for the National Bureau of Economic Research that the discussion about high-risk research “often occurs in the absence of well-defined and developed concepts of what risk and uncertainty mean in science.” As a result, metascientists who study this issue often use proxies that are not necessarily measures of risk at all (e.g., rates of “disruption” in citation patterns).

I suggest looking to terminology that investors use to disaggregate various forms of risk:

Execution risk is the risk that a given team won’t be able to complete a project due to incompetence, lack of skill, infighting, or any number of other dysfunctions. ARPA or not, no science funding agency should try to fund research with high execution risk.

Despite all of this interest in high-risk research, it is puzzling that “there is no consensus on what constitutes risk in science nor how it should be measured,” to quote Pierre Azoulay.

Market risk is the risk that even if a project works, the rest of the market (or in this case, other scientists) won’t think it is worthwhile or useful. Notably, market risk isn’t a static attribute of a given line of research. The curious genome sequences found in a tiny marine organism, reported in a 1993 paper and later named CRISPR, carried a lot of market risk at the time (hardly anyone cared about the result when it was first published), but the market risk of this type of research changed wildly as CRISPR’s potential as a precise gene-editing tool became known. In other words, the reward to CRISPR research went up and the market risk went down (the opposite of what one would expect if risk and reward were positively correlated).

Technical risk is the risk that a project is not technically possible at the time. For example, in 1940, a proposal to decipher the structure of DNA would have had a high degree of technical risk. What makes the ARPA model distinct, I would argue, is selecting research programs that could be highly rewarding (and therefore have little market risk) and are at the frontier of a difficult problem (and therefore have substantial technical risk, but not so much as to be impossible).

Adam Russell’s thoughtful and inventive article points us in the right direction by arguing that, above all, we need to make research failures more intelligible. (I expect to see this and some of his other terms on future SAT questions!) After all, one of the key problems with any attempt to fund high-risk research is that when a research project “fails” (as many do), we often don’t know, or even have the vocabulary to discuss, whether the failure was due to poor execution, technical challenges, or some other source of risk. Nor, as Russell points out, do we ask peer reviewers and program managers to estimate the probability of failure, although we could easily do so (including estimates disaggregated by the various types of risk). As Russell says, ARPAs (or any funding agency, for that matter) can improve only if they put more effort into actually enabling the right kind of risk-taking while learning from intelligible failures. More metascience could point the way forward here.

Executive Director, Good Science Project

Former Vice President of Research, Arnold Ventures

Adam Russell discusses the challenge of setting up the nascent Advanced Research Projects Agency for Health (ARPA-H), meant to transform health innovation. Being charged with building an organization that builds the future would make anyone gulp. Undeterred, Russell drank from a firehose of opinion on what makes an ARPA tick, and distilled from it the concept of intelligible failure.

As Russell points out, ARPA programs fail—a lot. In fact, failure is expected, and demonstrates that the agency is being sufficiently ambitious in its goals. ARPA-H leadership has explicitly stated that it intends to pursue projects “that cannot otherwise be pursued within the health funding ecosystem due to the nature of the technical risk”—in other words, projects with revolutionary or unconventional approaches that other agencies may avoid as too likely to fail. Failure is not usually a winning strategy. But paired with this willingness to fail, Russell says, is the mindset that “a technical failure is different from a mistake.”

With the right feedback loops, technical failures can ultimately turn into insight about which approaches truly work. We absolutely agree that intelligible technical failure is crucial to any ARPA’s success, and find Russell’s description of it brilliantly apt. However, we believe Russell could have added one more note about failure. There are other types of failure, aside from technical failure, that ARPAs face as they pursue cutting-edge technology: failures stemming from unanticipated accidents, misuse, or misperception. These, too, demand attention.

With the right feedback loops, technical failures can ultimately turn into insight about which approaches truly work.

The history of DARPA technologies demonstrates the “dual use” nature of transformative innovation, which can unlock useful new applications as well as unintended harmful consequences. DARPA introduced Agent Orange as a defoliation compound during the Vietnam War, despite warnings of its health harms. These are the types of failures we believe any modern ARPA would wish to avoid. Harmful accidents and misuses are best proactively anticipated and averted, rather than learned from only after disaster has occurred.

In fact, we believe the most ambitious technologies often prove to be the safest: we should aim to create the health equivalent of the safe and comfortable passenger jet, not simply a spartan aircraft prone to failure. To do this, ARPAs should pursue both intelligible technical failure and catastrophobia: an anticipation of, and commitment to avoiding, accidental and misuse failures of their technologies.

With regard to ARPA-H specifically, the agency has signaled its awareness of the misuse and misperception risks of its technologies, and has solicited outside input on structures, strategies, and approaches for mitigating those risks. We hope consideration of accidental risks will also be included. With health technologies in particular, useful applications can be a mere step away from harmful outcomes. Technicians developing x-ray technology initially used their bare hands to calibrate the machines, resulting in cancers that required amputation. Now a modern hospital is incomplete without radiographic imaging tools. ARPA-H should lead the world in both transformative innovation and pioneering safety.

Resident Physician, Stanford University School of Medicine

Executive Director, Blueprint Biosecurity

Cite this Article

“Embracing Intelligible Failure.” Issues in Science and Technology 40, no. 2 (Winter 2024).
