Qualitative Metrics in Science Policy: What Can’t Be Counted, Counts

The past half-century has ushered in a veritable revolution in the science of metrics, as the surprisingly long life of Moore’s Law and related advances in information technology have led to a vast reservoir of quantitative information ripe for study with powerful analytical tools. For instance, captured in the research literature are methods to measure the quality of one’s health, to quantify athletic performance, and to determine something as innately intangible as consumer confidence. Even in the domain of science and technology, economists have made efforts to assess the return on investment (ROI) of science and engineering research funded by government and industry, finding that university-level research is one of the best long-term investments that can be made. Yet, in the United States, academic research—and research universities more broadly—have long remained exempt from any real adherence to performance- and ROI-based metrics despite this nation’s quantitative revolution.

The reasons for this are as much historical as institutional, going beyond just the difficulty of measuring research ROI. Under the auspices of Vannevar Bush, the chief scientist-policymaker in the Roosevelt and Truman administrations, science and engineering R&D, having demonstrated their value in World War II, gained an elevated stature in the public sphere as critical and unimpeachable assets to the U.S. superpower state. A social compact of sorts was struck between science and society—the federal government would support scientific research, primarily in universities, and the benefits would flow to students (via education) and the general public (via technological innovation). Accordingly, few questioned the value that research universities offered the nation, and there seemed little need for systematic evaluation, let alone metrics.

A changing compact

Now, more than six decades after World War II and more than 20 years after the Cold War, the social compact between research universities and the federal government is being questioned. Research universities are being asked to answer two difficult questions: Is the investment in university research paying off, and is current university research well structured to meet the challenges of the future? Today, the answers to these questions are no longer being taken for granted in the affirmative.

The shift from agnosticism to skepticism of scientific research is perhaps exemplified most clearly in the appropriations data. Research funding as a portion of the federal budget and as a percentage of gross domestic product (GDP) has fallen by more than 40% since 1970, with the funding for the physical sciences and engineering cut by 50% during the same period. In recent years, even biomedical research funding, which has been popular with the public and politicians, has lost ground to inflation. Partly because of constraints on funding, federal agencies have increased pressure on the various research communities to defend funding levels and set priorities, particularly in fields such as particle and nuclear physics, astronomy, and atmospheric and ocean sciences that require expensive experimental facilities. And although rigorous application of quantitative evaluation metrics has not yet become a routine part of budget planning for federal R&D programs and their advisors in the research communities, change is on the way. Already, the Government Performance and Results Act (GPRA) of 1993 requires that all funding agencies develop strategic plans, set performance goals, define metrics, assess progress, and explain any failure to meet goals. GPRA does not require metrics for every research project that an agency funds, but it clearly has altered the landscape. For example, the National Science Foundation (NSF) revised its review criteria in 1997 to better reflect its GPRA strategic plan by including a second “broader impacts” criterion.

To be sure, research isn’t the only aspect of the modern university’s portfolio that is being questioned; academic institutions have also lately come under fire for emphasizing the quality of research over education. For example, the President’s Council of Advisors on Science and Technology (PCAST) recently issued the report Engage to Excel: Producing One Million Additional College Graduates with Degrees in Science, Technology, Engineering and Mathematics (STEM), which includes specific recommendations on how teaching quality at research universities can be bolstered. The report focuses on improving STEM teaching in the first two years of university study, with one objective being the retention of larger numbers of intended STEM majors. Historically, 40% of STEM majors change to non-STEM disciplines before graduation. It’s difficult, however, to assess the efficacy of the report’s recommendations without a standardized means of evaluation. Without the development of some mechanism—through collaboration between the universities and federal agencies, not by imposing new regulations—to evaluate the effectiveness of innovative approaches to undergraduate education, those approaches are unlikely to be implemented properly.

Of course, there is an even clearer impetus for change: The current U.S. debt crisis, with federal deficits on the order of $1 trillion, promises to severely squeeze discretionary spending, especially non-defense budgets. Historically, federal research funding to universities has tended to rise and fall with overall domestic discretionary spending. President Obama and many members of Congress in both parties appreciate the special importance of investments in science and engineering research. However, they need ammunition in the form of compelling analysis that demonstrates the contribution that academic research makes to the well-being of Americans. Indeed, in this climate of tight purse strings, budgetary pressures, and anemic domestic growth, science policy is no longer exempt from data-driven accountability, and there is growing interest in identifying appropriate tools to measure and document the outcomes of investments in research universities; in short, a “science of science policy.”

A science of science policy

The perceived need for a science of science policy has not gone unaddressed. As we write, a multiagency, collaborative effort, spearheaded by NSF, is attempting to meticulously catalogue the socioeconomic effects of science R&D investments at the university level through a process called “STAR METRICS.” It is a step in the right direction and may herald the advent of powerful new tools to guide science policymaking on the federal level while demonstrating that U.S. research universities are upholding their end of the social compact. The American people are generally in favor of federal funding for academic research, but they still want to know what their tax dollars are buying, and carefully chosen metrics are a way to do that.

However, this progress in recognizing and measuring the societal impacts of federally funded research universities, although substantive, requires a caveat. Finding metrics that accurately assess the true value of research universities—the qualitative contributions and long-range potential as well as more easily measured impacts—is enormously difficult.

Industry leaders and many policymakers are understandably focused on the role of research in the nation’s economic competitiveness and are thus tempted to emphasize short-term economic gains as a measure of research impact. But doing so frames this values debate in a manner that ignores the rich history of U.S. innovation and threatens to damage the nation’s leadership in science and technology and its vital role in the innovation ecosystem. Although the scientific community may not lose such a battle outright—who wants to argue against nurturing the next Google, whose pioneering algorithmic studies at Stanford University were originally funded by the NSF?—the debate becomes, at the very least, an uphill battle. By intrinsically limiting the scope of what benefits scientific research provides for the nation, we would fundamentally narrow the utility and significance of such debate and impose a handicap on research and universities where there should be none.

Broader impacts

Ultimately, the true societal value of the nation’s intellectual capital coming from scientific research and research universities cannot be monetized, packaged, and fit neatly into dollars and cents. Can biodiversity, clean oceans, and clearer skies really be expressed in terms of jobs or income? And what of revolutionary discoveries and advances in medicine and security? Shouldn’t we know more about the fundamental makeup of the universe and our place in it? Furthermore, even for discoveries and inventions that do have the promise of future commercial applications, it often takes decades to realize the results.

With respect to STEM education, there is a compelling case to be made for the broader advantages that research universities confer on the educational process. After all, in a world in which information of all kinds is readily and inexpensively accessible, process knowledge increasingly trumps content knowledge; facts, lectures, and textbook-based learning must necessarily be subordinate to hands-on, design-based experiences. As such, it is intuitive that the opportunities that universities provide to undergraduates for independent research—such as those funded by the NSF’s Research Experiences for Undergraduates program—are a critical aspect of the educational process of becoming a truly 21st century-ready scientist or engineer. It is difficult, however, to imagine this benefit accruing in the form of higher test scores or starting salaries.

And that’s only the lip of the test tube. The social compact inspired by Vannevar Bush’s vision and fueled by the investment of federal funds in higher education has delivered on its promise by creating a government-supported, university-driven research system that has graduated generations of scientists, engineers, and other professionals who became the nation’s innovators and business leaders. The impact does not end at the U.S. border. Widely acknowledged as the best in the world, U.S. universities have educated countless non-native intellectuals, foreign officials, and even heads of state and their children. Given the sheer ubiquity of U.S. training among the international intelligentsia, one can only imagine the degree to which U.S. perspectives dictate world discourse, simply by virtue of cultural and intellectual diffusion. This “soft power” can lead to progress where standard diplomatic channels are blocked. During the darkest days of the Cold War, U.S. physicists and their Russian colleagues continued to collaborate in an effort to peel back the underlying mechanisms of the natural universe. This is what science diplomacy is all about. However, although the benefits seem clear, the impact is difficult to quantify.

Efforts of other nations

The challenge of evaluating the perhaps unquantifiable impacts of research universities need not prove intractable. Indeed, the United States is not alone in this endeavor, and this nation can learn from the experiences of other nations as U.S. researchers, working with the federal funding agencies, seek to develop research performance metrics most appropriate for this country.

For example, in 2008, Australia, one of the world’s leaders in the percentage of GDP it devotes to scientific research, replaced its quantitative metrics paradigm with a more qualitative “Research Quality Framework,” which includes panel assessments of “impact in the form of the social, economic, environmental and cultural returns of research beyond the academic peer community.” This national policy is based on the latest research in context-dependent metrics and places particular emphasis on the social implications and effects of university-led scientific research. It is noteworthy that the Australian approach does not jettison quantitative metrics entirely; rather, it attempts to deftly merge the subjective with the objective, with evaluations grounded firmly on the basis of expert opinion.

Australia is not alone. New Zealand and the Netherlands have developed and incorporated impact assessments of scientific research that extend beyond markets and academia. The United Kingdom has also taken steps to include broader impact evaluations and qualitative metrics for research universities alongside traditional quantitative measures with its Research Assessment Exercise and Research Excellence Framework, although not without controversy. Sweden, France, and Singapore have begun devising hybrid science policy measurement schemes of their own as well.

These examples make it clear that there is a large pool of best practices (or at least experiences) from around the world to draw on, improve on, and adapt to U.S. needs. Combining these ideas with current pilot initiatives such as STAR METRICS and the public value-mapping and sociotechnical integration research being carried out at the Consortium for Science, Policy, and Outcomes would allow for a practical and seamless transition to this framework, while leveraging federal resources to permit scalability.

A way forward

Many U.S. academic researchers are questioning whether current efforts to define research evaluation metrics are likely to be fruitful and are leery of getting involved. Academic researchers understandably worry that the likely result of efforts to define research metrics and evaluation mechanisms will be an increased emphasis on “directed research” and a consequent loss of freedom to explore fundamental aspects of nature. This is a legitimate concern because the trends in recent decades have been worrisome. But the issue of research assessment is not likely to go away. U.S. researchers, like their colleagues in other parts of the world, need to work with the funding agencies to do two things: ensure that fundamental basic research remains high on the list of priorities and help develop the most appropriate metrics and mechanisms to evaluate research effects, both those that are quantifiable and, arguably more important, those that are not. The need to act is becoming more urgent in the face of a worldwide budgetary crisis that is likely to be around for some time.

One area that deserves particular attention is the impact of research on the quality of university education, undergraduate as well as graduate. The need for such measurements is motivated by the aforementioned White House report that may come to represent a tipping point for STEM education at research universities. This may be an ideal opportunity for universities and agencies to work together by launching experiments with different approaches to educational evaluation in parallel with research-impact assessments.

Whatever evaluative processes the research communities and federal agencies select, market-based metrics must not be the primary consideration. The American people will be best served by metrics that capture broad social contributions of research universities in a holistic, contextual manner. Moreover, U.S. leadership in science, engineering, and technology will be best served by metrics that are clear, sensible, and attentive to the long-term value that can result from breakthrough research. Although research metrics will vary across federal agencies, according to their respective roles and missions, there are some fundamentals that define the quality of research and that should be reflected in standard research metrics that apply across government; indeed, that are universal.

Unless the U.S. research community engages in the process of determining appropriate quantitative and qualitative metrics as well as assessment mechanisms that are based on expert opinion, rather than columns of numbers, the nation could end up saddled with a system that suppresses innovation and drives the best minds out of science and engineering or out of the country. The American people would be the ultimate losers.

To wit, it was Einstein himself who said “Everything that can be counted does not necessarily count; everything that counts cannot necessarily be counted.”

Rahul Rekhi is a senior in the Departments of Bioengineering and Economics at Rice University, as well as at the James A. Baker III Institute for Public Policy. Neal Lane is the Malcolm Gillis University Professor and Senior Fellow in Science and Technology Policy at the James A. Baker III Institute for Public Policy at Rice University.

Cite this Article

Lane, Neal, and Rahul Rekhi. “Qualitative Metrics in Science Policy: What Can’t Be Counted, Counts.” Issues in Science and Technology 29, no. 1 (Fall 2012).

Vol. XXIX, No. 1, Fall 2012