Public Service Data
A DISCUSSION OF “A Vision for Democratizing Government Data”
In my job, I witness firsthand how the tools of data science and artificial intelligence can transform social approaches to human challenges. And I agree with Julia Lane’s vision, eloquently presented in “A Vision for Democratizing Government Data” (Issues, Fall 2022), that unlocking data access for policymakers will open multitiered opportunity—first, to create enabling environments to access, sift, and analyze information; and second, to ensure that the appropriate consumers are equipped with the policy and experiential frames to responsibly use that data for social import.
As one example of this potential, the Invisible Institute in Chicago applies data at scale to identify racial disparities in nationwide eviction patterns. With a long history of research in urban policy, socioeconomic effects of discriminatory policies, and new access to national macro data sets, the institute advocates for a new policy framework for regulators and city officials to improve racial equity in housing. This organization and others with similar missions continue to generate tangible change by identifying a problem and scouring government data to influence policy.
To build a data marketplace that serves the needs of society, I suggest considering the following:
Build solutions that address short-term problems while unlocking long-term opportunity. A data marketplace is foundational infrastructure, but cannot stand alone. Actively engaging stakeholders to contribute to solutions requires us to first define our hypotheses through a human-centered lens. Developing evidence-based policy solutions can address short-term harms—and build long-term equity.
Civil society can help turn data into insight. Civil society organizations intimately understand societal challenges—and serve as long-term informal data stewards. By creating open access to government data for civil society, we can generate new insights on labor mobilization, humane policing, and several other issues from organizations with deep historical context and community knowledge.
Citizen participation is essential for effective policy development. We need new structures for data agency that empower individuals to both contribute to and benefit from these shared data marketplaces. Through new mechanisms of data governance, fiduciaries, or consumer protection, citizens can access transparent pathways to understand how their data are used.
Unlocking insights at the intersection of government, private-sector, and civil society data remains underexplored territory at all levels. We have the opportunity to create a new dynamic that empowers individuals to safely and confidently share their data, equips civil society organizations to best represent their interests, and accelerates more effective and targeted policy decisions for a better future. Lane’s vision of a data-enabled world is within reach—one that leverages data for effective public policy in tireless pursuit of equity, human welfare, and a thriving future for humanity.
Patrick J. McGovern Foundation
Democracy needs data just as much as data need to be democratized (responsibly), as Julia Lane persuasively argues. Democracy depends on public trust in government, which in turn depends on ensuring that government is making the best use of taxpayer dollars. That accountability requires analysis and data, evaluation, and the communication of the results.
Data and analysis are essential not only because they can be used to communicate what government is doing and the costs and benefits of those actions, but also because data and analysis allow for policy improvements. Whether the public is involved in that process or not, voters and taxpayers expect government to do its job well. Yet refinements cannot occur in a data vacuum; good policy requires evidence that can be obtained only from data and analysis.
If there are to be evaluations of government action and accountability, it is essential that some of that analysis be performed outside the government itself. Certainly, in today’s divisive environment, agencies taking a “trust us, we’re on top of things” approach seem unlikely to satisfy the public. Thus, it is essential that data be made accessible to at least some portion of the stakeholder community in a responsible manner (i.e., with suitable protections of privacy and confidentiality).
In a recent article in Harvard Data Science Review, my coauthors and I argue, like Lane, that the role of the community is often underappreciated as an element in the success of data infrastructure projects. Here I highlight two roles for community. First, as indicated, community evaluation by people outside government is important to build trust and accountability. Thus, there must be a community of practice conducting analysis on whether government policy has achieved its intended goals, vetting findings, and communicating them. In such a community, it is important that the full spectrum of viewpoints be welcomed and fostered.
Community’s second essential role is in the provision and protection of data. Namely, there is tremendous potential from integrating data into what we refer to as “linked data mosaics.” In such architectures, many datasets provided by different entities are integrated. Such a data architecture has exponential potential because the addition of one set of elements makes it possible to combine and analyze data in radically new ways. In the case of innovation in science, technology, engineering, and mathematics, integrating data on education in these fields with data on the workforce unleashes tremendous potential compared with analyzing either on its own. But obtaining buy-in from data holders to integrate their data requires fostering community among data providers as well as users.
For better or worse, we live in contentious times in which thoughtful analysis of data may not persuade everyone. Yet, I hold firmly to the belief that the responsible democratization of data and communication of evidence from their analysis is critical for good policy and public trust. In this way, the democratization of data has the potential to inform the electorate and is ultimately essential for democracy.
Bruce A. Weinberg
Eric Byron Fix-Monda Professor of Economics and Public Affairs, Ohio State University
Research Fellow, IZA Institute for the Study of Labor
Research Associate, National Bureau of Economic Research
When I started at the Office of Management and Budget more than 20 years ago, examining the Department of Energy’s Office of Science and later its Advanced Research Projects Agency-Energy (ARPA-E), I was continually surprised by the near absence of output and outcome data in science policy compared with other policy domains. Debates in environmental or labor policy are routinely informed by analyses arguing about the effectiveness of this regulation or that element of program design. By comparison, the dominant tool available to science policy is allocating resources among research programs through the annual budget process, and the debate centers on marginal changes of the topline up or down.
Traditional examining practice rests upon prospectively assessing the clarity of a research program’s strategy and the strength of its ability to set clear, achievable, defensible priorities balanced against a retrospective analysis of output performance metrics for innovation, human capital production, and the dissemination of knowledge, applications, or tools. Given the paucity of robust data on the social, economic, and mission impacts of program-level research investments, much of the input driving spending priorities is derived from expert opinion—mainly reports from advisory committees, National Academy panels, or agency-convened community workshops. The processes producing these intermittent assessments of scientific or technical opportunity are slow and laborious. More is needed.
Quantitative analysis of individual research portfolio outputs and outcomes that can inform and ground that expert opinion is now becoming possible with well-designed data infrastructure, such as the UMETRICS initiative of the Institute for Research on Innovation & Science (IRIS), which Julia Lane cites in her Issues essay. The Office of Management and Budget and the White House Office of Science and Technology Policy also regularly construct cross-cutting interagency portfolios built from mission-driven research at the Department of Energy, the Department of Defense, NASA, the National Institutes of Health, and other agencies, complemented by discovery-driven research at the National Science Foundation to address national science and technology priorities.
While IRIS has made notable progress, more work needs to be done to develop analytical products for these national policy-level uses. For example, automated tracking of US representation among plenary speakers, session chairs, and award recipients at large, prestigious international scientific or technical meetings would enable timely benchmarking of fields and subfields. Such benchmarks would provide valuable information to complement the program-level production function data accessible through IRIS.
The United States faces an increasingly competitive international research and development landscape. China is approaching parity with the US level of R&D investment, and the European Union continues work to coordinate research investments among member states for greater impact. The United States needs to move more quickly and with a keener insight as to how to build effective research portfolios—and investing in a sustainable evidence infrastructure for federal decisionmaking cannot wait another 20 years.
Vice Chancellor for Science Policy and Research Strategies
University of Pittsburgh
Member, Board of Directors
Institute for Research on Innovation & Science
White House science and technology managers say that by 2025, federal agencies must make research funded by taxpayers publicly accessible. The push toward open data, aimed at greater transparency and access, started a decade earlier. The new requirement has the potential to channel a surge of data toward evidence-based policymaking and to transform research into beneficial collective efforts and policies, as Julia Lane points out.
On August 25, 2022, the White House Office of Science and Technology Policy (OSTP) issued new guidance requiring agencies to update their public access policies to make publications and data funded by taxpayers publicly accessible, without embargos or cost, by the end of 2025.
As part of the new mandate, agencies will develop plans to improve transparency of the authorship, funding, affiliations, and development status of federally funded research. A memo from OSTP’s Alondra Nelson describing the guidance stresses a requirement that guardrails for scientific and research integrity be in place to strengthen trust in government data. “A federal public access policy consistent with our values of equal opportunity must allow for broad and expeditious sharing of federally funded research—and must allow all Americans to benefit from the returns on our research and development investments without delay,” Nelson said.
The Institute for the Advancement of Food and Nutrition Sciences (IAFNS) funds nutrition and food safety research, and we believe this requirement for sharing research results should be expanded to include open science. Indeed, IAFNS aligns with the federal push toward open data in that we fully disclose industry funding of research and share research methods and data with the scientific community. Part of Lane’s focus on “communities of practice” applies here as we advance actionable science. We also have updated and published principles for funding food and nutrition research in which transparency is central. In fact, scientific integrity is centered on transparency, as advocates of open science often argue. By sharing details and data, we strengthen the guardrails that separate the funding from the science and reflect the shift within the scientific community toward increased transparency and open science. We also make sure each published scientific paper we support is free to read on a journal’s website—whether it’s based on federal data or not.
IAFNS’s model brings researchers from the public and private sectors together to work on science and public health issues, including building databases—the key to the insights offered by big data. Public-private collaboration, where all interests are declared and all funding is acknowledged, can advance science for public benefit.
The bounty of information that will become available with the new OSTP mandate will advance food safety and nutrition sciences—and other scientific endeavors—by saturating the sector with greater access to data and findings. This is particularly important as many organizations treat data in a proprietary way, at times limiting access.
But amid the hype about Big Data, let’s not forget that it relies on a lot of “small” individual researchers and teams painstakingly collecting, recording, and curating data and making the data available to a community of practice. For example, IAFNS partners with the US Department of Agriculture in the maintenance and expansion of FoodData Central databases that serve a variety of analytical purposes and decision supports.
We look forward to other research organizations sharing data and details about use of their data for research or decisions in their communities of practice. In this way, information can be made valuable and support timely, evidence-based decisionmaking in the operating environments of public and private-sector managers alike.
Institute for the Advancement of Food and Nutrition Sciences