Question Your Algorithms

In “What is ‘Fair’? Algorithms in Criminal Justice” (Issues, Spring 2018), Stephanie Wykstra does a masterful job summarizing challenges faced by algorithmic risk assessments used to inform criminal justice decisions. These are routine decisions made by police officers, magistrates, judges, parole boards, probation officers, and others. There is surely room for improvement. The computer algorithms increasingly being deployed can foster better decisions by improving the accuracy, fairness, and transparency of risk forecasts.

But there are inevitable trade-offs that can be especially vexing for the very groups with legitimate fairness concerns. For example, the leading cause of death among young, male African Americans is homicide. The most likely perpetrators of those homicides are other young, male African Americans. How do we construct algorithms that balance fairness with safety, especially for those most likely to be affected by both?

An initial step is to unpack different sources of unfairness. First, there are the data used to train and evaluate an algorithm. In practice, such data include an individual’s past contacts with the criminal justice system. If those contacts have been significantly shaped by illegitimate factors such as race, the potential for unfairness becomes a feature of the training data. The issues can be subtle. For example, a dominant driver of police activity is citizen calls for service (911 calls). There are typically many more calls for service from high-crime neighborhoods. With more police being called to certain neighborhoods, there will likely be more police-citizen encounters that, in turn, can lead to more arrests. People in those neighborhoods can have longer criminal records as a result. Are those longer criminal records a source of “bias” when higher-density policing is substantially a response to citizen requests for service?

Second is the algorithm itself. I have never known of an algorithm that had unfairness built into its code. The challenge is potential unfairness introduced by the data on which an algorithm trains. Many researchers are working on ways to reduce such unfairness, but given the data, daunting trade-offs follow.

Third are the decisions informed by the algorithmic output. Such output can be misunderstood, misused, or discarded. Unfairness can result not from the algorithm, but from how its output is employed.

Finally, there are actions taken once decisions are made. These can alter fundamentally how a risk assessment is received. For example, a risk assessment used to divert offenders to drug treatment programs rather than prison, especially if prison populations are significantly reduced, is likely to generate fewer concerns than the same algorithm used to determine the length of a prison sentence. Optics matter.

Blaming algorithms for risk assessment unfairness can be misguided. I would start upstream with the training data. There needs to be much better understanding of what is being measured, how the measurement is actually undertaken, and how the data are prepared for analysis. Bias and error can easily be introduced at each step. With better understanding, it should be possible to improve data quality in ways that make algorithmic risk assessments more accurate, fair, and transparent.

Richard Berk

Professor of Criminology and Statistics
University of Pennsylvania

As decision-makers increasingly turn to algorithms to help mete out everything from Facebook ads to public resources, those of us judged and sorted by these tools have little to no recourse to argue against an algorithm’s decree against us, or about us. We might not care that much if an algorithm sells us particular products online (though then again, we might, especially if we are Latanya Sweeney, who found that black-sounding names were up to 25% more likely to provoke arrest-related ad results on a Google search). But when algorithms deny us Medicaid-funded home care or threaten to take away our kids through child protective services, the decisions informed by these tools are dire indeed—and could mean life and death.

Here in Philadelphia, a large group of criminal justice reformers is working to drive down the population of our jails, all clustered on State Road in Philly’s far Northeast. Thousands of people are held on bail they can’t afford or on violations of probation or parole from previous convictions. Reformers have already decreased the jail population by more than 30% in two years, and the city’s new district attorney has told his 600 assistant DAs to push for release on their own recognizance of people accused of 26 different low-level charges. But in an effort to go further, decision-makers are working to build a risk-assessment algorithm that will sort accused people into categories: those deemed to be at risk of not showing up to court, those at risk of arrest if released pretrial, and those at risk of arrest with a violent charge.

In her compelling and helpful overview of the most urgent debates in risk assessment, Stephanie Wykstra lifts up the importance of separating the scientific questions associated with risk assessment from the moral ones. Who truly decides what “risk” means in our communities? Wykstra includes rich commentary from leaders in diverse fields, including data scientists, philosophers, scholars, economists, and national advocates. But often, the risk of not appearing in court is conflated with risk of arrest. We need to understand that communities fighting to unwind centuries of racism, in practice, define the level of risk they might tolerate very differently than do others active in criminal justice reform.

I would posit that no government should put a risk-assessment tool between a community and its freedom without giving that community extraordinary transparency and power over that tool and the decisions it makes and informs. Robust and longstanding coalitions of community members have prevented Philly legislators from buying land for a new jail, have pushed the city to commit to closing its oldest and most notorious jail, and have since turned their sights on ending the use of money bail. That same community deserves actual power over risk assessment: to ensure that any pretrial tool is independently reviewed and audited, that its data are transparent, that community members help to decide what is risky, that tools are audited for how they are used by actual criminal justice decision-makers, and that the tools are calibrated to antiracist aims in this age of mass incarceration in the United States.

Our communities have understandable fear of risk assessment as an “objective” or “evidence-based intervention” into criminal justice decision-making. Risk-assessment tools have mixed reviews in practice—and have not always been focused on reducing jail populations. Because black and brown Philadelphians are so brutally overpoliced, any algorithm that weights convictions, charges, and other crime data heavily will reproduce the ugly bias in the city and the society. But we also see these tools spreading like wildfire. Our position is that we must end money bail and massively reduce pretrial incarceration, without the use of risk-assessment algorithms. But if Philly builds one as a part of its move to decrease the prison population in the nation’s poorest big city, we have the right to know what these tools say about us—and to ensure that we have power over them for the long haul.

Hannah Sassaman

Policy Director
Media Mobilizing Project

Scholars, activists, and advocates have put an enormous amount of intellectual energy into discussing criminal justice risk-assessment algorithms. Stephanie Wykstra sifts through this literature to provide a lucid summary of some of its most important highlights. She shows that achieving algorithmic fairness is not easy. For one thing, racial and economic disparities are baked into both the inputs and the outputs of the algorithms. Also, fairness is a slippery term: satisfying one definition of algorithmic fairness inherently means violating another definition.
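
One way to see why fairness definitions can collide is a small numerical sketch. The counts below are invented for two hypothetical groups; they are not drawn from Wykstra's article or from any real risk-assessment tool, and the class and property names are illustrative only. Both groups are given the same positive predictive value and the same false negative rate, yet because their underlying base rates differ, their false positive rates cannot also match.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for one group (hypothetical, invented data)."""
    tp: int  # flagged high risk, later rearrested
    fp: int  # flagged high risk, not rearrested
    fn: int  # flagged low risk, later rearrested
    tn: int  # flagged low risk, not rearrested

    @property
    def base_rate(self) -> float:
        return (self.tp + self.fn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def ppv(self) -> float:
        # Positive predictive value: share of "high risk" flags that were correct.
        return self.tp / (self.tp + self.fp)

    @property
    def fnr(self) -> float:
        # False negative rate: share of rearrested people flagged low risk.
        return self.fn / (self.tp + self.fn)

    @property
    def fpr(self) -> float:
        # False positive rate: share of non-rearrested people flagged high risk.
        return self.fp / (self.fp + self.tn)


# Assumed counts: 1,000 people per group, base rates of 50% and 20%.
group_a = Confusion(tp=300, fp=200, fn=200, tn=300)
group_b = Confusion(tp=120, fp=80, fn=80, tn=720)

for name, g in (("A", group_a), ("B", group_b)):
    print(f"group {name}: base rate={g.base_rate:.2f} "
          f"PPV={g.ppv:.2f} FNR={g.fnr:.2f} FPR={g.fpr:.2f}")
```

Under these assumed counts, a "high risk" flag is equally reliable in both groups (60% correct) and both groups have the same share of missed rearrests (40%), but people who are never rearrested are wrongly flagged 40% of the time in group A and 10% of the time in group B. Equalizing the false positive rates would require giving up one of the other two equalities.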

The rich garden of observer commentary on criminal justice algorithms stands in contrast to the indifference or even disregard of those who are responsible for using them. As Michael Jordan put it in his recent essay on artificial intelligence, “the revolution hasn’t happened yet.” Angele Christin’s research shows significant discrepancies between what managers proclaim about algorithms and how they are actually used. She found that those using risk-assessment tools engage in “foot-dragging, gaming, and open critique” to minimize the impact of algorithms on their daily work. Brandon Garrett and John Monahan show broad-reaching skepticism toward risk assessment among Virginia judges. A significant minority even report being unfamiliar with the state’s tool, despite its having been in use for more than 15 years. My own research shows that a Kentucky law making pretrial risk assessment mandatory led to only a small and short-lived increase in release; judges overruled the recommendations of the risk assessments more often than not.

Algorithmic risk assessments are still just a tiny cog in a large, human system. Even if we were to design the most exceptionally fair algorithm, it would remain nothing more than a tool in the hands of human decision-makers. It is those decision-makers’ beliefs and incentives, as well as the set of options available within the system, that determine its impact.

Much of the injustice in criminal justice stems from things that are big, basic, and blunt. Poor communication between police and prosecutors (and a rubber-stamp probable cause hearing) means that people can be held in jail for weeks or months before someone realizes that there is no case. Data records are atrociously maintained, so people are hounded for fines that they already paid or kept incarcerated for longer than their sentence. Public defenders are overworked, so defendants have access to only minutes of a lawyer’s time, if that. These are issues of funding and management. They are not sexy twenty-first century topics, but they are important, and they will be changed only when there is political will to do so.

Wykstra’s summary of fairness in criminal justice algorithms is intelligent and compassionate. I’m glad so many smart people are thinking about criminal justice issues. But as Angele Christin writes in Logic magazine, “Politics, not technology, is the force responsible for creating a highly punitive criminal justice system. And transforming that system is ultimately a political task, not a technical one.”

Megan Stevenson

Assistant Professor of Law
George Mason University
