Ground Truths Are Human Constructions

By Florian Jaton

Artificial intelligence algorithms are human-made, cultural constructs, something I saw first-hand as a scholar and technician embedded with AI teams for 30 months. Among the many concrete practices and materials these algorithms need in order to come into existence are sets of numerical values that enable machine learning. These referential repositories are often called “ground truths,” and when computer scientists construct or use these datasets to design new algorithms and attest to their efficiency, the process is called “ground-truthing.”

Understanding how ground-truthing works can reveal inherent limitations of algorithms—how they enable the spread of false information, pass biased judgments, or otherwise erode society’s agency—and this could also catalyze more thoughtful regulation. As long as ground-truthing remains clouded and abstract, society will struggle to prevent algorithms from causing harm and to optimize algorithms for the greater good.

Ground-truth datasets define AI algorithms’ fundamental goal of reliably predicting and generating a specific output—say, an image with requested specifications that resembles other input, such as web-crawled images. In other words, ground-truth datasets are deliberately constructed. As such, they, along with their resultant algorithms, are limited and arbitrary and bear the sociocultural fingerprints of the teams that made them.

Ground-truth datasets fall into at least two subsets: input data (what the algorithm should process) and output targets (what the algorithm should produce). In supervised machine learning, computer scientists start by building new algorithms using one part of the output targets annotated by human labelers, before evaluating their built algorithms on the remaining part. In the unsupervised (or “self-supervised”) machine learning that underpins most generative AI, output targets are used only to evaluate new algorithms.
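The supervised workflow described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration only: the toy dataset, the "low"/"high" labels standing in for human annotations, the 80/20 split, and the function names `split`, `fit_threshold`, and `accuracy`. It is not drawn from any team's actual practice; it only shows one part of a ground truth being used to build an algorithm and the remaining part being used to evaluate it.

```python
import random

# Hypothetical ground truth: each pair couples input data (a number)
# with an output target standing in for a human labeler's annotation.
ground_truth = [(x, "high" if x >= 50 else "low") for x in range(100)]

def split(pairs, train_fraction=0.8, seed=0):
    """Shuffle the labeled pairs and cut them into a building part and an evaluation part."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_threshold(train_pairs):
    """Build the 'algorithm': a threshold midway between the mean input of each label."""
    lows = [x for x, y in train_pairs if y == "low"]
    highs = [x for x, y in train_pairs if y == "high"]
    return (sum(lows) / len(lows) + sum(highs) / len(highs)) / 2

def predict(threshold, x):
    """Apply the built algorithm to a single input."""
    return "high" if x >= threshold else "low"

def accuracy(threshold, pairs):
    """Share of pairs whose prediction matches the labeled output target."""
    hits = sum(predict(threshold, x) == y for x, y in pairs)
    return hits / len(pairs)

train_part, eval_part = split(ground_truth)
threshold = fit_threshold(train_part)    # built from one part of the ground truth
score = accuracy(threshold, eval_part)   # evaluated on the remaining part
print(f"threshold={threshold:.1f}, held-out accuracy={score:.2f}")
```

Even in this toy case, the reported accuracy only measures agreement with the labels someone chose to assign. Change the labeling rule and the same procedure would just as readily certify a different "truth."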

Most production-grade generative AI systems are assemblages of algorithms built from both supervised and self-supervised machine learning. For example, an AI image generator depends on self-supervised diffusion algorithms (which create a new set of data based on a given set) and supervised noise reduction algorithms. In other words, generative AI is thoroughly dependent on ground truths and their socioculturally oriented nature, even if it is often presented—and rightly so—as a significant application of self-supervised learning.

Why does that matter? Much of AI punditry asserts that we live in a post-classification, post-socially constructed world in which computers have free access to “raw data,” which they refine into actionable truth. Yet data are never raw, and consequently actionable truth is never totally objective.

Algorithms do not create so much as retrieve what has already been supplied and defined, albeit repurposed and with varying levels of human intervention. This observation rebuts certain promises around AI and may sound like a disadvantage, but I believe that it could instead be an opportunity for social scientists to begin new collaborations with computer scientists. Such collaboration could take the form of a professional social activity in which people work together to describe the ground-truthing processes that underpin new algorithms, and so help make them more accountable and worthy.
