The Big Data Mindset
Machine learning and Big Data complement the managerial imperatives of modern industrial society—and risk extending them too.
Implicit in the widely held expectations of a bright technological future is the belief that a certain comprehensive knowledge about the elements of a system will allow scientists to grasp the rules for how that system operates and how to manipulate it. So, for example, we assume that by knowing the full human genome, scientists will necessarily gain insight into the language of the genes and eventually cure debilitating disease. Similarly, it is widely believed that by amassing enough information on past climate activity and atmospheric chemistry, we can confidently make predictions about Earth’s future climate.
Underlying this optimism about knowledge and power is a belief in a hidden order that governs the dynamics of even complex systems and lies waiting to be discovered. We believe that more data, and bigger data, will allow us to grasp the hidden order. This belief has been present in some form in modern science since its earliest days, and has even older roots, but today’s technology is giving it a new spin. Researchers working on “Big Data” projects believe that by exposing a computer to a sufficient number of samples representing different conditions, the computer can learn the supposed rules that govern the system. In “machine learning,” generally considered a subfield of artificial intelligence, a similar approach usually involves training a computer by having it analyze sample data sets so that it can respond to new sets of conditions not encountered in the sample sets.
A few examples of machine learning will illustrate how this is supposed to work. By exposing the sensors and software of a specially prepared vehicle to a sufficiently high number of traffic situations, researchers and engineers expect that the vehicle will be able to respond appropriately to future traffic situations not specifically encountered previously. Medical applications include familiarizing software with a large number of contoured CT images of the human liver, enabling it to discern the liver in images of future patients so that treatments can be better targeted. These examples typify the attitude that, given sufficient data, we will be able to grasp and exploit the regularity behind natural and social systems, even if that regularity is too complex for us to understand on our own without our machines.
Are the assumptions behind machine learning and Big Data correct? Do underlying patterns govern complex natural and social systems, waiting to be discovered? Is there a hidden order to complex phenomena? Do the natural and artificial systems being studied always have enough order for that order to be practically exploited?
Whether or not they have an entirely sound basis, the philosophical assumptions behind machine learning and Big Data resonate with our wider culture. They speak to the deep human longing to find an underlying reality in the world, an order behind the chaos. In pre-modern times, this human desire for order and explanation found some satisfaction in silent gods and other hidden forces. Today, this impulse can be satisfied without resort to the supernatural: The truth can be known, so long as we have enough data and computing power.
Big Data also complements the managerial imperatives of modern industrial society—in both the private and public spheres. It offers a bevy of new possibilities to expand commerce. Businesses value predictability and prefer operations that can be routinized and measured. Good managers seek to regularize systems—to make them more predictable. Machine learning and Big Data offer businesses better predictive abilities to support operations, make internal decisions, and market their products.
In the years ahead, Big Data is also likely to play an increasing role in government. In particular, bureaucrats—whose work relies on data about demographics, economic activity, nature, and the actions of citizens—will welcome these new tools for description and prediction. The ostensible objectivity of machine learning and Big Data adds to their attractiveness to government officials wary of accusations of cultural bias—after all, these techniques promise to provide a quantitative, rational, and egalitarian basis for making decisions. In law enforcement, many jurisdictions have already embraced what has been dubbed “Big Data policing” and some are taking the first steps toward “predictive policing.”
Will machine learning and Big Data live up to their promise? Will enough Big Data, or big enough data, necessarily lead to more effective modeling of complex systems? Will it allow us to predict the future? Even if its predictions are imperfect, Big Data will remain attractive in a managerially oriented society. The danger of expecting too much from Big Data is not that it will fail to deliver. Whether its answers are correct or not, Big Data will deliver answers: self-consistent answers that speak to a logic of its own making as a tool and give us the sense of predictability we long for. The danger is that in yielding to that logic, we will be serving the imperatives of commerce and government, and relinquishing the freedom and responsibility that make us distinctively human. In our desperation for meaning and control in this chaotic world, we run the risk of allowing the standardizing, quantifying, digitizing, and monetizing of every aspect of our culture and our individual souls.