The Path to Better Health: Give People Their Data

Realizing the promise of AI for people with chronic conditions requires rebuilding our health information system so that patients own their own data.

As any family doctor can tell you, the greatest potential for improving health care isn’t discovering new treatments for disease, it’s getting patients to do the things we already know work—exercising, quitting smoking, eating right, and taking medications regularly. Despite astounding success in highly technical areas such as immunotherapy, genome editing, and surgical robotics, medicine has been unable to solve the most basic challenge of helping patients make good decisions for themselves. This challenge lies at the heart of the nation’s most pervasive health problems: diabetes, obesity, cardiovascular disease, depression, chronic pain, and many cancers.

In the past 10 years, the route to improved health through better individual decisions has become clear. Health information technology companies can now leverage advances in artificial intelligence (AI) and behavioral science to drive measurable improvements in how individuals manage their own health. Combining personal wellness apps, social messaging, health coaches, and peer support, companies such as Omada Health and Livongo are harnessing big data to deliver highly personalized programs that help patients adopt healthy habits.

Using such patient engagement platforms to support real behavioral change could usher in staggering improvements in public health.

In theory.

But achieving this goal would require access to an unprecedented level of information on patients and their lives. And patients rightly are wary of providing such information, absent clear guidelines around how it will be used, and confidence in a system that will safeguard their individual health data. Neither of these basic criteria is met in our current system, where health data are strewn across a myriad of disparate electronic record systems, far out of the reach, and the control, of patients themselves.

Render unto patients

Realizing the promise of AI in health care requires that we wrench our current health information system off its antiquated pillars and rebuild it on a new foundation: ownership of data by patients. By firmly entrenching the principle that patients own their personal health data, we can both address patient privacy concerns and create a functioning market for health data, building an environment that invigorates patient engagement and stimulates patient-focused innovation. The foremost barrier to such a system is the complex network of relationships that currently defines who creates, who keeps, and who owns health data.

A simple visit to a primary care clinic today generates an enormous amount of data. It begins with the complaint itself—the reason the patient is being seen. Then there are vital signs, exam findings, nurse assessments, physician visit notes, diagnosis codes, lab orders, radiology images, and insurance claims. When patients leave the office, however, they leave with none of that data. At best, they leave with login information for a hospital online portal where they later can access lab results or radiology reports. The data remain with the provider, hospital, lab, or insurer.

In many ways, this is a vestige of our old system of care delivery and record keeping. Data previously lived on paper charts in hospitals and doctors’ offices. The transition to electronic medical records (EMRs) offered an opportunity to transfer ownership to patients. A personal electronic health chart—much like an iTunes library—could have been the central organizing theme. From cradle to grave, patients could populate their personal whole-health record, and grant permission to read, write, or edit as needed to health providers. A system of two-factor authentication would guarantee that only parties authorized by the patient could access that record.
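The permission model described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a design for a real health record system: the names `PersonalHealthRecord` and `Permission` are hypothetical, entries are plain strings, and the two-factor authentication step is left out entirely, so a party is identified only by a name.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Permission(Flag):
    """Access rights a patient can hand out (hypothetical names)."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    EDIT = auto()


@dataclass
class PersonalHealthRecord:
    """A patient-owned record; the owner always has full access."""
    owner: str
    entries: list = field(default_factory=list)
    grants: dict = field(default_factory=dict)

    def grant(self, party: str, permission: Permission) -> None:
        # Add the permission to whatever the party already holds.
        current = self.grants.get(party, Permission.NONE)
        self.grants[party] = current | permission

    def revoke(self, party: str) -> None:
        # Remove every permission previously granted to the party.
        self.grants.pop(party, None)

    def _check(self, party: str, needed: Permission) -> None:
        if party == self.owner:
            return
        if needed not in self.grants.get(party, Permission.NONE):
            raise PermissionError(f"{party} lacks {needed.name} access")

    def read(self, party: str) -> list:
        self._check(party, Permission.READ)
        return list(self.entries)  # return a copy, not the live list

    def write(self, party: str, entry: str) -> None:
        self._check(party, Permission.WRITE)
        self.entries.append(entry)

    def edit(self, party: str, index: int, entry: str) -> None:
        self._check(party, Permission.EDIT)
        self.entries[index] = entry


if __name__ == "__main__":
    record = PersonalHealthRecord(owner="patient")
    record.write("patient", "blood pressure 120/80")
    record.grant("family_doctor", Permission.READ)
    print(record.read("family_doctor"))
    try:
        record.write("family_doctor", "new note")
    except PermissionError as err:
        print("refused:", err)
```

The point of the sketch is the default: nothing is readable or writable by anyone but the patient until the patient grants it, and a grant can be withdrawn at any time.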

Instead, the transition to EMRs merely recapitulated the old system, as patient data moved from paper charts to proprietary software systems. Indeed, the transition actually moved data one degree of separation further from the patient, as hospitals and providers paid exorbitant sums to EMR vendors to hold the data in archaic, poorly searchable databases.

As a result, we have entrenched what is essentially a “pull” system, where patients are forced to request their own data—either by signing into clunky, third-party portals, or, even worse, paying for faxed copies of paper records. Empowering patients to take control of their health requires a fundamental switch to a “push” system, where providing data to patients in a usable format becomes the default.

As owners of their own data, patients become greater stakeholders in their own health. Instead of being the target of interventions aimed at them, patients who own their data become self-advocates, better able to articulate their personal preferences and achieve self-determined health goals.

Ownership of health records would enable patients to take advantage of one of the most promising breakthroughs in medicine today: predictive models arising from advances in machine learning, one of the domains of AI. The predictive power of machine learning already is having an enormous impact on both the operational and clinical spheres of health care. Companies such as Qventus, where I work, use machine learning to help hospitals optimize staffing, decrease wait times, improve patient flow, and shorten hospital length of stay. Population health analytics vendors such as Enli and HealthEC employ machine learning to identify patients most likely to visit an emergency department, so that health systems can focus resources on those patients and avoid preventable hospitalizations. Machine learning applications are helping physicians detect disease sooner, diagnose pathology more accurately, and pick therapeutic targets more precisely.

Applying machine learning to clinical diagnosis and treatment decisions, however, has proven to be a more difficult undertaking. One reason is the sheer number of characteristics that can influence a patient’s potential to develop disease or respond to an intervention. For instance, in addition to traditional risk factors such as family history, diet, and exercise, we know of at least 280 genetic variants that have been correlated with a patient’s likelihood of developing hypertension.

On top of this, studies of the determinants of population health increasingly show that there is an abundance of data not typically collected by health care providers that can have a sizeable impact on patient health decisions. For example, Welltok—which describes itself as a “consumer activation company” that “motivates” people “to improve their physical, mental, social and financial wellbeing”—found that household composition and voting history correlated with visits to emergency departments. Collecting and accounting for the countless epidemiologic and lifestyle factors that may influence patient behavior—education level, diet, marital status, car ownership, job schedule, social connectedness, and so on—is a herculean task.

Another challenge lies in the nature of machine learning itself. As the number of features included in a model increases linearly, the number of examples needed to train that model increases exponentially. Where a few hundred examples might suffice to train a simple model with a handful of features (such as a real estate algorithm that predicts the ideal list price for a house based on square footage, location, and number of bedrooms), a complex behavioral model with dozens of features would require many orders of magnitude more.
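A rough way to see why the data requirement balloons is a simple counting argument. The sketch below is illustrative only: it assumes each feature is split into a fixed number of ranges (the function name `cells_to_cover` and the choice of 10 ranges are hypothetical, not drawn from any particular model) and counts how many distinct feature combinations would each need at least one training example.

```python
def cells_to_cover(num_features: int, bins_per_feature: int = 10) -> int:
    """Count the distinct feature combinations when each feature is split
    into `bins_per_feature` ranges (an assumed, simplified discretization)."""
    if num_features < 0 or bins_per_feature < 1:
        raise ValueError("num_features must be >= 0 and bins_per_feature >= 1")
    return bins_per_feature ** num_features


if __name__ == "__main__":
    # Each added feature multiplies the number of combinations by 10.
    for num_features in (3, 6, 12, 24):
        count = cells_to_cover(num_features)
        print(f"{num_features:>2} features -> {count:,} combinations")
```

Real models usually need far fewer examples than this worst case, because they exploit structure in the data. Still, the multiplicative growth shows why a behavioral model with dozens of features demands so much more data than a three-feature pricing model.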

Thus, developing applications that successfully support patients in achieving personal health goals will rely on access not just to that one patient’s health care data, but to medical and personal data from millions. Few health systems have that volume of patient data, and what they do have is incomplete, missing the rich personal device-captured activity and lifestyle data that are essential to creating effective prescriptive models. How, then, can data sets unifying patient and personal information of millions of people be constructed? There are two options. One is for patients to give health systems full access to their personal information. Given the well-publicized data breaches that regularly befall our public and private institutions, including hospitals, it seems unreasonable to expect patients to trust health systems with this degree of access. The second option is for health providers to turn over health data to their rightful owners, patients.

Patient-centered AI

Pulling off such a fundamental paradigm shift may seem farfetched, but the notion of ownership should be intrinsic to our identity as participants in a functioning market. There is currently no good theoretical framework that defines legal ownership of the data in our health care marketplace. As a practical matter, ownership of health care data is now a byproduct of attempts by holders of the data (hospitals, insurers, device manufacturers, and internet behemoths) to monetize the resource.

These careless forays not only represent a legal risk to the parties involved, but are a potential mortal threat to the future of AI in health care. At the end of 2019, the Office for Civil Rights at the US Department of Health and Human Services launched an inquiry into Google and Ascension Health’s oddly named “Project Nightingale.” Without obtaining permission from patients, Ascension provided personally identifiable health information on millions of patients to Google as part of an effort to create a massive cloud-based repository of patient data. Such missteps by large market actors risk triggering public outcry and legislative responses that would lock up patient data, severely limiting access to the resource that machine learning algorithms need to successfully improve health. The strict regulations and burdens placed on hospitals with the enactment of the Health Insurance Portability and Accountability Act (HIPAA) by Congress in 1996 are a potent example of how well-intended efforts to protect privacy can also impair legitimate sharing of information that would otherwise benefit patients.

To avoid such pitfalls, policymakers need to foster the creation of a well-defined market—built on a firm foundation of patient ownership of health care data—that provides a reliable and efficient system for transactions. With undisputed ownership of their data and firm control of their privacy, patients could confidently engage in market transactions. In much the same way that users select privacy settings in Facebook, patients could choose what level of privacy they want for their data, and preferences for how the data could be used or shared.

Such a patient-centered paradigm would be a vast improvement over today’s system, where data are fragmented across innumerable hospital, clinic, lab, and insurance systems. We have festooned that system with patches (health information exchanges and interoperability requirements), but there is still no continuous, unified record of the patient health experience. The well-documented result is waste and medical error, as providers repeat expensive studies and miss diagnoses due to lack of access to siloed results.

From a data quality standpoint, a patient-controlled whole-health record is also a vastly superior market offering. That whole-health record is inherently more complete, and enables the emergence of an industry standard for how patient data are organized and stored. A rule of thumb widely repeated in the data analytics world is that roughly 80% of the time and energy involved in a machine learning project is spent acquiring, preparing, and cleansing data. A well-defined market and a standardized patient record greatly reduce both the costs and risks associated with data acquisition, and the work of data engineering.

Making patients rather than health systems the source of data has the added benefit of focusing innovation on meeting the demands and preferences of patients, rather than on serving the objectives of companies that now hold the data. Information technology companies will have to compete to prove their value to patients, in exchange for access to data. And as consumer advertisers and social messaging applications increasingly leverage machine learning and behavioral economics to their own advantage, persistent innovation will be a market imperative for personal health platforms competing for the limited commodity of user attention.

Your smartphone will see you now

To imagine the potential power of such a model, consider a patient with new onset depression in today’s health care system. That patient may or may not realize the significance of the symptoms or identify them with depression. However, after several months—and perhaps some prodding from a loved one—the patient may decide to talk to their doctor (though as many as two-thirds of patients with depression do not seek care). At the appointment, their provider will likely use a validated questionnaire to assess for depression, and, if the result is positive, suggest counseling or prescribe a medication. The patient may fill the prescription (though fewer than 90% do) and may take the medication regularly (though fewer than half will be doing this at six months).

Now imagine a patient in control of their own data, equipped with a personal wellness application focused on maximizing their individual health. The app may recognize the signs of depression even before the patient does. Sleep quality, the tone of texts and emails, social interactions with friends, activity level—all these are potent early cues to the mental health of a patient.

Recognizing these cues, their wellness app may prompt the patient with more focused questions that reflect the state of clinical understanding for identifying depression. Based on the answers, it may make some suggestions about increasing physical activity or spending time with friends. Such early interventions may actually help avoid the need for pharmaceuticals. If the patient does not respond to these interventions, the app could offer to make an appointment with the patient’s provider.

Once in the office, the patient could elect to share a summary of their personal data with their primary care doctor to help them understand the scope and severity of their symptoms: sleep duration down 12%, social activity decreased 40%, and so on. Doctors (who spend just 16 minutes on average with each patient) would find these objective data extremely useful in arriving at a diagnosis and treatment course. If a medication is prescribed, the personal health application could both help prompt medication adherence (up to 50% of unsuccessful treatment is due to noncompliance) and monitor for side effects. And, at their follow-up appointment in four weeks, the patient could share objective evidence of treatment response, giving the doctor reliable data to help further tailor treatment.

Now scale this application across millions of patients. Imagine the power of a personal wellness app that is able to analyze the data of a population, identify patterns of depression, and home in on successful treatment approaches matched to those patterns. Drawing from what is, in effect, a massive ongoing clinical trial, the app could deliver ever-improving clinical insights, even as it continually customizes care to meet the needs of the individual patient.

In much the same way that Google Maps helps you find the best route to work, your personal wellness app will help you find the best route to health, aggregating and learning from insights from millions of other patients, and customizing an approach based on your personal goals, habits, and preferences. Ownership of data by patients is the enabling structure that will support this next stage in patient engagement and create the data-rich environment where machine learning and innovation will thrive.

Prying control of patient data away from hospitals, insurers, and EMR vendors will not be easy, but it is an essential task that patient advocates, health information technology start-ups, free market thinkers, and champions of public health should all agree on. By rebuilding our health information system on the foundational principle of patient ownership of data, we can achieve the promise of AI in health care, giving patients the power to take control of their own health and equipping them with the tools to end the burden of preventable disease.

Cite this Article

Cohen, Jason. “The Path to Better Health: Give People Their Data.” Issues in Science and Technology 37, no. 2 (Winter 2021).

Vol. XXXVII, No. 2, Winter 2021