Syndromic Surveillance

Population Health: The Big Picture

MICHAEL A. STOTO

Public health officials have been quick to adopt this new tool for identifying emerging problems, but research is needed to assess its effectiveness.

Heightened awareness of the risks of bioterrorism since 9/11, coupled with a growing concern about naturally emerging and reemerging diseases such as West Nile, severe acute respiratory syndrome (SARS), and pandemic influenza, has led public health policymakers to realize the need for early warning systems. The sooner health officials know about an attack or a natural disease outbreak, the sooner they can treat those who have already been exposed to the pathogen to minimize the health consequences, vaccinate some or all of the population to prevent further infection, and identify and isolate cases to prevent further transmission. Early warning systems are especially important for bioterrorism because, unlike other forms of terrorism, it may not be clear that an attack has taken place until people start becoming ill. Moreover, if terrorism is the cause, early detection might also help to identify the perpetrators.

“Syndromic surveillance” is a new public health tool intended to fill this need. The inspiration comes from a gastrointestinal disease outbreak in Milwaukee in 1993 involving over 400,000 people that was eventually traced to the intestinal parasite Cryptosporidium in the water supply. After the fact, it was discovered that sales of over-the-counter (OTC) antidiarrhea medications had increased more than threefold weeks before health officials knew about the outbreak. If OTC sales had been monitored, the logic goes, thousands of infections might have been prevented.

The theory of syndromic surveillance, illustrated in Figure 1, is that during an attack or a disease outbreak, people will first develop symptoms, then stay home from work or school, attempt to self-treat with OTC products, and eventually see a physician with nonspecific symptoms days before they are formally diagnosed and reported to the health department. To identify such behaviors, syndromic surveillance systems regularly monitor existing data for sudden changes or anomalies that might signal a disease outbreak. Syndromic surveillance systems have been developed to include data on school and work absenteeism, sales of OTC products, calls to nurse hotlines, and counts of hospital emergency room (ER) admissions or reports from primary physicians of certain symptoms or complaints. Current systems typically include large amounts of data and employ sophisticated information technology and statistical methods to gather, process, and display the information for decisionmakers in a timely way.

This theory was turned into a reality when some health departments, most notably New York City’s, began to monitor hospital ER admissions and other data streams. In 2001, the Defense Advanced Research Projects Agency funded four groups of academic and industrial scientists to develop the method. After 9/11, interest and activity in the method increased dramatically. The Centers for Disease Control and Prevention’s (CDC’s) BioSense project operates nationally and is slated for a major increase in resources. In addition, CDC’s multibillion-dollar investment in public health preparedness since 9/11 has encouraged and facilitated the development of syndromic surveillance systems throughout the country at the state and local levels. The ability to purchase turnkey surveillance systems from commercial or academic developers, plus the personnel ceilings and freezes in some states that have made it difficult for health departments to hire new staff, have also made investments in syndromic surveillance systems an attractive alternative. As a result, nearly all states and large cities are at least planning a syndromic surveillance system, and many are already operational.

In the short time since the idea was conceived, there have been remarkable developments in methods and tools used for syndromic surveillance. Researchers have capitalized on modern information technology, connectivity, and the increasingly computerized medical and administrative databases to develop tools that integrate vast amounts of disparate data, perform complex statistical analyses in real time, and display the results in thoughtful decision-support systems. The focus of these efforts is on identifying reliable and quickly collected data that are generated early in the disease process. Statisticians and computer scientists have adapted ideas from the statistical process control methods used in manufacturing, Bayesian belief networks, statistical pattern-recognition algorithms, and many other areas. Syndromic surveillance has also become an extraordinarily active research area. Since 2002, an annual national conference (www.syndromic.org) has drawn hundreds of researchers and practitioners from around the country and the world.
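
As one illustration of the statistical process control ideas mentioned above, the following is a minimal sketch of a one-sided CUSUM (cumulative sum) detector applied to daily counts. It is not drawn from any particular surveillance system: the function name `cusum_alarms`, the parameters `baseline`, `slack`, and `threshold`, and the daily counts are all hypothetical and chosen only for illustration.

```python
def cusum_alarms(counts, baseline, slack, threshold):
    """Return the day indices on which a one-sided upper CUSUM exceeds `threshold`.

    baseline:  expected daily count when there is no outbreak (assumed known)
    slack:     allowance subtracted each day so ordinary noise does not accumulate
    threshold: cumulative excess that triggers an alarm
    """
    alarms = []
    cumulative = 0.0
    for day, count in enumerate(counts):
        # Accumulate only the excess above baseline + slack; never go below zero.
        cumulative = max(0.0, cumulative + (count - baseline - slack))
        if cumulative > threshold:
            alarms.append(day)
            cumulative = 0.0  # restart the sum after signaling
    return alarms


# Made-up daily ER visit counts: flat for five days, then a gradual rise.
counts = [50, 48, 53, 49, 51, 58, 61, 64, 66]
alarms = cusum_alarms(counts, baseline=50, slack=3, threshold=15)
print("Alarm on day index:", alarms)
```

Raising `slack` or `threshold` would make false alarms rarer but would also delay or suppress the signal, which is the tradeoff discussed later in this article.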

Many city and state public health agencies have begun spending substantial sums to develop and implement these surveillance systems. Despite (or maybe because of) the enthusiasm for syndromic surveillance, however, there have been few serious attempts to see whether this tool lives up to its promise, and some analysts and health officials have been skeptical about its ability to perform effectively as an early warning system. The balance between true and false alarms, and how syndromic surveillance can be integrated into public health practice in a way that truly leads to effective preventive actions, must be carefully assessed.

Practical concerns

Syndromic surveillance systems are intended to raise an alarm, which then must be followed up by epidemiologic investigation and preventive action, and all alarm systems have intrinsic statistical tradeoffs. The best known is that between sensitivity (the ability to detect an attack when it occurs) and the false-positive rate (the probability of sounding an alarm when there in fact is no attack). For instance, thousands of syndromic surveillance systems soon will be running simultaneously in cities and counties throughout the United States. Each might analyze data from 10 or more data series (symptom categories, separate hospitals, OTC sales, and so on). Imagine if every county in the United States had in place a single syndromic surveillance system with a 0.1 percent false-positive rate; that is, the alarm goes off inappropriately only once in a thousand days. Because there are about 3,000 counties in the United States, on average three counties a day would have a false-positive alarm. The costs of excessive false alarms are both monetary, in terms of resources needed to respond to phantom events, and operational, because too many false events desensitize responders to real events.
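
The arithmetic behind the three-counties-a-day figure can be sketched in a few lines. The county count and the 0.1 percent rate come from the text; the second calculation goes a step further and assumes, purely for illustration, that a single jurisdiction monitors 10 independent data series and that each one has that same daily false-positive rate.

```python
COUNTIES = 3000              # approximate number of U.S. counties (from the text)
FALSE_POSITIVE_RATE = 0.001  # 0.1 percent per system per day (from the text)
SERIES_PER_SYSTEM = 10       # "10 or more data series" (from the text)


def expected_false_alarms(systems: int, rate: float) -> float:
    """Expected number of false alarms per day across all systems."""
    return systems * rate


def prob_any_false_alarm(series: int, rate: float) -> float:
    """Chance that at least one of `series` data series alarms falsely on a given day.

    Assumes the series are statistically independent, which real data
    streams usually are not; treat the result as a rough guide only.
    """
    return 1.0 - (1.0 - rate) ** series


daily = expected_false_alarms(COUNTIES, FALSE_POSITIVE_RATE)
per_system = prob_any_false_alarm(SERIES_PER_SYSTEM, FALSE_POSITIVE_RATE)
print(f"Expected false alarms per day nationwide: {daily:.1f}")
print(f"Daily chance of a false alarm in one 10-series system: {per_system:.4f}")
```

Under the independence assumption, monitoring 10 series multiplies a single jurisdiction's daily false-alarm chance roughly tenfold, to about 1 percent.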

Syndromic surveillance adds a third dimension to this tradeoff: timeliness. The false-positive rate can typically be reduced, but only by decreasing sensitivity or timeliness or both. Analyzing a week’s rather than a day’s data, for instance, would help improve the tradeoff between sensitivity and false positives, but waiting a week to gather the data would reduce the timeliness of an alarm.
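
A rough calculation shows why a longer window helps. The sketch below uses a normal approximation to count data and holds the false-positive rate per decision fixed at 0.1 percent. The baseline of 50 cases a day and the excess of 5 cases a day are illustrative assumptions, and the function `sensitivity` is a hypothetical helper, not part of any surveillance package.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sensitivity(window_days, baseline_per_day, excess_per_day, false_positive_rate):
    """Approximate chance of an alarm when counts are summed over `window_days`.

    Counts are treated as roughly normal with variance equal to the mean
    (a Poisson-like assumption). The alarm threshold is set so that the
    false-positive rate per decision equals `false_positive_rate`.
    """
    expected = window_days * baseline_per_day
    z = STD_NORMAL.inv_cdf(1.0 - false_positive_rate)
    threshold = expected + z * expected ** 0.5

    outbreak_mean = window_days * (baseline_per_day + excess_per_day)
    outbreak = NormalDist(outbreak_mean, outbreak_mean ** 0.5)
    return 1.0 - outbreak.cdf(threshold)


one_day = sensitivity(1, 50, 5, 0.001)
one_week = sensitivity(7, 50, 5, 0.001)
print(f"Sensitivity with one day of data:  {one_day:.3f}")
print(f"Sensitivity with one week of data: {one_week:.3f}")
```

With these assumed numbers, pooling a week of data raises the chance of detection several times over at the same false-positive rate, but the alarm cannot sound until the week has passed.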

Beyond purely statistical issues, the value of syndromic surveillance depends on how well it is integrated into public health systems. The detection of a sudden increase in cases of flulike illness—the kind of thing that syndromic surveillance can detect—can mean many things. It could be a bioterrorist attack but is more likely a natural occurrence, perhaps even the beginning of annual flu season. An increase in sales of flu medication might simply mean that pharmacies are having a sale. A surge in absenteeism could reflect natural causes or even a period of particularly pleasant spring weather.

Although the possibility of earlier detection and more rapid response to a bioterrorist event has tremendous intuitive appeal, its success depends on local health departments’ ability to respond effectively. When a syndromic surveillance system sounds an alarm, health departments typically wait a day or two to see if the number of cases continues to remain high or if a similar signal is found in other data sources. Doing so, of course, reduces both the timeliness and sensitivity of the original system. If the health department decides that an epidemiological investigation is warranted, it may begin by identifying those who are ill and talking to their physicians. If this does not resolve the matter, additional tests must be ordered and clinical specimens gathered for laboratory analysis. Health departments might choose to initiate active surveillance by contacting physicians to see if they have seen similar cases.

A syndromic surveillance system that says only “there have been 5 excess cases of flulike illness at hospital X” is not much use unless the 5 cases can be identified and reported to health officials. If there are 55 rather than the 50 cases expected, syndromic surveillance systems cannot say which 5 are the “excess” ones, and all 55 must be investigated. Finally, health departments cannot act simply on the basis of a suspicion. Even when the cause and route of exposure are known, the available control strategies—quarantine of suspected cases, mass vaccination, and so on—are expensive and controversial, and often their efficacy is unknown. Coupled with the confusion that is likely during a terrorist attack or even a natural disease outbreak, making decisions could take days or weeks.
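
To put the 55-versus-50 example in perspective, one can ask how often a count of 55 or more would arise by chance alone. The sketch below assumes that daily counts follow a Poisson distribution with a mean of 50, which is a modeling assumption and not something the example itself specifies; `poisson_tail` is a hypothetical helper name.

```python
import math


def poisson_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected), computed by direct summation."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


p = poisson_tail(55, 50.0)
print(f"Chance of 55 or more cases when 50 are expected: {p:.2f}")
```

Under this assumption, a count of 55 or more would occur on roughly one day in four even with no outbreak, so 5 excess cases alone would be a weak signal.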

Research questions and answers

Much of the current research on syndromic surveillance focuses on developing new methods and demonstrating how they work. Although impressive, this kind of research stops short of evaluating the methods from a theoretical or practical point of view. Comparing the promise of syndromic surveillance with practical concerns about its implementation leads to two broad research questions.

First, does syndromic surveillance really work as advertised? This includes questions about tradeoffs among sensitivity, false-positive rates, and timeliness, as well as more practical concerns about what happens after the alarm goes off. Somewhat more positively, one can also ask how well syndromic surveillance works in detecting bioterrorism and natural disease outbreaks, and how this performance depends on the characteristics of the outbreak or attack. The performance likely depends on variables such as the pathogen causing the problem.

The second question, and the focus of most current research, is about how the performance of syndromic surveillance systems can be improved. This includes gaining access to more, different, and timelier data, as well as identifying data streams with a high signal-to-noise ratio. Researchers are developing sophisticated statistical detection algorithms to elicit more from existing data and more accurate models that describe patterns in the data when there are no outbreaks, as well as detection algorithms that focus on particular kinds of patterns, such as geographical clusters, in the data. Other areas of exploration include methods for integrating data from a variety of sources and displaying it for decisionmakers in a way that enables and effectively guides the public health response.

In response to these two broad questions, one line of research focuses on the quality and timeliness of the data used in syndromic surveillance systems. When patients are admitted to the emergency room, for instance, their diagnoses are not immediately known. How accurately, researchers can ask, does the chief complaint at admission map to diseases of concern? Are there more delays or incomplete reporting in data stream A than in B? Although such studies might help decisionmakers decide which data to include in syndromic surveillance systems, “good” data neither guarantee nor are necessarily required for timely detection. The most error-free and timely data will be useless if the responsible pathogen causes different symptoms than are represented in the data. On the other hand, a sudden increase in nonspecific symptoms might indicate something worth further investigation.

A second line of research considers the epidemiologic characteristics of the pathogens that terrorists might use. Figure 2, for instance, illustrates the difference between an attack in which many people are exposed at the same time and one in which the contagious agent might cause large numbers of cases in multiple generations. Example A illustrates what might be found if 90 people were exposed to a noncontagious agent such as anthrax, and symptoms first appeared an average of 8 days after exposure. Example B illustrates the impact of a smaller number of people (24) exposed to a contagious agent such as smallpox with an average incubation period of 10 days. Two waves of cases appear, the second three times larger and 10 days after the first. The challenge, and the promise, of syndromic surveillance is to detect the outbreak and intervene by day 2 or 3. But the public health benefits of an early warning depend on the pathogen. In example A, everyone would already have been exposed by the time that the attack was detected; the benefits would depend on the ability of health officials to quickly identify and treat those exposed and on the effect of such treatment. On the other hand, if the agent were contagious, as in example B, intervention even at day 10 could prevent some or all of the second generation of cases.
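
The two scenarios can be mimicked with a small simulation. The case counts (90, 24, and a second wave three times the first) and the average incubation periods (8 and 10 days) come from the examples above. Everything else is an assumption made for illustration: the gamma-shaped incubation period with a 2-day standard deviation, the rule that each second-generation case is infected on the onset day of a randomly chosen first-generation case, and the helper names `onset_days` and `daily_counts`.

```python
import random
from collections import Counter


def onset_days(n_cases, mean_incubation, rng, exposure_day=0.0, sd=2.0):
    """Draw symptom-onset days for cases exposed on `exposure_day`.

    Incubation periods are gamma distributed with the given mean and
    standard deviation; the distribution and the 2-day spread are assumptions.
    """
    shape = (mean_incubation / sd) ** 2
    scale = sd ** 2 / mean_incubation
    return [exposure_day + rng.gammavariate(shape, scale) for _ in range(n_cases)]


def daily_counts(onsets):
    """Tally cases by whole day of symptom onset."""
    return Counter(int(day) for day in onsets)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Example A: 90 people exposed at once to a noncontagious agent.
example_a = onset_days(90, 8, rng)

# Example B: 24 people exposed to a contagious agent, then a second
# generation three times as large, infected by first-generation cases.
first_wave = onset_days(24, 10, rng)
second_wave = [
    onset
    for source_day in rng.choices(first_wave, k=72)
    for onset in onset_days(1, 10, rng, exposure_day=source_day)
]
example_b = first_wave + second_wave

for label, onsets in (("A", example_a), ("B", example_b)):
    counts = daily_counts(onsets)
    print(label, [counts.get(day, 0) for day in range(max(counts) + 1)])
```

Printing the daily tallies gives a single hump for example A and two humps roughly 10 days apart for example B, which is the pattern the text describes.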