Ethics statement

This study was approved, and the requirement for informed consent was waived, by the Beth Israel Deaconess Medical Center Institutional Review Board (protocol number: 2008-P-000412).

Temperature monitoring

Temperature monitoring using the model TAT-5000 Exergen temporal artery thermometer (Exergen Corp., Watertown, MA) was initiated in the triage area of an urban ED as part of initial triage vital signs. With this model, temperature is measured by sliding the infrared thermometer across the forehead, a low-contact method that is expected to reduce the potential for disease transmission from the patient. Thermometers connected to data-logging modules replaced prior methods of measuring temperature at triage, such as oral and tympanic measurements. Included temperature data were collected between September 10, 2009 and August 29, 2011. Two to four data-logging thermometers were generally in use during this period (daily mean: 2.97; standard deviation [SD]: 1.08; range: 1–5), with exceptions for general maintenance. Thermometers were checked, time-stamped measurements were collected, and maintenance was performed on a roughly biweekly basis. Three thermometers were located at triage stations and one was located on a rolling unit. The data-logging thermometers were used as a surrogate to investigate the capabilities of real-time data reporting with networked wireless thermometers.

Inclusion criteria and temperature preprocessing

The study population comprised persons presenting at an ED (Boston, MA) who underwent an initial triage vital signs assessment performed with a data-logging thermometer. All such persons were included, regardless of time of day, age, gender, or chief complaint. General characteristics of presentations at the ED are summarized in Table 1. Temperature collection was not linked to other hospital records, thereby providing a conservative assessment of the value of body temperature data alone.

Table 1 General characteristics of cases presenting at the emergency department, September 2009 through August 2011

In traditional clinical studies, inclusion criteria are used to isolate the study from factors that could reduce precision or introduce bias. However, in automated surveillance, even the smallest requirement for additional human intervention in the data collection process represents a substantial barrier to scaling and practical implementation. Accordingly, the data-logging thermometers were designed to operate exactly as standard TAT-5000 models during measurement, without requiring any additional steps. Every temperature was recorded and considered to meet the inclusion criteria, including properly measured temperatures, improperly measured temperatures, repeated measurements of the same patient, and even accidental measurements (for example, of the ED floor). Rather than coming from an idealized setting supplemented with extensive patient information, the temperatures are therefore representative of the data that would actually be available under a real-time surveillance program.

During the study period, 89,856 temperature recordings were collected. Before our statistical analyses, we used two approaches to filter out temperature measurements that were unlikely to be individual body temperatures. First, temperature recordings below 95.0 °F (35.0 °C) were removed (11.6 %, n = 10,466) because they are predominantly mis-measurements. The rare human temperatures <95.0 °F (<35.0 °C) constitute hypothermia [13], and are not relevant to syndromic surveillance of febrile disease. Second, we removed all but the last of any sequence of temperatures logged <15 s apart (15.8 %, n = 14,206). Temperatures taken this quickly could only be repeated measurements of the same patient, and the last measurement would most likely be the one accepted by the clinician; therefore, only the last temperature was retained for our analysis. After removing these measurements, 71,865 temperatures (80.0 %) remained for analysis. Note that this filtering does not affect the external generalizability of our results because the same filters could be applied in real time to temperature surveillance deployed in any ED.
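The two filtering rules above can be sketched as follows. This is a minimal illustration in Python, not the study's actual processing pipeline; the tuple layout of the logged readings is an assumption.

```python
from datetime import datetime, timedelta

def filter_temperatures(readings):
    """Apply the two preprocessing filters to time-ordered
    (timestamp, temp_f) tuples from a single thermometer."""
    # Filter 1: drop readings below 95.0 F (predominantly mis-measurements;
    # true temperatures this low indicate hypothermia, not febrile disease).
    readings = [(t, f) for (t, f) in readings if f >= 95.0]

    # Filter 2: of any run of readings logged < 15 s apart, keep only the
    # last one (earlier readings are assumed to be repeats on the same
    # patient; the last is the value most likely accepted by the clinician).
    kept = []
    for i, (t, f) in enumerate(readings):
        is_last = (i + 1 == len(readings))
        if is_last or readings[i + 1][0] - t >= timedelta(seconds=15):
            kept.append((t, f))
    return kept
```

For example, three readings logged 5 s apart followed by an accidental sub-95 °F reading would reduce to the single last valid reading of the rapid sequence.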

Definition of fever

Following convention, common fever was defined as a body temperature ≥100.4 °F (≥38.0 °C). Hyperpyrexia (sometimes known as extreme hyperpyrexia) was defined as a body temperature ≥106.0 °F (≥41.1 °C) [14]. All temperatures were measured in degrees Fahrenheit. Celsius values appearing in this paper are rounded conversions of the Fahrenheit measurements.

Disease outbreaks

We set out to compare the fever data with outbreaks of febrile disease occurring during the study period. Although a variety of febrile diseases are monitored internationally, weekly surveillance data were not readily available in Massachusetts for febrile diseases other than influenza. We therefore compared the prevalence of fever during periods of influenza activity with non-influenza periods. The collected data included two outbreaks of influenza: the autumn–winter wave of the H1N1 pandemic and a seasonal flu outbreak.

Comparison with existing syndromic surveillance

Fever data from the ED were compared with existing syndromic surveillance for influenza. (If substantial outbreaks of other febrile diseases had occurred in Boston during the study period, comparisons would also have been made with any available surveillance data for these diseases.) The CDC reports data on the percent of outpatient visits for ILI at New England clinics participating in the Influenza-like Illness Surveillance Network (ILINet), including thresholds for periods of elevated influenza activity. Based on these thresholds, September 14, 2009–December 6, 2009 was the period of the autumn–winter wave of the 2009–2010 H1N1 pandemic in New England, and January 24, 2011–March 13, 2011 was the period of the 2010–2011 seasonal flu outbreak in New England [4]. We compared fever prevalence during these periods and during periods when influenza activity did not exceed the regional threshold (termed non-influenza periods for brevity).

In addition, fever data from the ED were compared qualitatively with weekly influenza reports from the Massachusetts Department of Public Health (MDPH) [15–17]. The MDPH reports provided data from the Automated Epidemiologic Geotemporal Integrated Surveillance System (AEGIS), including the percentage of total visits to EDs at 19 Massachusetts hospitals that were due to flu-like symptoms. The MDPH reports also provided weekly rates of ILI based on data from 46 (in 2009–2010) and 45 (in 2010–2011) hospitals, private physicians’ offices, and school health centers across Massachusetts [18]. Further, the MDPH reports provided weekly counts of influenza cases that were confirmed by laboratory testing (cultures and rapid tests) at the William A. Hinton State Laboratory Institute, providers’ offices, and laboratories across Massachusetts. Notably, most cases of suspected influenza are not tested, so the count of laboratory-confirmed cases is much lower than the actual number of influenza cases in Massachusetts. Healthcare providers also sometimes neglect to submit influenza reports to state surveillance systems: 7 and 3 of the Massachusetts ILI reporting sources submitted reports for <16 weeks of the traditional flu reporting seasons in 2009–2010 and 2010–2011, respectively [18], and the AEGIS data were likely affected by missed reports from some EDs as well.

Detection of aberrant fever rates

Originally, a prospective validation analysis was designed to estimate the requirements for successful detection of increases in the underlying fever rate. However, after several months of data collection, it became apparent that the observed fever rates were substantially higher than those assumed when designing the prospective analysis, and also changed less rapidly. The prospective analysis was therefore abandoned, and attention shifted to comparing the fever rates observed in the ED data with the CDC and Massachusetts-level data sources discussed above.

In addition, an outbreak detection algorithm was applied to the data as a supplemental investigation (Additional file 1: Aberrant Event Detection Analysis).

Statistical analysis

Smoothing was necessary to make the fever rates visually interpretable. Smoothed estimates of fever rates were obtained via two methods. First, we computed simple means of the fever rates over the preceding week. Second, we applied exponential smoothing to the data. In practice, a variety of exponential smoothing methods exist, such as simple exponential smoothing and Holt’s linear method. We used a state space approach [19], as implemented in the R package forecast [20], to automatically select the exponential smoothing method and parameters yielding the lowest Akaike information criterion (AIC) value for the fever data. In addition to the smoothing, we performed a simple analysis of the seasonality of fever rates (Additional file 2: Seasonality). Although the results were not conclusive, we found no evidence of seasonality strong enough to warrant consideration in the analysis of fever rates.
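The two smoothing ideas can be sketched as follows. This is a hedged illustration in Python: the study used the R forecast package's state-space `ets()` machinery, which selects the smoothing method and parameters by minimizing AIC, whereas the fixed `alpha` below is an assumed illustrative value, and the AIC-based selection step is not reproduced.

```python
def trailing_mean(rates, window=7):
    """Trailing mean of the most recent `window` daily fever rates
    (the first smoothing method: simple means over the past week)."""
    out = []
    for i in range(len(rates)):
        chunk = rates[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def simple_exponential_smoothing(rates, alpha=0.3):
    """Simple exponential smoothing with a fixed, illustrative alpha.
    (The study instead chose the method and parameters automatically
    by AIC via the state space framework in R's forecast package.)"""
    smoothed = [rates[0]]
    for x in rates[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed
```

Larger values of `alpha` track recent observations more closely; smaller values produce a smoother, slower-moving estimate.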

Proportions were compared using the Chi-squared test, and continuous variables were compared using the Mann–Whitney U test. Values of p < 0.05 were considered statistically significant and all tests were two-sided. The statistical analysis was performed in R (version 3.2.1; R Foundation for Statistical Computing, Vienna, Austria).
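As an illustration of the proportion comparison, a Pearson chi-squared test on a 2 × 2 table (e.g., fever vs. no fever in influenza vs. non-influenza periods) can be computed in pure Python as below. This is a sketch, not the study's code (the study used R); in Python practice one would typically call `scipy.stats.chi2_contingency` and `scipy.stats.mannwhitneyu` instead.

```python
import math

def chi2_two_proportions(fevers_a, n_a, fevers_b, n_b):
    """Pearson chi-squared test (1 df, no continuity correction) comparing
    fever proportions between two periods. Returns (statistic, p_value)."""
    # 2 x 2 table: rows are periods, columns are fever / no fever.
    table = [[fevers_a, n_a - fevers_a], [fevers_b, n_b - fevers_b]]
    total = n_a + n_b
    row = [n_a, n_b]
    col = [table[0][j] + table[1][j] for j in range(2)]
    # Sum of (observed - expected)^2 / expected over the four cells.
    stat = sum((table[i][j] - row[i] * col[j] / total) ** 2 /
               (row[i] * col[j] / total)
               for i in range(2) for j in range(2))
    # Survival function of chi-squared with 1 df: P(X >= stat) = erfc(sqrt(stat/2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p
```

Identical fever proportions yield a statistic of 0 and p = 1; strongly diverging proportions yield a large statistic and a small p-value.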