Abstract Background Studies have examined whether there is a relationship between drinking water turbidity and gastrointestinal (GI) illness indicators, and results have varied possibly due to differences in methods and study settings. Objectives As part of a water security improvement project we conducted a retrospective analysis of the relationship between drinking water turbidity and GI illness in New York City (NYC) based on emergency department chief complaint syndromic data that are available in near-real-time. Methods We used a Poisson time-series model to estimate the relationship of turbidity measured at distribution system and source water sites to diarrhea emergency department (ED) visits in NYC during 2002-2009. The analysis assessed age groups and was stratified by season and adjusted for sub-seasonal temporal trends, year-to-year variation, ambient temperature, day-of-week, and holidays. Results Seasonal variation unrelated to turbidity dominated (~90% deviance) the variation of daily diarrhea ED visits, with an additional 0.4% deviance explained with turbidity. Small yet significant multi-day lagged associations were found between NYC turbidity and diarrhea ED visits in the spring only, with approximately 5% excess risk per inter-quartile-range of NYC turbidity peaking at a 6 day lag. This association was strongest among those aged 0-4 years and was explained by the variation in source water turbidity. Conclusions Integrated analysis of turbidity and syndromic surveillance data, as part of overall drinking water surveillance, may be useful for enhanced situational awareness of possible risk factors that can contribute to GI illness. Elucidating the causes of turbidity-GI illness associations including seasonal and regional variations would be necessary to further inform surveillance needs.

Citation: Hsieh JL, Nguyen TQ, Matte T, Ito K (2015) Drinking Water Turbidity and Emergency Department Visits for Gastrointestinal Illness in New York City, 2002-2009. PLoS ONE 10(4): e0125071. https://doi.org/10.1371/journal.pone.0125071 Academic Editor: Jen-Hsiang Chuang, Centers for Disease Control, TAIWAN Received: October 17, 2014; Accepted: March 19, 2015; Published: April 28, 2015 Copyright: © 2015 Hsieh et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited Data Availability: NYC Syndromic Surveillance data is available through the NYC EpiQuery database which can be accessed at: https://a816-healthpsi.nyc.gov/epiquery/Syndromic/index.html. Water quality data were collected by the NYC Department of Environmental Protection: http://www.nyc.gov/html/dep/html/drinking_water/wsstate.shtml. For additional information contact the DEP at: http://www.nyc.gov/html/dep/html/contact_us/index.shtml. Funding: This work was supported by the Environmental Protection Agency Water Security Initiative grant #H1-83380501 (http://water.epa.gov/infrastructure/watersecurity/lawsregs/initiative.cfm). The EPA approved funding for the NYC Water Security Initiative, including funds to conduct a study of water quality and health outcomes. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist.

Introduction Many studies have examined the relationship between turbidity as an indicator of drinking water quality and measures of endemic gastrointestinal (GI) illness. The methods, quality, and locations of the studies have varied, and a 2007 review showed mixed results even for studies meeting standardized quality criteria [1]. Differences in analytical methods, regional water quality, drinking water exposure, and case definitions for GI illness, among other factors, may influence these results. Analyses of turbidity and healthcare visits for GI illness conducted in drinking water systems in Philadelphia, Atlanta, and Vancouver have shown small positive associations between turbidity and endemic GI illness [2–5]. Another, conducted in Edmonton, Canada, found no association [6].
Turbidity is a standard drinking water quality indicator that is related to the amount and physical characteristics of suspended particles but does not indicate the type or source of particles. It is quickly and easily measured at low cost and can be a useful early indicator of water quality changes. Turbidity can be associated with increased runoff entering a system and microbial loading, as can occur following a precipitation event [7, 8], and particles contributing to turbidity can reduce the efficacy of chlorine in inactivating microbes [9]. The components contributing to turbidity can vary between watersheds, seasons, and years. Turbidity can increase because of changes in source water, such as precipitation events or wind-driven mixing, and because of changes within the drinking water distribution system, such as low pressure events. Increases in turbidity do not necessarily indicate a health risk, as different sources and particle types contribute to turbidity and not all turbidity increases are associated with contamination.
Increased turbidity has been associated with previous waterborne outbreaks, including the Cryptosporidium outbreak in Milwaukee, Wisconsin and an E. coli outbreak in Walkerton, Ontario [10, 11], and increased turbidity was associated with emergency department (ED) visits for GI illness even before the large outbreak in Milwaukee [12]. Such waterborne outbreaks are rare, but evaluating whether there is an association of increased turbidity with GI illness may help enhance early detection of water quality issues.
This study was conducted as part of a larger project focused on enhancing systems for the rapid detection of water contamination events using currently collected water quality and health data. Available water quality and health outcome data were reviewed for potential utility in a rapid detection system. Turbidity and ED GI illness visits were selected based on data availability, quality, timeliness, and support in the literature for use in this type of surveillance. ED data for patients presenting with a clinical syndrome consistent with GI illness (“syndromic data”) are more rapidly collected and reported to the New York City Department of Health and Mental Hygiene (NYC DOHMH) than traditional surveillance data such as positive clinical laboratory test results. ED syndromic data have been used to aid traditional surveillance. In 2003, ED syndromic analysis helped rapidly identify an increase in GI illness that was not detected by laboratory surveillance following a citywide power outage [13]. Interpretation of this signal was aided by supporting information, and this analysis highlighted the utility of syndromic analysis in the context of overall public health surveillance.
There have been no known outbreaks of GI illness related to New York City’s (NYC) current drinking water system. Nevertheless, surface water systems can be vulnerable to increased turbidity during extreme rain events, to disturbances to the distribution system such as low pressure events, and potentially to accidental or intentional contamination [7, 14, 15].
NYC has put in place a variety of systems for detection of such incidents including monitoring of water system and water quality parameters and certain public health parameters including ED GI illness. For this study, we conducted a retrospective time-series analysis of the relationship between drinking water turbidity and ED diarrhea visits in NYC during 2002–2009, to improve our understanding of how monitoring these data streams together may help enhance our systems for rapid detection of some types of water contamination events.

Materials and Methods Exposure Data The NYC drinking water system is an unfiltered drinking water system that receives surface water from 3 watersheds located north of NYC. This system includes 19 lakes and controlled reservoirs; water volume from reservoir systems varies daily and is actively managed based on quality and availability. Daily turbidity measurements for January 1, 2002 to December 31, 2009 provided by the NYC Department of Environmental Protection (NYC DEP) included data from over 375 sites within the NYC distribution system (~40 sampled/day) and 3 key point sites, one from each reservoir system: the Catskill, Delaware, and Croton [16]. As a daily measure of NYC distribution system turbidity, we computed the daily median turbidity among the NYC distribution system sites sampled (“NYC turbidity”); the median limits the influence of localized extreme values. For source water turbidity, we computed the daily flow-weighted average from the source water sites (“source water turbidity”). Missing values in the source water turbidity data were replaced with the monthly average value or, if the monthly average was not available, with the yearly average value. Summary statistics of daily turbidity in NYC and source water are shown in the supporting information S1 Table. One outlier (4.6 NTU) in the source water turbidity series, which occurred in January 2006, was removed from further analysis. A review of historical records indicated that it was associated with an interruption of normal system operations; because the system configuration was subsequently changed that day, the value was not representative of water entering the distribution system and was not reflected in the NYC distribution system turbidity samples that day.
Outcome Data NYC DOHMH receives daily ED visit records from NYC hospitals. These visits are categorized by syndrome type by scanning for key words in the chief complaint for each visit [17].
The keyword search for the ED diarrhea syndrome matches mentions of diarrhea, enteritis, gastroenteritis, loose stools, and stomach flu. Daily counts of diarrhea syndrome ED visits were used as the outcome variable in this study. From 2002 to 2004, the number of hospitals contributing data varied, but for the remaining study period, daily data for approximately 95% of all ED visits in NYC were available.
Weather Covariates The 24-hour average temperature and 24-hour cumulative precipitation data for LaGuardia Airport were obtained from the National Oceanic and Atmospheric Administration, National Climatic Data Center (2009) Global Summary of the Day database.
Exploratory Analysis We first conducted exploratory data analyses to characterize temporal patterns of turbidity, diarrhea ED visits, and potential confounders such as weather, to inform regression models. To characterize relative temporal variance contributions from seasonal trends, day-of-week, and random components, we conducted spectral analyses [18–20] of turbidity and diarrhea ED visits. We used modified Daniell smoothers to compute smooth season-specific periodograms, applying several spans of smoothing over frequency intervals [21], and then pooled them across years by frequency [18]. To characterize bivariate short-term temporal relationships among weather, turbidity, and diarrhea ED visits, we first removed the influence of shared trends, seasonal cycles, and day-of-week patterns by de-trending each time series in a generalized linear model using natural cubic splines with six degrees of freedom per season and a day-of-week indicator variable.
To assess bivariate temporal relations among variables that may change across seasons, a cross-correlation function (CCF) was then computed between the residual time series, using pairwise complete observations, at multiple time lags (-10 to +10 days). A CCF was computed for each of the twelve months using the multiple years’ data for the three months surrounding that month (e.g., for the February CCF, the de-trended data for January, February, and March for multiple years were used). This method allowed for assessment of the changing pattern of associations, if any, between two variables across seasons with sufficient statistical power (i.e., r ~ 0.1 would be significant with six years of data in a 3-month block).
Regression Models Percent excess risk (% ER) of daily diarrhea ED visits for turbidity was estimated using a quasi-likelihood Poisson time-series regression model, adjusting for temporal trends and seasonal cycles, immediate and delayed temperature effects, day of week, and holidays. The analysis was stratified by season, based on the seasonal variation in turbidity, ED outcomes, and CCFs. Thus, the number of observations considered was 736 days each for spring and summer, 728 days for fall, and 722 days for winter. The extent of lagged days considered for water quality and meteorological variables was based on the exploratory analysis of these data described above. The main regression model had the following specification for each season:
$$\log E(Y_t) = \alpha + \beta_{t-i}\,\mathrm{turbidity}_{t-i} + ns(\text{study day}_t) + ns(\mathrm{temp}_t) + ns(\mathrm{avg.temp}_{t-1:3}) + \text{day-of-week}_t + \text{holiday\_indicator}_{t-0:1}$$
where $Y_t$ is the number of diarrhea ED visits on day $t$; $\alpha$ is an intercept; $\beta_{t-i}$ is the regression coefficient for turbidity at a lag of $i$ days; $ns$ indicates a smooth function of a predictor using natural cubic splines; $\mathrm{temp}_t$ is ambient temperature on the same day; $\mathrm{avg.temp}_{t-1:3}$ is the average temperature over lag days 1 through 3; day-of-week is a class variable; and $\text{holiday\_indicator}_{t-0:1}$ indicates the day of or the day before a federal holiday.
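Because the model uses a log link, a fitted turbidity coefficient converts to percent excess risk per inter-quartile-range (IQR) increase as (exp(β × IQR) − 1) × 100. The following is a minimal Python sketch of that conversion, not the authors' R code. The function names are hypothetical, and the interval assumes a normal (Wald) approximation with a standard error that already reflects the quasi-likelihood over-dispersion adjustment.

```python
import math


def percent_excess_risk(beta, iqr):
    """Percent excess risk for an IQR increase in exposure under a log-link model."""
    return (math.exp(beta * iqr) - 1.0) * 100.0


def excess_risk_interval(beta, se, iqr, z=1.96):
    """Approximate 95% interval for the percent excess risk.

    Assumes a Wald interval on the coefficient scale; `se` is taken to be
    the over-dispersion-adjusted standard error of `beta`.
    """
    lower = percent_excess_risk(beta - z * se, iqr)
    upper = percent_excess_risk(beta + z * se, iqr)
    return lower, upper


# Hypothetical inputs, not values from the study.
example_beta, example_se, example_iqr = 0.10, 0.02, 0.5
estimate = percent_excess_risk(example_beta, example_iqr)
interval = excess_risk_interval(example_beta, example_se, example_iqr)
```

Scaling by the IQR rather than by one turbidity unit makes estimates comparable across seasons and exposure series whose ranges differ.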
The quasi-likelihood Poisson model adjusts the standard errors of the regression coefficients to accommodate possible Poisson over-dispersion. We used natural cubic splines of study days to adjust for potentially confounding temporal trends (e.g., the influence of norovirus) and sub-seasonal cycles. Because the study days are a sequence of numbers spanning the entire study period, this smooth term also adjusts for long-term trends. In choosing the degrees of freedom for this temporal adjustment, we evaluated the statistical significance of the first-order autocorrelation of the residuals from models using 2 to 14 degrees of freedom per season per year, in increments of 2 degrees of freedom. Based on this evaluation, 8 degrees of freedom per season per year were sufficient for spring, summer, and fall, whereas 12 degrees of freedom per season per year were required for winter. To adjust for temperature effects, we also included in our base model natural cubic splines of the same-day temperature and of the average temperature over lag days 1 through 3, with 3 degrees of freedom over the range for each term. The choice of these lags was based on the CCF results (not presented). NYC precipitation was not associated with diarrhea ED visits and was not included in the regression model. Risks were estimated at lags of 0 to 10 days for inter-quartile range (IQR) increases of turbidity for the spring, summer, fall, and winter seasons.
For the main model described above, model diagnostics were conducted by evaluating the models’ Pearson residuals in several ways: (1) examining time-series plots; (2) examining the residuals vs. fitted values; (3) examining the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the residuals; and (4) examining the ACF and PACF of the square of the residuals.
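The residual autocorrelation check used to choose the degrees of freedom can be illustrated with a short Python sketch. It computes the sample autocorrelation at a given lag and compares the lag-1 value with the usual approximate white-noise bound of ±1.96/√n. That bound is an assumption made here for illustration; the text does not state which significance test was used, and the function names are hypothetical.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at a non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for the autocorrelation of white noise."""
    return z / math.sqrt(n)


def lag1_is_significant(residuals):
    """True if the lag-1 autocorrelation falls outside the white-noise bound."""
    r1 = autocorrelation(residuals, 1)
    return abs(r1) > white_noise_bound(len(residuals))


# Hypothetical residual series that alternates in sign, so lag-1 is strongly negative.
toy_residuals = [1.0, -1.0] * 50
flag = lag1_is_significant(toy_residuals)
```

In a workflow like the one described, a check of this kind would be repeated for each candidate number of degrees of freedom, keeping the smallest value at which the lag-1 autocorrelation of the residuals is no longer significant.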
We conducted several additional analyses to examine the sensitivity of the risk estimates to alternative model specifications and to check the consistency of our main findings with a causal interpretation by examining the influence of different subsets of exposure and outcome data. Thus, we: (1) examined sensitivity of risk estimates by using a negative binomial model, which is also used for over-dispersed count data (Gardner et al., 1995), and compared both the regression coefficients and standard errors with those obtained from the quasi-likelihood Poisson models; (2) assessed the effect of removing the highest NYC turbidity values; (3) checked the sensitivity of risk estimates to alternative degrees of freedom to adjust for within-season variation in ED visits (6 through 10 degrees of freedom per season were examined); (4) fit a distributed lag model to estimate the impact of multi-day associations over lags of 0 through 13 days, evaluating 2nd-, 3rd-, and 4th-order polynomial shapes using the R package ‘dlnm’ (distributed lag non-linear model; [22]); (5) examined sensitivity of risk estimates to moving four-year blocks of seasonal data; and (6) examined sensitivity of risk estimates to alternative temperature model specifications. All statistical analyses were conducted using the R statistical software package (version 2.15.3: R Development Core Team 2013).
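The exposure summaries described under Exposure Data (daily median across distribution sites, flow-weighted source water average, and imputation of missing source water values) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' R code. All function names and the record layout are hypothetical, and "monthly average" is assumed to mean the average for the same calendar month of the same year, which the text does not specify.

```python
from statistics import mean, median


def daily_median(site_values):
    """Median turbidity across the distribution sites sampled on one day."""
    return median(site_values)


def flow_weighted_average(turbidity, flow):
    """Average of source water turbidity values weighted by each source's flow."""
    total_flow = sum(flow)
    return sum(t * f for t, f in zip(turbidity, flow)) / total_flow


def impute_missing(records):
    """Fill missing values with the monthly mean, else the yearly mean.

    records: list of (year, month, value_or_None) tuples (hypothetical layout).
    """
    by_month, by_year = {}, {}
    for year, month, value in records:
        if value is not None:
            by_month.setdefault((year, month), []).append(value)
            by_year.setdefault(year, []).append(value)
    filled = []
    for year, month, value in records:
        if value is None:
            if (year, month) in by_month:
                value = mean(by_month[(year, month)])
            elif year in by_year:
                value = mean(by_year[year])
            # Otherwise the value stays None: nothing is available to impute from.
        filled.append((year, month, value))
    return filled


# Hypothetical toy data, not values from the study.
toy = [(2006, 1, 1.0), (2006, 1, None), (2006, 1, 3.0),
       (2006, 2, None), (2006, 3, 5.0)]
toy_filled = impute_missing(toy)
```

Using the median for the distribution system keeps a single site with a localized spike from dominating the citywide value, while flow weighting lets the source water series reflect how much each reservoir system actually contributed on a given day.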

Discussion Turbidity in the NYC distribution system was positively associated with an increase in diarrhea ED visits in the spring season only, among the youngest age groups, peaking at a lag of approximately 6 to 7 days. This association accounted for a very small proportion of temporal variation in diarrhea ED visits, with the majority due to seasonal illness patterns unrelated to source water turbidity. Source water turbidity, not turbidity originating within the distribution system, was the major contributor to variation in NYC distribution system turbidity and the primary driver of the association of NYC turbidity with diarrhea ED visits seen in this analysis. Source water turbidity primarily reflects turbidity from the Catskill and Delaware systems. There are multiple regulations governing turbidity limits, and details regarding turbidity results and regulatory compliance for the study years are available in the NYC DEP Drinking Water Supply and Water Quality Reports [16].
The association was present after adjustment for temporal patterns and weather to account for seasonal illness and changes in care-seeking patterns, and was robust to differences in model specifications. The pattern of the multi-day associations, which showed a steady increase, a mode around day 6, and a steady decrease, was consistent with a variable time between exposure to an infectious agent and presentation to the ED with complaints of diarrhea, as would be expected based on differences among individuals in exposure, susceptibility, and care-seeking behaviors. This pattern was inconsistent with a chance finding.
The results also suggest that the highest turbidity values contribute disproportionately to the association between NYC turbidity and diarrhea ED visits. This association was limited to the spring season, when turbidity levels are highest and most variable. Excluding the highest turbidity values in the spring also reduced the strength of associations (Fig 6).
It is not clear whether the greater variation in turbidity in the spring or the nature of turbidity in the spring is most relevant. While precipitation levels can be high in the spring, they tend to peak in summer or fall, suggesting that other seasonal factors, such as greater surface runoff from snowmelt and limited tree foliage cover in the spring, may be relevant to this association.
Previous analyses of turbidity and GI illness have shown a range of results. In their 2007 review, Mann et al. identified six “relevant good quality” studies. These studies used data from Philadelphia [2, 4], Milwaukee [12], Greater Vancouver, Canada [5], Edmonton, Canada [6], and Quebec, Canada [23]. The methods, outcome, and effect measures reported in these studies varied, and therefore Mann and her colleagues provided a qualitative summary. They concluded that “It is likely that an association between turbidity and GI illness exists in some settings or over a certain range of turbidity.” Since the Mann et al. review, Tinker et al. (2010) conducted an analysis of turbidity and ED visits related to GI illness in Atlanta, GA, which we considered relevant based on Mann et al.’s evaluation criteria. Overall, one study [6] found no association between turbidity and GI illness, and six of the seven found variable positive associations; some found associations only with pre-treatment turbidity. Comparing the effect size estimates reported across studies, especially on a per-turbidity-unit basis, is challenging because the interpretation of turbidity differs for pre- and post-filtration values. We found a positive association in the younger age groups, consistent with studies of turbidity and GI illness in children specifically, though lag times and other factors varied between studies [2, 3, 5, 12], and similar to the Tinker et al. and Morris et al. studies, which found the strongest associations across ages in the youngest age groups.
We did not find a positive association of turbidity and GI illness in older adults, in contrast to some previous studies [4, 5, 12].
While most of these previous studies controlled for seasonality, none conducted separate analyses by season as in our study. The amount and source of turbidity can vary by season in a watershed, and seasonal analysis helped produce a more focused result using a general indicator. Regional variation in seasonal influences on turbidity may make this approach useful in additional locations. Some recent studies have examined the relationship between GI illness and seasonal hydrological factors, such as rainfall and stream flow, and found positive associations [24, 25]. While Drayna et al.’s analysis found that GI illness increased after 4 days, Jagai et al. found that GI illness peaks preceded stream flow peaks. Thus, further research is needed to understand the nature of the association with different hydrological parameters, as some may be indirectly associated.
NYC has historically been an unfiltered drinking water system, with water supplied from the extensively protected Catskill, Delaware, and Croton watersheds. Associations between turbidity and endemic GI illness reported in past studies were found in both filtered [2–4] and unfiltered [5] drinking water systems. Lim et al. and Tinker et al. examined both raw water and post-treatment water turbidity. Tinker et al. found an association with raw water only. Lim et al. found no association with raw water or post-treatment water turbidity.
There are several limitations in this study. Given available data, this study used an ecological study design. It is difficult to characterize the nature of the association in terms of individual risk since we did not have individual-level exposures and individual outcomes. The association was between the daily variation of turbidity as observed in the distribution system and the citywide variation of ED visits for GI illness.
The source water data, which were analyzed as a supplement to the NYC water quality data, had several limitations. Water quality data from the source water reservoir systems were limited to 3 sites, one from each reservoir system, and these data varied in completeness. Flow data available for this study were measured at a point in the system after Catskill and Delaware water had already mixed, so these reservoir systems were not assessed independently. For the Croton system, 19.8% of data were missing; however, the overall contribution of the Croton system to the total flow was relatively small, so the missing data did not have much impact on the flow-weighted average turbidity for source water. Data necessary to conduct a study on a smaller geographic scale within NYC were not available, and therefore this study could not address the relationship of localized turbidity increases with local GI illness.
Other water quality data that were available to us did not meet the requirements for this type of analysis. For example, microbial indicator results reported were predominantly below the detection limit. Thus, we could not examine these as an alternative indicator of water quality in this study. Turbidity as an indicator of water quality is also limited. A basic turbidity measurement alone does not provide information about the type or source of suspended particles. Regional differences in watersheds can influence the source and composition of particles contributing to turbidity, and this may explain differences across some studies. For example, Lim et al. (2002) suggest that the drinking water source for Edmonton, Canada is strongly influenced by clay and silt particles associated with glacial melt and runoff, which may help explain the lack of association with GI illness despite relatively high turbidity values.
ED visits for GI illness are likely an underestimate of overall community GI illness, as most people with GI symptoms do not seek medical care or seek care in outpatient settings.
However, syndromic surveillance data are relatively quickly and easily measured, and the speed of electronic data availability makes diarrhea ED visits a potentially useful and timely, albeit non-specific, indicator of GI illness to help assess whether a potential water contamination event is having a population-level impact on GI illness. Laboratory confirmation of waterborne pathogens is more specific, but it can take days or weeks to receive these results [11].
While these results should be viewed with consideration of the study limitations discussed above, the strength of this work and the selected model is reflected in the ability to detect an association of small magnitude. Multiple sensitivity analyses support the robustness of the association. These findings, interpreted along with findings from other published water quality and health studies, support the need for additional research in this area to elucidate the cause of the association and to identify an indicator or metric that can be more readily compared across regions. Given that turbidity is an indicator only and the causative agent is unknown, further investigation of turbidity levels in the spring season is needed, but no specific turbidity alert level is recommended at this time.
Many types of data are used to monitor the NYC drinking water system as part of ongoing regular surveillance, such as physical, chemical, and biological measures of water quality as well as public health surveillance data including syndromic data. This study provided a first step towards a more complete understanding of the relationship between water quality and syndromic surveillance data in NYC, and results will be considered in the context of the overall water quality monitoring system. No single indicator alone will be used to make management decisions; rather, supporting information will be used to provide context to understand alerts from various systems.
In 2013, NYC started providing enhanced UV disinfection of its Catskill/Delaware water supply, and soon will be able to provide enhanced filtration and disinfection of its Croton water supply. It may be useful to repeat this analysis when sufficient data are available to assess whether the association between turbidity and GI illness can still be observed.

Conclusions This study identified a small association between turbidity and diarrheal illness in NYC. A large proportion of the strong seasonal variations in diarrhea ED visits likely reflect many different causes of diarrheal illnesses. Smaller scale geographic analysis may provide further information on associations between turbidity and GI illness. Integrated analysis of turbidity data and syndromic surveillance data, as part of overall drinking water surveillance, may be useful for enhanced situational awareness of possible risk factors that can contribute to GI illness. Elucidating the causes of turbidity-GI illness associations including seasonal and regional variations would be necessary to further inform surveillance needs.

Supporting Information S1 Fig. Examination of risk estimates as a function of alternative degrees of freedom (df) for seasonal adjustment. https://doi.org/10.1371/journal.pone.0125071.s001 (PDF) S2 Fig. Fitted polynomial distributed lag models. Fitted polynomial distributed lag models for lag 0 through 13 day turbidity and all-age diarrhea ED visits. https://doi.org/10.1371/journal.pone.0125071.s002 (PDF) S3 Fig. Sensitivity analysis of excess risk of diarrhea ED visits in spring by time period. Sensitivity of percent excess risk of diarrhea ED visits for all-age group at lag 6 day in a 4th-order polynomial distributed lag model in spring periods using consecutive four years of spring periods. https://doi.org/10.1371/journal.pone.0125071.s003 (PDF) S4 Fig. Sensitivity of percent excess risk of diarrhea ED visits to alternative temperature specifications. Sensitivity of percent excess risk of diarrhea ED visits for all-age group at lag 6 day in a 4th-order polynomial distributed lag model in spring periods using alternative temperature specifications (1) no adjustment for temperature; (2) natural spline of same-day temperature with 3 degrees of freedom and natural splines of the average of lag 1 through 3 days with 3 degrees of freedom (the base model); (3) natural splines of same-day temperature with 3 degrees of freedom and natural splines of the average of lag 3 through 8 days with 3 degrees of freedom; and (4) 4th-degree distributed lag model of temperature for lags 0 through 13 days. https://doi.org/10.1371/journal.pone.0125071.s004 (PDF) S1 Table. Summary statistics of NYC daily turbidity data, 2002–2009. https://doi.org/10.1371/journal.pone.0125071.s005 (PDF)

Acknowledgments We thank our collaborators at the NYC Department of Environmental Protection including Steven Schindler, David Lipsky, and Anne Seeley for their contributions to this work and appreciate assistance from NYC Department of Health and Mental Hygiene staff including Marcelle Layton, Sharon Balter, Marc Paladini, Don Weiss, Trevor McProud, and Chris Boyd.

Author Contributions Conceived and designed the experiments: JLH TM TQN KI. Performed the experiments: JLH TM TQN KI. Analyzed the data: JLH KI. Contributed reagents/materials/analysis tools: JLH KI. Wrote the paper: JLH KI TM TQN.