Study setting

Skellefteå is situated in Västerbotten County, adjacent to Jämtland County where the Östersund outbreak occurred. The distance between Skellefteå and Östersund is almost 500 km. Skellefteå is a municipality with approximately 72,000 inhabitants. Twenty-eight water treatment plants (WTPs) operate within the municipality. Two of these deliver water to the city of Skellefteå: Slind WTP and Abborrverket WTP, the latter supplying the majority of the inhabitants. All water treatment plants in the municipality use groundwater as the water source except Abborrverket, which uses surface water from the river Skellefteälven. Abborrverket WTP produces approximately 18,000 m³ of treated water daily for 44,000 of the 72,000 inhabitants in the municipality (as of 31 March 2011). The normal water intake to Abborrverket is located far out and deep in the river, but because of icing during the winter months the intake is shifted to a shallower position closer to shore where the ice can be removed more easily.

Microbiological investigation

Human samples

Fecal samples from patients seeking healthcare for gastrointestinal illness were analysed with standard techniques for enteric bacterial pathogens, polymerase chain reaction (PCR) for noro- and sapoviruses, and microscopy for Entamoeba spp. and Giardia intestinalis. Samples were only sporadically analysed for the presence of Cryptosporidium oocysts until 19 April 2011, when the current outbreak was first suspected. Testing for Cryptosporidium was intensified from that date until 1 July 2011, when the outbreak was considered over. Samples tested for Cryptosporidium were analysed using a standard concentration technique followed by modified Ziehl-Neelsen staining [19]. A subset of Cryptosporidium-positive samples (n = 26) was sent to the Swedish Institute for Communicable Disease Control for species identification by PCR restriction fragment length polymorphism (RFLP) analysis of the rRNA gene [20, 21]. Subtypes were characterized by sequence analysis of the 60 kDa glycoprotein (gp60) gene [22, 23].

Environmental samples

At the time of the outbreak Abborrverket WTP used flocculation and sedimentation followed by sand filtration and chlorination for water treatment. This treatment setup can be sufficient for removal of Cryptosporidium oocysts if the processes work optimally and the concentration of oocysts is relatively low, but ultraviolet (UV) treatment is generally preferred as a disinfectant [24]. The winter intake was used from 19 November 2010 until 19 April 2011. A total of 38 samples were collected from the drinking water system over a period of 5 months, from 19 April to 15 September 2011. These samples included raw water from the river Skellefteälven (i.e. incoming water to Abborrverket WTP), treated water at Abborrverket WTP and samples taken from the distribution network.

Twelve influent and twelve effluent wastewater samples were collected at the main sewage water treatment plant (SWTP) Tuvan. Moreover, in order to investigate possible causes of contamination of Skellefteälven and to trace sources of oocysts, 9 samples were collected from the wastewater and storm water systems and from other relevant locations.

Water samples were analysed for Cryptosporidium oocysts according to ISO 15553:2006 [25] with filtration of water (10–1000 L), immunomagnetic separation (IMS) and immunofluorescence (IFL) microscopy. On the slides with concentrated and purified material, oocysts were identified by fluorescent labelling together with characteristic size, shape, internal structure and DAPI (4′,6-diamidino-2-phenylindole)-stained nuclei. Wastewater samples were analysed in the same way as water samples but without filtration and in smaller volumes: 50–100 mL for influent wastewater and 0.3–0.5 L for effluent wastewater. Two sediment samples from the inside of the influent raw water pipe were also analysed as water samples but without filtration before IMS. DNA from one wastewater concentrate was analysed by sequence analysis of the gp60 gene as described for human samples [22, 23].

Epidemiological investigation

Web-based questionnaire

On 19 April 2011, the same day as the BWN was issued, a web-based questionnaire (Additional file 1) was created in order to start collecting epidemiological data immediately. The value of such a questionnaire was demonstrated in the preceding cryptosporidiosis outbreak in Östersund [11], and those experiences were applied here. The questionnaire was made available to the public on the municipality’s website on the evening of the same day and was closed on 9 May 2011. The public was informed of the questionnaire through press releases, and links to it were placed on key web pages such as those of the local newspaper and Västerbotten County Council. The full data set was summarised after the outbreak was considered to be over. Visitors to the web page who were residents of Skellefteå municipality, both with and without GI symptoms, were asked to answer a set of questions regarding gastrointestinal illness in the family. A case attributed to the outbreak was defined as a person with a residential address within Skellefteå municipality who had ≥3 loose stools per day for at least 1 day, with onset between 1 April and 5 May 2011. Respondents with a date of symptom onset before 1 April or after 5 May, persons who had travelled abroad in the 2 weeks prior to symptom onset, and individuals with a residential postal code outside Västerbotten County were excluded from the analysis. Remaining respondents who did not fulfil the criterion of ≥3 loose stools per day were considered non-cases. More detailed analyses of the data were not performed since the follow-up postal survey was conducted.
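The case definition and exclusion criteria for the web-based questionnaire can be sketched as a small classification function. This is only an illustration in Python, not the code used in the study: the function name, the argument names and the handling of respondents without a reported onset date (treated here as non-cases) are assumptions.

```python
from datetime import date
from typing import Optional

# Onset window for the web-based questionnaire, as stated in the text.
ONSET_START = date(2011, 4, 1)
ONSET_END = date(2011, 5, 5)


def classify_web_respondent(
    onset: Optional[date],
    max_loose_stools_per_day: int,
    travelled_abroad_before_onset: bool,
    postal_code_in_vasterbotten: bool,
) -> str:
    """Return 'excluded', 'case' or 'non-case' (hypothetical helper)."""
    # Exclusions: travel abroad in the 2 weeks before onset, or a
    # residential postal code outside Västerbotten County.
    if travelled_abroad_before_onset or not postal_code_in_vasterbotten:
        return "excluded"
    # Exclusion: reported onset outside 1 April to 5 May 2011.
    if onset is not None and not (ONSET_START <= onset <= ONSET_END):
        return "excluded"
    # Case: >=3 loose stools per day with onset inside the window.
    if onset is not None and max_loose_stools_per_day >= 3:
        return "case"
    # Assumption: everyone else, including respondents without an
    # onset date, counts as a non-case.
    return "non-case"
```

The same structure would apply to the postal questionnaire with its own, wider onset window.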

Postal questionnaire

A retrospective cohort study was performed in June 2011 by sending a questionnaire to a random sample of 1754 citizens in the municipality of Skellefteå (Additional file 2, Additional file 3). The random sample was stratified by age (0–5 years, 6–15 years, 16–65 years and 66 years or older) and gender. Questions were asked to find out about the start and magnitude of the outbreak, the source of the outbreak and risk factors for disease. The questionnaire contained questions on demographics, onset, duration and occurrence of symptoms indicating cryptosporidiosis, and water consumption as well as history of symptoms before 1 January 2011. Caretakers were asked to answer for children <15 years of age. A case attributed to the outbreak was defined as a person with ≥3 loose stools per day for at least 1 day with onset between 1 December 2010 and 31 May 2011.

Statistical analysis of the postal questionnaire

Each of the 1754 respondents was assigned a random number, and a barcode on the questionnaire was used to identify each respondent. The postal codes were matched to the water distribution areas of the WTPs. In a stratified survey study, weights are used to calculate the number of individuals in the population represented by each individual in the sample. Binary logistic regression was used to identify variables associated with the propensity to respond to the survey. Age, gender and water supply were used to calibrate the weights for non-response in order to adjust for imbalance between the sample and the population.
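The role of the weights can be illustrated with a minimal Python sketch. It shows a design weight for one stratum and a simple weighting-class adjustment for non-response. This is a simplification and not the calibration procedure the authors ran with the R survey package; the function names and the stratum sizes in the example are invented for illustration.

```python
def design_weight(population_size: int, sample_size: int) -> float:
    """Number of people in the stratum represented by each sampled person."""
    return population_size / sample_size


def nonresponse_adjusted_weight(
    population_size: int, sample_size: int, n_respondents: int
) -> float:
    """Design weight inflated by the inverse response rate in the stratum."""
    response_rate = n_respondents / sample_size
    return design_weight(population_size, sample_size) / response_rate


# Invented example stratum: 4000 inhabitants, 200 sampled, 120 responded.
base = design_weight(4000, 200)
adjusted = nonresponse_adjusted_weight(4000, 200, 120)
print(f"design weight {base:.1f}, adjusted weight {adjusted:.1f}")
```

With this adjustment the respondents in a stratum together still represent the full stratum population, which is the property calibration is meant to preserve.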

The association between the binary outcome of case/non-case and the exposure variables was analysed by binary logistic regression. Included in the model as covariates and exposure variables were gender, age (0–5 years, 6–15 years, 16–65 years, and 66 years or older), gastric ulcer (yes, no), irritable bowel syndrome (yes, no), Crohn’s disease (yes, no), celiac disease (yes, no), lactose intolerance (yes, no), immunodeficiency disease (yes, no), average tap water consumption (<1 glass, 1 glass, 2–5 glasses, >5 glasses) and household water supply (Abborrverket, not Abborrverket or not from any WTP/own well).

The results from the binary logistic regression were expressed as odds ratios (OR). All “I do not know” answers to binary questions were regarded as non-informative and were set as missing values prior to the analysis. Missing values for binary variables were then given a value (yes, no) using multiple imputation by chained equations [26]. The chains contained all exposure variables plus the outcome non-case/case [27]. Twenty datasets with different imputed values for missing data were created, and the binary logistic regression results from each dataset were pooled into one result using Rubin’s formula [28]. All analyses were performed in the statistical software R (version 3.3.2) using the packages survey (version 3.31.2) and mice (version 2.25) and the generalized linear model function (glm) in the base R package stats. In all analyses a p-value below 0.05 was considered statistically significant, and a confidence level of 95% was applied to estimated confidence intervals.
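The pooling step can be sketched as follows. This is a minimal Python illustration of Rubin's combining rules for a single regression coefficient, not the authors' R code: the function names are hypothetical, and it assumes that each imputed dataset supplies a log odds ratio together with its squared standard error.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Pool one coefficient over m imputed datasets.

    Returns the pooled estimate and its total variance
    T = W + (1 + 1/m) * B, where W is the mean within-imputation
    variance and B is the between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 datasets")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


def pooled_odds_ratio(
    log_odds_ratios: Sequence[float], variances: Sequence[float]
) -> float:
    """Odds ratio corresponding to the pooled log odds ratio."""
    pooled, _ = pool_rubin(log_odds_ratios, variances)
    return math.exp(pooled)
```

Pooling is done on the log odds scale because that is the scale on which the regression coefficients are approximately normally distributed; the result is exponentiated only at the end.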

Analysis of phone calls to a health advice line

Healthcare Guide 1177 is a national Swedish telephone health advice line staffed by nurses. The service provides advice and information about urgent, but non-life-threatening, health problems. The medical record created for each consultation includes a structured data field, called the contact cause, that represents the most severe symptom as assessed by the nurse [29]. There are almost 200 contact causes in the service’s medical decision support system, but only a handful are related to GI problems. For the purpose of this study, daily call counts for GI symptoms were retrospectively extracted from the service for inhabitants of Skellefteå municipality from 1 August 2010 to 18 April 2011. The contact causes “vomiting or nausea”, “diarrhoea” and “stomach pain” were used since changes in contact patterns for these symptoms have previously been shown in outbreaks of cryptosporidiosis [14]. In addition, the postal code of the patient’s registered residential address was extracted for each call.

Postal codes were divided into two geographical regions, belonging to the distribution area of Abborrverket WTP or not, and the number of inhabitants in each region was calculated. To compare the call patterns of GI-related symptoms between these two regions, a previously published outbreak detection algorithm [14] was used with a minor modification. No analyses were performed for the period from the BWN onwards because, once information about an ongoing outbreak becomes public, the contact pattern to the health advice line changes drastically and this is challenging to adjust for in the analyses.

The daily call count, $C_{t,i}$, for one contact cause or a single group of contact causes at day $t$ for geographical region $i$ was classified as an outbreak signal if it exceeded a threshold $T_{t,i}$:

$$ {T}_{t, i}= \max \left( L, V\right), $$

$$ V=\left({E}_{t, i}+ L\times {SD}_{t, i}\right), $$

$$ L\in \left\{3,5\right\} $$

$$ {E}_{t, i}={p}_{t, i}\times {N}_i, $$

$$ {SD}_{t, i}=\sqrt{N_i\times {p}_{t, i}\times \left(1-{p}_{t, i}\right)}, $$

$$ {p}_{t, i}=\frac{\sum_{j=1, j\ne i}^{n_i}{\sum}_{\tau}{\omega}_{\tau}{C}_{\tau, j}}{10\times \sum_{j=1, j\ne i}^{n_i}{N}_j}, $$

$$ \tau \in \left\{ t-7, t-8, t-9, t-10, t-11, t-12, t-13, t-14\right\}, $$

$$ {\omega}_{\tau}=\left\{\begin{array}{l}2, if\ \tau \in \left\{ t-7, t-14\right\}\\ {}1, if\ \tau \in \left\{ t-8, t-9, t-10, t-11, t-12, t-13\right\}\end{array}\right. $$
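The threshold rule above can be sketched in Python as follows. This is a minimal illustration of the equations as written, not the published implementation [14]: the function names and the data layout (one list of daily call counts per comparison region, indexed by day) are assumptions.

```python
import math
from typing import Sequence

LAGS = range(7, 15)  # tau = t-7, ..., t-14


def lag_weight(lag: int) -> int:
    """Weight omega_tau: 2 for t-7 and t-14, otherwise 1 (weights sum to 10)."""
    return 2 if lag in (7, 14) else 1


def baseline_proportion(
    other_counts: Sequence[Sequence[int]],
    other_populations: Sequence[int],
    t: int,
) -> float:
    """p_{t,i}: weighted daily call rate per inhabitant in all regions j != i."""
    if t < 14:
        raise ValueError("need at least 14 days of history before day t")
    weighted_calls = sum(
        lag_weight(lag) * counts[t - lag]
        for counts in other_counts
        for lag in LAGS
    )
    return weighted_calls / (10 * sum(other_populations))


def outbreak_threshold(p: float, n_i: int, L: int) -> float:
    """T_{t,i} = max(L, E + L * SD) with binomial mean E and deviation SD."""
    expected = p * n_i
    sd = math.sqrt(n_i * p * (1 - p))
    return max(L, expected + L * sd)


def is_outbreak_signal(count: int, p: float, n_i: int, L: int = 3) -> bool:
    """True if the observed daily count C_{t,i} exceeds the threshold."""
    return count > outbreak_threshold(p, n_i, L)
```

In the two-region setting described here, the inner sum runs over a single comparison region, so the call rate outside the Abborrverket distribution area serves as the baseline for the calls expected inside it, and vice versa. Taking the maximum with $L$ keeps the threshold from collapsing towards zero when the baseline rate is very low.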