Sampling

Pooled air-toilet samples were obtained from 18 long-distance flights operated by Scandinavian Airlines System (SAS) (www.sas.com) arriving at Copenhagen airport from nine different cities (Bangkok, Beijing, Islamabad, Newark, Kangerlussuaq, Tokyo, Toronto and Washington DC), representing eight countries, in three different regions (North America, North and South Asia) (Fig. 1). At the sites of origin, toilets were emptied, rinsed with water and disinfected prior to departure. Idu-Flight, a deodorizing liquid based on glutaraldehyde and benzalkonium chloride, was used as the disinfection agent. On arrival in Copenhagen, the airplane waste containers were emptied by SAS Cleaning and Service. For our purpose, one service car was designated to empty the airplane waste containers, which were thoroughly rinsed with water prior to collecting samples. From each individual airplane, the waste from 6-8 toilets was emptied under high vacuum pressure into the special container on the service car. This procedure mixes the content thoroughly, and no individual fecal clumps or toilet paper can be visually identified. Three individual ½ L samples were then collected from the car's waste container using sterile tubes and placed in a refrigerator. Subsequently, the container was rinsed with water before collecting toilet waste from another airplane. Samples from between one and seven flights were collected within a maximum of 12 hours, transported immediately in closed containers for hazardous goods to the Technical University of Denmark and processed immediately.

Sample handling and DNA purification

On arrival, the samples were handled as outlined in Figure S1. The content of each individual ½ L tube was mixed as thoroughly as possible in the tube. Four tubes of 10 mL each, one of them supplemented with RNAlater, were collected from each ½ L tube and frozen at −80 °C. Twenty-five mL were collected from each of the three ½ L tubes obtained from a single airplane and mixed. Again, four tubes of 10 mL were collected and stored as described above. The remaining approximately 35 mL of combined waste was transferred to a 50 mL centrifuge tube and centrifuged at 1500 rpm for 2 minutes to remove large debris and large cells. A total of 1.2 mL was removed for conventional counting of bacteria and examination for antimicrobial resistance. A total of 30 mL supernatant was transferred to a new 50 mL tube and centrifuged at 10,000 × g for 10 min. Ten mL of the supernatant was collected and frozen at −80 °C; the remainder was discarded. The pellet with the bacterial cells was resuspended in 500 μL phosphate buffered saline and transferred to Eppendorf Safelock™ tubes, and DNA was purified using a protocol including both lysozyme and lysostaphin to increase cell lysis38. Each DNA purification was dissolved in 300 μL TE buffer and extracted with 2 rounds of phenol/chloroform, followed by precipitation to obtain the highest possible purity. The purified DNA was frozen at −80 °C for later sequencing.

Whole community sequencing

Samples were diluted to a final volume of 100 μL and fragmented using a Diagenode Bioruptor, with a 30'/30' on/off program of 20 cycles. From each sample, 500 ng of fragmented DNA was converted into an Illumina-compatible DNA library using the NEBNext library building kit for second-generation sequencing (New England Biolabs, Ipswich, MA; Cat#. E6070L). Libraries were prepared according to the manufacturer's directions with slight modifications as follows. Following end repair with incubation for 20 minutes at 12 °C followed by 15 minutes at 37 °C, the samples were purified on a Qiagen MinElute silica spin-column following the manufacturer's instructions. The DNA was ligated to Illumina blunt-end adapters as described by Meyer et al.39 for 20 minutes at 20 °C and again purified on a Qiagen QIAquick silica spin-column following the manufacturer's instructions. Subsequently, the fill-in procedure was performed for 20 minutes at 65 °C followed by a heat inactivation step for 20 minutes at 80 °C. Libraries were then used directly for post-library indexing PCR with unique indexing for each sample31. Indexing PCR was done in 100 μL reactions using Phusion High-Fidelity PCR Master Mix (Thermo Fisher Scientific, Waltham, MA) and 10 μL template DNA library under the following conditions: 1) 30 seconds at 98 °C, 2) 20 seconds at 98 °C, 3) 30 seconds at 60 °C, 4) 30 seconds at 72 °C, 5) 5 minutes at 72 °C, 6) hold at 4 °C. Steps 2-4 were repeated for 4 cycles. Subsequently, the PCR products were cleaned with Qiagen QIAquick spin columns. Concentrations were measured on a Qubit fluorometer 2.0 (Life Technologies) using the dsDNA HS assay, and the libraries were subsequently analyzed on an Agilent 2100 Bioanalyzer (Agilent Technologies). All 18 samples were pooled into one pool in equimolar concentrations for a final concentration of 9 nM.
Sequencing was performed by the National High-throughput DNA Sequencing Centre at the University of Copenhagen, Denmark, on an Illumina HiSeq 2000 instrument using 100-cycle paired-end sequencing on a total of 8 lanes.
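The equimolar pooling step can be illustrated with a minimal sketch. The text does not state how mass concentrations were converted to molarity, so the sketch assumes the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names and the input values in the test are hypothetical.

```python
# Assumption: average molar mass of one double-stranded DNA base pair.
AVG_BP_MASS_G_PER_MOL = 660.0


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a mass concentration (ng/uL) and mean fragment size (bp) to nM."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp) * 1e6


def equimolar_volumes_ul(molarities_nm: dict, fmol_per_library: float) -> dict:
    """Volume (uL) of each library that contributes the same molar amount.

    Uses the identity 1 nM = 1 fmol/uL, so volume = fmol / nM.
    """
    volumes = {}
    for name, molarity in molarities_nm.items():
        if molarity <= 0:
            raise ValueError(f"library {name!r} has no measurable DNA")
        volumes[name] = fmol_per_library / molarity
    return volumes
```

In practice the Qubit reading would supply the concentration and the Bioanalyzer trace the mean fragment size; the pooled mix would then be diluted to the desired final molarity.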

Read processing and alignment

Paired-end read data were processed by mapping the airtoilet fastq samples against several reference sequence databases, which can be downloaded as described in Table S2. Initially, trimming and removal of adaptor sequences was done using cutadapt40 with a minimum read length of 30 bp and a minimum Phred quality score of 30, so that low-quality ends were trimmed before adaptor removal (cutadapt parameter --quality-cutoff). Subsequently, reads were processed through a mapping approach built on the bwa mem41 and samtools42 software. Bwa mem (ver. 0.7.7-r441) was used with default settings to map reads against the reference sequence databases. Throughout the mapping approach, only the most reliable hits were accepted, i.e. properly paired reads were accepted provided that each read mapped with an alignment length of at least 80% of the read length. As part of the initial cleaning process, potential PhiX (i.e. PhiX174) control reads were removed using the procedure described above, so that all remaining reads are likely to have a biological origin. The read statistics for the pre-processing steps are shown in Table S1 for each of the 18 samples available. Next, reads were mapped to a number of reference sequence databases following one of two possible routes, either chainmode or fullmode mapping. A flowchart of the two mapping procedures is shown in Figure S4. All samples were mapped against databases that can be created from publicly available reference sequences. In fullmode, a resistance gene database (ResFinder) was used. In chainmode, the following ordered list of databases was used: MetaHitAssembly, Bacteria, Plasmid, Human, Invertebrates, Protozoa and Virus. Reads that mapped to the Bacteria database were subsequently used to extract hits of pathogen-specific origin. The three pathogens with the highest read counts were considered, as shown in Table S6 (Salmonella enterica subsp. enterica serovar Typhimurium DT104, Clostridium difficile R20291, Campylobacter jejuni subsp. jejuni IA3902). In this table we only show the most confidently mapped reads and refer to them as unique, meaning those where the bwa tags ‘AS’ and ‘XS’ differ. The tag ‘AS’ is the alignment score and ‘XS’ is the score of the second-best alignment.
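The acceptance criteria described above (properly paired, alignment length at least 80% of the read length, and AS differing from XS for unique hits) can be sketched as follows. This is an illustration rather than the pipeline actually used: it assumes that alignment length is counted as the read bases covered by the CIGAR operations M, =, X and I, and all function names are hypothetical.

```python
import re

# Assumption: read bases in these CIGAR operations count as "aligned";
# soft-clipped (S) and hard-clipped (H) bases do not.
ALIGNED_OPS = {"M", "=", "X", "I"}
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_length(cigar: str) -> int:
    """Number of read bases inside the alignment according to the CIGAR string."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in ALIGNED_OPS)


def passes_length_filter(cigar: str, read_length: int, min_fraction: float = 0.8) -> bool:
    """True if the aligned part covers at least min_fraction of the read."""
    return aligned_length(cigar) >= min_fraction * read_length


def accept_pair(cigar1: str, len1: int, cigar2: str, len2: int, properly_paired: bool) -> bool:
    """A pair is kept only if it is properly paired and both mates pass the filter."""
    return (
        properly_paired
        and passes_length_filter(cigar1, len1)
        and passes_length_filter(cigar2, len2)
    )


def is_unique(alignment_score: int, suboptimal_score: int) -> bool:
    """A hit is called unique when the AS and XS tag values differ."""
    return alignment_score != suboptimal_score
```

With real data, the CIGAR string, the proper-pair flag and the AS/XS tags would be read from each SAM/BAM record, for example after `samtools view`.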

Normalized abundance estimation, clustering and significance testing

The raw read counts were summed to species level and normalized by the total number of reads after PhiX filtering. All species with an abundance of more than 1,000 reads were log10-transformed and clustered in R using the package pvclust with Ward's method and a Euclidean distance measure. The raw read counts of resistance genes were normalized by the total bacterial count, i.e. the sum of hits that could be assigned to either of the two bacterial databases: ‘Bacteria’, downloaded from NCBI complete genomes, and ‘MetaHitAssembly’, which is a collection of assembled draft genomes identified from an analysis of fecal samples from 124 European individuals15. Conversion tables from ResFinder identifiers, i.e. antimicrobial resistance genes, to gene and class level are available in Table S8. Significance testing was performed using geographical region as levels, and each level was tested individually using a Wilcoxon Rank Sum test. The False Discovery Rate (FDR) was determined as the average of three Monte Carlo simulations. Significance testing of sample-wide abundance and richness was performed using the Wilcoxon Rank Sum test.
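The normalization steps can be sketched in a few lines. The text leaves open whether the 1,000-read threshold and the log10 transform apply to raw or normalized values; this sketch assumes the threshold is applied to raw counts and the transform to normalized abundances. The function names are hypothetical, and the clustering itself (pvclust in R) is not reproduced here.

```python
import math


def normalized_abundance(raw_counts: dict, total_reads: int) -> dict:
    """Species read counts divided by the total reads left after PhiX filtering."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {species: count / total_reads for species, count in raw_counts.items()}


def log10_abundant_species(raw_counts: dict, total_reads: int, min_reads: int = 1000) -> dict:
    """log10 of normalized abundance for species with more than min_reads raw reads.

    Assumption: the read threshold is checked on the raw counts.
    """
    normalized = normalized_abundance(raw_counts, total_reads)
    return {
        species: math.log10(normalized[species])
        for species, count in raw_counts.items()
        if count > min_reads
    }


def resistance_gene_abundance(gene_counts: dict, bacteria_hits: int, metahit_hits: int) -> dict:
    """Resistance gene counts divided by the total bacterial count.

    The total bacterial count is the sum of hits to the 'Bacteria' and
    'MetaHitAssembly' databases.
    """
    total_bacterial = bacteria_hits + metahit_hits
    if total_bacterial <= 0:
        raise ValueError("no bacterial hits to normalize by")
    return {gene: count / total_bacterial for gene, count in gene_counts.items()}
```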

Extraction of viral RNA and real-time RT-PCR detection of noroviruses

Five mL portions of sampled abattoir material (toilet content including glutaraldehyde and benzalkonium chloride, pH 9.5) were mixed by vortexing with 35 mL PBS and left shaking for 30 min at room temperature to obtain a homogeneous suspension and to allow viruses to detach from the sample material. The mixture was centrifuged at 10,000 × g for 30 min to precipitate as much non-viral material, such as toilet paper, tissue and bacteria, as possible, and the recovered supernatant was adjusted to pH 7.5 using HCl. Viruses were then precipitated from the supernatant by incubation with polyethylene glycol 8000 (80 g/L) and NaCl (17.5 g/L) during agitation (350 rpm) overnight at 4 °C. The pellet was resuspended in 0.75 mL PBS and the suspension was clarified by chloroform/butanol extraction (1:1 vol/vol), during which the mixture was vortexed for 30 sec followed by 5 min storage at room temperature before centrifugation at 10,000 × g for 15 min at 4 °C. Nucleic acids were extracted from the remaining water phase using BioMerieux reagents and miniMag apparatus (BioMerieux, Herlev, Denmark) according to the manufacturer's protocol, except that the nucleic acids were eluted in 100 μL elution buffer. One-step real-time RT-PCR for the detection of NoV genogroup I and II in 2.5 μL extracted RNA was carried out as described elsewhere43 using a StepOnePlus™ System real-time PCR machine (Life Technologies Europe BV, Naerum, Denmark). Quantification of NoV GI and GII genome copies was done by interpolation to standard curves derived from NoV GI.3b and GII.1 RNA transcripts. Inhibition of amplification during RT-PCR detection of NoV GI and GII was evaluated by adding GI.3b and GII.1 transcripts, respectively, as internal amplification controls to each sample RNA extract and to nucleic acid-free water.
The amplification efficiency was calculated by dividing the RNA transcripts recovered from the sample RNA by the RNA transcripts recovered from the nucleic acid-free water and was accepted for values above 50%. The extraction efficiency of viral nucleic acid was evaluated in all samples using Mengovirus, strain vMC 0 (ATCC VR-1597), as internal process control. Prior to extraction, approximately 1 × 10⁴ plaque forming units of MC 0 were spiked into the samples and nucleic acid-free water. After extraction, the levels of MC 0 were determined by real-time RT-PCR43,44. The extraction efficiency was calculated by dividing the MC 0 RNA recovered from the abattoir matrix by the MC 0 RNA recovered from the nucleic acid-free water and was accepted for values above 1%. The extraction efficiency was only used to evaluate the method performance and was not incorporated into the calculation of virus concentration in the air flight samples. Significance testing was performed using the Wilcoxon Rank Sum test.
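The two acceptance checks described above reduce to the same ratio with different thresholds, as in the following minimal sketch. The function and constant names are hypothetical, and the inputs are assumed to be quantities (for example genome copies) already derived from the standard curves.

```python
# Acceptance thresholds as stated in the text, expressed as fractions.
AMPLIFICATION_MIN = 0.50  # amplification efficiency must be above 50%
EXTRACTION_MIN = 0.01     # extraction efficiency must be above 1%


def recovery_fraction(recovered_in_sample: float, recovered_in_water: float) -> float:
    """Quantity recovered from the sample divided by that recovered from water."""
    if recovered_in_water <= 0:
        raise ValueError("the water control must yield a positive quantity")
    return recovered_in_sample / recovered_in_water


def amplification_accepted(transcripts_sample: float, transcripts_water: float) -> bool:
    """Internal amplification control (GI.3b or GII.1 transcripts)."""
    return recovery_fraction(transcripts_sample, transcripts_water) > AMPLIFICATION_MIN


def extraction_accepted(mc0_sample: float, mc0_water: float) -> bool:
    """Internal process control (Mengovirus MC 0 RNA)."""
    return recovery_fraction(mc0_sample, mc0_water) > EXTRACTION_MIN
```

As in the text, the extraction efficiency here serves only as a pass or fail check on method performance and is not used to correct the reported virus concentrations.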

Ethics

This study was conducted in accordance with the Danish Act on scientific ethical treatment of health research, as administered and confirmed by the Research Ethics Committees of the Capital Region of Denmark (www.regionh.dk), Journal nr.: H-14013582.

Sequence data for the flight samples are available for download through ENA.