Research on the human microbiome, the microbiota that live in, on, and around the human person, has revolutionized our understanding of the complex interactions between microbial life and human health and disease. The microbiome may also provide a valuable tool in forensic death investigations by helping to reveal the postmortem interval (PMI) of a decedent that is discovered after an unknown amount of time since death. Current methods of estimating PMI for cadavers discovered in uncontrolled, unstudied environments have substantial limitations, some of which may be overcome through the use of microbial indicators. In this project, we sampled the microbiomes of decomposing human cadavers, focusing on the skin microbiota found in the nasal and ear canals. We then developed several models of statistical regression to establish an algorithm for predicting the PMI of microbial samples. We found that the complete data set, rather than a curated list of indicator species, was preferred for training the regressor. We further found that genus and family, rather than species, are the most informative taxonomic levels. Finally, we developed a k-nearest- neighbor regressor, tuned with the entire data set from all nasal and ear samples, that predicts the PMI of unknown samples with an average error of ±55 accumulated degree days (ADD). This study outlines a machine learning approach for the use of necrobiome data in the prediction of the PMI and thereby provides a successful proof-of- concept that skin microbiota is a promising tool in forensic death investigations.

Funding: Reagents, supplies, and sequencing was made by possible by a grant to NHL from the PSC-CUNY Research Award Program (67672-00-45) and the Seed Funding Program from the John Jay College Office for the Advancement of Research. Student stipends to DT, SG, ZK, and JP were provided by the PRISM program at John Jay College, funded by the Title V and HSI-STEM programs of the US Department of Education and the New York state CSTEP Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: Sequence data are available from the NCBI Sequence Read Archive (SRA) under accession number SUB2118662. Other data underlying the study’s findings are available within the paper and its Supporting Information file.

Copyright: © 2016 Johnson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

In this study, we sampled the bacterial communities in the ear and nasal canals of 17 cadavers, four of them repeatedly, throughout the course of surface decomposition and analyzed those communities with 16S rRNA gene amplicon sequencing. Statistical analysis at all taxonomic levels was used in a machine learning approach toward development of a computational model for prediction of the postmortem interval. To that end, we were successful in constructing a k = 4 nearest-neighbor regression model which accurately predicted the true postmortem interval to within 55 accumulated degree days (ADD), or two days at an average temperature of 27.5°C. We were also able to identify the bacterial taxa that are most informative of our predictive model of decomposition.

To date, most reports on the microbiome have focused on the communities of the GI tract, as it is the site of the richest diversity of microflora to begin with. However, we propose that that the skin microbiome offers several advantages as a site-of-interest for research into the postmortem microbiome. Specifically, we chose the aural and nasal cavities for our study, which are included as sites of interest in the NIH Human Microbiome Project. [ 13 ] These two niches are each unique environments in the body but are also accessible and non-invasive to sample. Using the ear and nasal cavities would offer distinct advantages during an active criminal investigation of a crime scene, as it would leave the cadaver essentially undisturbed.

In the largest and most comprehensive study on the human necrobiome to date, Metcalf, et al. used machine learning methods to characterize how the microbes derived from both the soil and the decomposing human or mouse cadaver assemble into a decomposition ecosystem with a predictable succession of bacterial and fungal organisms. [ 8 ] Using mouse cadavers in laboratory conditions and human cadavers in uncontrolled outdoor conditions, this study found that patterns of microbial succession were surprisingly independent of soil type and seasonal effects. In addition, researchers found many microbial taxa that are active in a similar schedule in both the human and mouse conditions. This implies that the sudden influx of carrion-derived nutrients into the soil initiates a common biological chain of events at the microbial level, regardless of the mammal species from which the carrion derives. This commonality bodes well for the forensic utility of the necrobiome.

Recently, Hauther, et al., repeatedly sampled bacteria from the large intestine of 12 cadavers as they decomposed in outdoor environmental conditions. [ 12 ] This study performed quantitative analysis of bacterial communities using the 16S rRNA gene for phylogenetic identifications and identified three specific genera of bacteria that show specific promise as quantitative indicators of the postmortem interval.

Other interesting work on the necrobiome has probed how microbial communities in cadaver tissue and the soil merge into one dynamic system. Studies by Finley, et al., and Cobaugh, et al. have reported that cadaver-derived microbes can be detected in the nearby soil for up to a year and possibly much longer. [ 10 , 11 ] Importantly, the detection of microbes in nearby soil appears to follow a steady progression and could prove fruitful for forensic estimation of the postmortem interval, even long after a body has been removed the scene.

A study by Pechal, et al. reported on the usefulness of monitoring the succession of bacterial taxa during the course of decomposition of pigs in an uncontrolled environment. [ 7 ] In that study, they were able to build a statistical model using metagenomic sampling that explained nearly 95% of the microbiome changes that occurred through the course of decomposition. This convincingly demonstrated that a data analytics approach can overcome the inevitable noised introduced by sampling from an outdoor environment. An earlier study by the same group described temperate-zone seasonal variations of the swine necrobiome, an important consideration for forensic investigations throughout the global North. [ 8 ] Seasonal variations in the decomposition microbiome of pigs were also reported by Carter, et al., who also reported an important contribution of soil-derived microbes to the decomposition ecosystem of the pig cadavers. [ 9 ]

In a 2013 study Metcalf, et al. endeavored to discover a so-called “microbial clock” to provide estimates of the postmortem interval in mice. [ 6 ] Impressively, the model that they developed was accurate to within three days of error over a period of 48 days of decomposition. It is worth noting, however, that the ambient conditions were held steady during the course of the experiment and insects were excluded in order to reduces sources of environmental variability. The controlled environment is an important first step, but favors construction of a robust predictive model at the expense of attempting to replicate the conditions when the model might actually be used, that is, human decedents discovered after an unknown period of time in uncontrolled environments.

Several recent reports have aimed to describe the postmortem human microbiome. Not surprisingly, the various microbial communities that colonize the human person change considerably following the death of the host as the chemical and biological milieu changes in almost every conceivable way. [ 5 ] The primary goal of research into the postmortem microbiome, or necrobiome, is to aid in death investigations by providing a means to reliably estimate the postmortem interval (PMI). Current methods of estimating the PMI of a deceased human person discovered in an uncontrolled environment are quite crude, involving subjective physical inspection of the cadaver for early-phase decomposition and insect colonization in later-phase decomposition. However, these techniques are notoriously unreliable. The use of entomology, in particular, is confounded by temperature, weather conditions, seasonal variation, geographic location, and many other factors, both known and unknown. The high variability of these PMI estimation techniques makes it clear that additional approaches are needed. Microbiome-based estimates may prove particularly useful in cases where insects are absent or delayed, such as indoors, burials, or in colder temperatures.

The human body is inhabited by a vast number of microorganisms, which have occupied every conceivable ecological niche. Recent advances in sequencing has resulted in a great deal of research focused on the human microbiome. [ 1 ] In particular, the microbiota of the skin is increasingly the subject of research into inter-personal differences and microbe-host interactions, revealing that microbial communities differ between individuals and between different sites on the body. [ 2 ] Compared to that of the gut and oral cavity, the skin microbiome appears to be more influenced by the host environment. [ 3 ] It is also becoming apparent that skin microbial communities play a role in many diseases, including obesity, diabetes, cancer and chronic inflammatory disease. [ 4 ] While the skin microbiome of living individuals has recieved attention, we know much less about the fate of these microbial communities after host death. These microbes are likely strongly influenced by the decomposition environment, which includes insect colonization and changes in soft tissue chemistry.

Because the events of tissue decomposition are equally dependent on time and temperature, we calculated the accumulated degree days (ADD) for each sample, with each 24 hour period postmortem counting as an equivalent to its average temperature in Celsius, as explained in Michaud, et al. [ 14 ] The following assumptions were made in these calculations. First, because cadaver placements at the ARF facility take place between 11:00A.M. and 1:00P.M., and all swabs are taken around that same time, we converted each day into two-half days, halving the average temperature for each day. Secondly, all cadavers were kept at 3°C for all days between the date of death and placement at the facility. Thirdly, if a cadaver was frozen, those days counted as zero toward the cumulative ADD value. Fourthly, average temperature listed for Knoxville, Tennessee, as recorded in Weather Underground ( www.wunderground.com ) was used as a proxy measurement for the average temperature at the ARF facility, which does not track and archive precise local temperature.

The barcoded DNA amplicons were pooled together and delivered to the Genome Technology Center at NYU Langone Medical Center for next-generation 16S metagenomic sequencing using the MiSeq platform (Illumina). Compiled sequence libraries were analyzed and phylogentically classified using the BaseSpace program (Illumina). Spreadsheets with absolute numbers of sequence reads for each taxon in each sample were extracted and transformed to relative abundance measures as explained below.

Next, 25 μL of the purified amplicon PCR product was barcoded by index PCR using the Nextera XT Index Kit from Illumina, as per provided protocol. Following PCR cleanup (again using the AMPure XL kit), 2 μL of each barcoded PCR product was subjected to agarose gel electrophoresis to verify integrity of each sample through visualization of a 630bp band.

Following the amplification step, 2 μL of each sample was run in a 1.5% agarose gel to ensure that the PCR was successful. Only those samples that yielded a clean PCR product at 550bp were included in future steps. PCR clean up was performed using the Agencourt AMPure XP beads kit by Beckman Coulter, following the provided protocol precisely. Final purified DNA was eluted in a volume of 50 μL and [DNA] was determined. Samples with less than 15 ng/μL were concentrated down to ≈ 25 μL using a speedvac before proceeding to the index PCR.

Preparation of sample amplification for sequencing was completed by following the 16S Metagenomic Sequencing Library Preparation protocol by Illumina. Briefly, the V3 and V4 regions of the 16S rRNA gene were amplified using universal degenerate primers in the provided kit. Procedures were exactly according to protocol with the following exceptions: volumes used were 7.5 μL template DNA, 2.5 μL Amplicon PCR Forward Primer (1 M), 2.5 μL Amplicon PCR Reverse Primer (1μM), and 12.5μL 2x KAPA HiFi HotStart ReadyMix for a total of 25μL. This modification was to increase the DNA concentration used per sample, since some samples were too low to follow the recommended protocol precisely.

DNA was extracted using the PowerLyzer PowerSoil DNA Isolation Kit (MoBio Laboratory) precisely according to the manufacturer’s protocol. Final purified DNA was eluted in a final volume of 100 μL. DNA concentration was determined by measuring the absorbance of each sample at 260 nm using the NanoDrop 2000 Spectrophotometer (Thermo Scientific). Only samples harboring at least 1 ng/μL DNA and yielded a clean spectrogram with a peak at 260 nm were included in subsequent amplification steps.

During sample collection, aseptic technique was utilized as much as possible. Because the bodies were placed prostrate, the heads are turned, facing one side of the other. For each cadaver, the nostril and ear canal chosen for sampling was that on the side of the face facing away from the ground, preventing or minimizing the sampling of bacteria from the soil and nearby cadavers. Just prior to each sample collection, the extreme tip of a fresh, sterile, Cap-Shure™ swab was briefly dipped into sterile phosphate-buffered saline (obtained 1X from Fisher Scientific). Then, the swab was placed into the nostril or ear canal until the entire cotton swab was inside the canal. The swab was gently pressed against the outside wall before moving the swab in a circular motion for two complete rotations. The swab was then placed back into its collection tube, which was itself placed back into its plastic wrapping, separately, and then into a sterile sample collection bag. Sample bags were kept at -20°C until DNA extraction was performed.

All samples were collected from cadavers placed at the Anthropological Research Facility (ARF) at the University of Tennessee at Knoxville. The use of deceased human subjects at the ARF does not require IRB approval as the bodies were donated for research purposes to the facility. In addition, this research was reviewed and approved by the Internal Review Board of the City University of New York as part of larger project (protocol #514576). The ARF is a preserved temperate deciduous forest with well-drained fine textured clayey soils. All cadavers were placed on the surface of the soil and allowed to decompose naturally. Bodies were placed in a prostrate position, unclothed, and loosely covered to reduce mass scavenging by large animals. A total of 144 sample swabs were taken from a total of 21 cadavers, most as a single collection event for each cadaver. However, four of these cadavers were swabbed repeatedly through the course of decomposition, starting at placement and continuing every 2–3 days until the tissues were too decomposed to access the ear and/or nose. Data from only these four cadavers was used in the early phase of our computational analysis when the models that were best suited for our data were selected, as explained below. Subsequent computational modeling included data from all suitable samples.

The RFR regressor is an ensemble method, meaning it employs a voting approach based on a family of simpler regressors. In the case of RFR, the simpler family of regressors consists of decision trees. This model neither has parameters nor is it instance-based. A more lengthy discussion can be found in the references. [ 21 ]

The KNR regressor is a simple instance-based approach, which means that no parameters are fit to the data. [ 15 ] Rather, the data itself is stored directly and used to predict the values for new instances. In the KNR case, this happens simply by averaging the known ADD for the k most similar training points for some new (test) instance. The measure of similarity may be Euclidean distance (when the data are regarded as vectors) or something more exotic. Though the KNR regressor does not have parameters, it does have hyperparameters, namely k and specifics of the algorithm used, which might specify the meaning of “similarity” or weight the training vectors in certain ways in an attempt to improve performance. An interesting corollary of the instance-based nature of KNR is that the training accuracy (the accuracy of the predictions the model makes on the training set) is always perfect.

Describing each of these regressors in detail is beyond the scope of this report, but a few general remarks may be helpful. The regressors SVR, RR, RFR, LR and BRR all fit a collection of linear coefficients to the data. They differ from one another and from simple linear regression in the objective function and their constraints, which are optimized during the fitting process. All of these regressors include some kind of regularization which is a penalty assigned in the objective function to complicated solutions. [ 15 ] Usually this amounts to penalizing coefficients which, when viewed as a vector, have a large norm with respect to either the L1 or L2 norms (the Manhattan metric and Euclidean metric, respectively). The values that control the degree to which complicated solutions are penalized are known as the hyperparameters for the regressor. [ 15 ] Fitting the hyperparameters to the data is important to avoid overfitting. Note that for the non-curated dataset, we have n > m in all cases except for kingdom and phylum, necessitating some form of regularization to avoid trivial solutions.

To analyze the data, several regression techniques were considered. To evaluate the effectiveness of the regressors, we used a machine learning approach, splitting the data randomly into two mutually exclusive groups called a training set and a testing set. The train/test split depends only on m and is the same for every taxon. The training set is selected uniformly at random to comprise 80% of the instances. The testing set is then taken as the remaining 20% of instances. [ 17 ] The regressors we employed in our analysis were the following (as implemented by the scikit-learn project, version 0.17): [ 18 ] Support Vector Regression (SVR) [ 19 ], K-neighbors Regression (KNR) [ 15 ], Ridge Regression (RR) [ 20 ], Lasso Regression (LR) [ 15 ], Elastic Net Regression (ENR), Random Forest Regression (RFR) [ 21 ], and Bayesian Ridge Regression (BRR) [ 20 ].

It is not necessary that q D be computed literally with respect to species. For example the p i can be sample composition percentages at the level of phylum, genus, etc, in which case q D represents diversity at the level of the appropriate taxon. For a fixed q, higher values of q D correspond to higher levels of diversity.

Intuitively, when q is small, q D provides a formalization of diversity which emphasizes species richness, whereas when q is large, equal representation of species in the environment (i.e. equitability) becomes more influential. In fact when q = 0, q D is simply the species richnes. Note also that special values of q cause q D to specialize to a number of well known diversity indices. For example, if q = 2, then q D is Simpson’s diversity index , and while q D is undefined for q = 1, it is the case that lim q → 1 q D = exp(H′) where H′ denotes the Shannon index defined by .

Here the variable S refers to species richness (the number of species present) and p i is the proportion of the population accounted for by the i th species. This quantity, which depends on a nonnegative free variable q, can be understood as the reciprocal of the weighted power mean (with power q − 1) of the proportions p i of the various species, where the weight of species i is p i .

In our discussion of diversity, we will use the following quantity to measure diversity of the microbial communities in each swab: [ 16 ]

The exact set of organisms used for each level for both non-curated and curated cases can be found in our supplementary materials ( http://bit.ly/2f4ltDH ). When the data matrix X is built from a spreadsheet, it contains integers for entries, with entry x i,j reflecting the absolute number of instances for organism j detected in sample i. Because this number may depend on, for example, the richness of the swab offered to the analyzer, it is normalized in the following way. For each row i of X, we compute s i = Σ j ≤ n x i,j , and replace with . Note that this form of normalization depends only on the row (i.e. sample), and can be done to new data without information from X. We do not perform column-based normalization (so-called feature normalization) manually, but this may be done in some of the algorithms we use for regression. Consult our code in the supplementary materials for details.

The number of rows in each table is 67 for all data, and the number of columns is the number of organisms, as shown. We also provide the logarithm of the number of columns in each dataset, for later reference.

A separate pruning was done at each level of the taxonomic hierarchy. The effect is that the curated data set has many fewer columns than the non-curated data, though they have an equal number of rows. The curated data is also higher quality in the sense that the selected taxa have unusually high agreement across cadavers as well as higher magnitude correlation values with respect to ADD and percent composition. Table 1 summarizes the dimensions of our data matrices.

For data that came from the four cadavers that were sampled repeatedly throughout decomposition, we also attempted to refine our data matrices into two different types: curated and non-curated. The non-curated data matrices X have one column corresponding to each organism occurring a nonzero number of times in some ear sample, or some nose sample (the same organism is frequently represented twice–once for the ear and once for the nose). The curated data employs a reduced set of organisms (both for ear and for the nose). The pruning of organisms was done manually through visual inspection, considering factors such as the correlation coefficient of the organism with the dependent variable, and the coherence of any correlation across multiple cadavers. To facilitate this process, code was written to plot ADD against percentage composition for each taxon, producing 5243 individual plots. In these images, data was color coded by cadaver, allowing an intuitive judgement to be made regarding the agreement across cadavers, as well as the plausibility of a functional relationship between ADD and percent composition. Fig 1 provides an example of an image of this type.

We regard the data as matrices X in which rows correspond to instances–swabs–and columns correspond to features—organisms (or taxa). The symbol X can be regarded as a multidimensional independent variable, for which there is a corresponding dependent vector variable y. If X is m × n then y is a column vector of length m, where row i of X determines (at least partly) the ADD value found in row i of y. There are a number of different possible matrices X, depending on which swabs, which organisms, and which taxonomic levels are considered. For instance, we may consider only nose swabs and restrict our attention to a special subset of organisms at the phylum level, and attempt a regression of y from the resulting matrix X. In another case, X may include both ear and nose swabs, and refer to the data at the class level. We thus consider several different transformations of our data to see which transformation performs best with respect to the problem of determining y from X. The main differences analyzed refer to whether X contains only ear data, only nose data, or both jointly, and whether all organisms are considered, or only a special curated subset, as described below.

In addition to species classification, each sequence read was also assigned to its proper kingdom, phylum, class, order, family, and genus. Mathematically, the species measurements may be considered to be the raw data, with data corresponding to higher taxonomic levels understood as transformations of the original data. In the parlance of machine learning, this is a form of feature extraction. Considering the data at higher taxa clusters the species into similar kinds, preserving some amount of information, while reducing the number of independent variables, and possibly increasing the effectiveness of the regression. [ 15 ]

The data for this project was produced by collecting a series of bacterial swabs from the ear and nasal canals of cadavers undergoing active decomposition in uncontrolled environmental conditions. Each swab is annotated with a time-since-death measurement that accounts for both temperature and time elapsed in a single variable, accumulated degree days (ADD). The DNA collected in each swab was subjected to quantitative sequencing of the 16S rDNA gene, which allowed detection of individual species and subsequent calculation of its relative abundance in the entire sample. Thus, the raw data consists of percent-abundance of each sequence (representing a bacterial species) and the ADD to which the cadavers were exposed at the time of the swab. In order to study these data as a regression problem, we set the independent variables as the relative abundance of the species (or higher taxa) in each collection swab and the dependent variable as the ADD value for that swab. The aim of our regression analysis was to observe if the dependent variable, ADD, can be determined as a function of the independent variables.

4 Results

4.1 Effect of Decomposition on Microbial Diversity We began the analysis of our data by inquiring if there was a useful relationship between ADD and sample diversity. For a data matrix X, recall that the rows of X represent population samples, and any given row naturally corresponds to p 1 ,p 2 , …, p S as in the definition of qD. Then if qD(X) denotes the column vector whose rows are the diversity values with respect to qD of the corresponding rows of X, we asked if there was any correlation between qD(X) and y. We used Pearson’s correlation coefficient to measure the strength of such a relationship for a number of values of q and several different taxa. Consider Fig 2, below. PPT PowerPoint slide

PowerPoint slide PNG larger image

larger image TIFF original image Download: Fig 2. The images A, B, and C show how the correlation between qD(X) and y depends on the choice of q and the dataset X. The image D shows how diversity changes with ADD for the ear, nose and joint datasets (q = 0.4). https://doi.org/10.1371/journal.pone.0167370.g002 Panels A, B, and C in Fig 2 show Pearson’s r as a function of q for the species, phylum, and order datasets, respectively. Each contains three curves, corresponding to whether X contains data taken only from nose swabs, only from ear swabs, or a combination of both (joint). For most datasets, diversity decreased as ADD increases for the microbial communities found in the ear swabs, as is the case with data from both the ear and the nose swabs considered jointly. Using data from the nose swabs alone yields weaker correlations for the most part, and diversity tends to have a positive (increasing) relationship to ADD. Panel D of Fig 2 shows diversity (q = 0.4) as a function of ADD for the species data. Note that for ear-only data, the plot is predominantly decreasing for the first 100 ADD, after which it becomes comparatively constant. The nose-only data, by contrast, yields a curve that is approximately constant for this choice of taxon and q. It should be noted that in computing the diversity of the joint data, if the same organism occurs both in the ear and in the nose, this is counted as two taxa rather than one. The lowest value attained for the correlation coefficient is r = −0.425 for curated, joint ear and nose, family-level data, with an optimal choice of q = 0.226. The highest value is r = 0.35 for noncurated, nose-only, class-level data for an optimal choice of q = 0.50. The p values for these r are p = 0.00033 and p = 0.0033, respectively. Table 2 displays the r coefficients with the lowest p values for each dataset, as well as the associated q values. PPT PowerPoint slide

PowerPoint slide PNG larger image

larger image TIFF original image Download: Table 2. The most significant correlation found between qD(X) and y for each dataset X, and the optimizing q value. Kingdom data is omitted. https://doi.org/10.1371/journal.pone.0167370.t002