Animal Care and Use Statement

All methods were approved by and carried out in accordance with Monell Institutional Animal Care and Use Committee (IACUC) guidelines and regulations under approved protocol #1176.

Cohort 1: Behavioral odor discrimination assays in the Y-maze

Urine donor care

Thirty-six four-week-old male C57BL/6J mice were obtained commercially from Jackson Laboratories (Bar Harbor, ME). Upon arrival at the Monell Chemical Senses Center animal facility, mice were pair housed in 11.5 × 8-inch plastic shoebox containers with approximately 30 grams of wood shaving bedding per cage (cedar chip laboratory bedding, Northeastern Products Corporation, Warrensburg, NY). Mice had ad libitum access to clean drinking water and chow (Teklad Rodent Diet 8604, Harlan, Madison, WI), and full cage changes were performed weekly. Ambient room temperature was controlled at ~74–76 °F, and mice were housed on a 12:12 light/dark cycle (lights on at 7:00 am and off at 7:00 pm). Animals were checked twice daily for general health and condition. Mice were acclimated to the animal housing facility for ~2 weeks prior to the start of experimental protocols and were therefore ~6–7 weeks of age at the initiation of experiments. All mice were uniquely ear marked upon arrival at Monell to allow individual identification throughout the study.

Urine donor experimental treatment

Cages and individual mice within cages were randomly assigned to experimental treatments (Fig. 1A). To induce physiological and behavioral symptoms of sickness, we injected mice with lipopolysaccharide (LPS). LPS is an endotoxin and an antigenic component of the cell wall of Gram-negative bacteria such as Escherichia coli (E. coli). Injection with LPS acutely reduces physical and social activity, induces lethargy, reduces feeding and drinking, and causes fever and body mass loss23,41,43, but animals typically return to starting body mass and increase activity levels within several days. Because it is not a live replicating pathogen, LPS cannot be transmitted between or among individuals. We obtained LPS from E. coli as a lyophilized powder from Sigma Aldrich Co. (product L2630, St. Louis, MO) and prepared LPS solution in 0.01 M sterile phosphate buffered saline (PBS) at a concentration of 0.20 mg/mL for intraperitoneal delivery of 200 µL (40 µg per mouse, or ~2 mg/kg at an average body mass of 20 g). Mice injected with LPS were referred to as “sick”. Healthy urine donor animals received matching-volume intraperitoneal injections of 0.01 M sterile PBS. PBS does not induce inflammation or immune activation. Mice injected with PBS were referred to in two ways, depending solely on their cage partner. Healthy PBS-injected mice cohoused with other healthy PBS-injected mice were referred to as “healthy”, while healthy PBS-injected mice cohoused with sick LPS-injected mice were referred to as “exposed” (Fig. 1A). Sick and healthy donor mice providing urine for training and validation Y-maze trials were cohoused with partners of the same experimental treatment (Fig. 1A). Healthy and exposed donor mice providing urine for Y-maze generalization trials were cohoused with partners of the same or different experimental treatments (Fig. 1A). 
Consistency in cohousing of all urine donors controlled for the potentially confounding effects of social/pair housing and experimental treatment. Six healthy/PBS-injected and six sick/LPS-injected donors provided urine for odor discrimination experimental training trials, an additional six healthy and six sick donors provided urine for behavioral odor discrimination validation trials, and a final six healthy and six exposed donors provided urine for behavioral odor discrimination generalization trials.
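
The dose arithmetic stated above (0.20 mg/mL delivered in 200 µL to a ~20 g mouse) can be checked with a minimal Python sketch. The function name and its arguments are illustrative and are not part of the original protocol.

```python
def lps_dose(conc_mg_per_ml, volume_ul, body_mass_g):
    """Return (dose in ug, dose in mg/kg) for a single injection."""
    # mg/mL is numerically equal to ug/uL, so concentration * volume gives ug.
    dose_ug = conc_mg_per_ml * volume_ul
    # ug per g of body mass is numerically equal to mg per kg.
    dose_mg_per_kg = dose_ug / body_mass_g
    return dose_ug, dose_mg_per_kg


# Values taken from the text: 0.20 mg/mL, 200 uL, average body mass 20 g.
dose_ug, dose_mg_per_kg = lps_dose(0.20, 200, 20)
print(f"{dose_ug:.0f} ug per mouse, {dose_mg_per_kg:.1f} mg/kg")
```

Because the per-kilogram dose scales inversely with body mass, a mouse lighter or heavier than the 20 g average would receive somewhat more or less than 2 mg/kg from the same fixed 200 µL volume.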

Urine donor collection procedure

Urine is a potent source of body odor in mice and also constitutes one component of scent marks used in territorial behavior65,66,67. Urine was collected directly from donor mice by light abdominal pressure. This is a minimally invasive and extremely rapid (<30 s) method that allows repeated sample collection from the same individual over time. This method of collection was especially valuable in this study because it reduced the possibility of sample contamination from the housing environment or from prolonged exposure of urine to air and microorganisms during the collection process. Investigators also changed gloves between sampling of mice from different treatment groups, further reducing the possibility of odor transfer from one sample to another during urine collection. Donor animals quickly became acclimated to the direct urine collection procedure. Urine volume collected from each individual mouse varied from approximately 5 µL to ~80 µL per day; on some days, individual donors had empty bladders and did not provide a sample. Collected urine was stored separately for each donor mouse and each day of collection, and all samples were transferred to a −5 °C freezer immediately after collection. When we were ready to use urine samples in the behavioral odor discrimination assay trials, we chose 3 samples from the same mouse, collected on different days between 1 and 15 days post-LPS or post-PBS injection (e.g., urine samples collected 2, 4, and 8 days after injection for mouse #1), and pooled these samples to obtain enough urine to be used as a stimulus. In general, pooled urine samples had a volume of ~0.25 mL.

Biosensor mouse care and testing in the Y-maze

Ten mice were initially selected for behavioral odor discrimination bioassays. We called these animals “biosensor mice”, but they have also been referred to in previous studies as “sniffer mice”19,50. Biosensor mice were trained to perform the behavioral task of urine odor discrimination in the Y-maze assay. In this study, biosensor mice were adult (9–18 months old) female C57BL/6J mice, bred and maintained at the Monell Center. These mice initially underwent acclimation to the Y-maze odor discrimination testing apparatus at approximately 6 months of age, after which they were exposed to training panels consisting first of urine from two different mouse strains and later of urine from sick and healthy mice. Initial biosensor mouse training for our specific experiment took place over a range of 7–15 days. Each day of training was carried out with a unique combination of sick and healthy urine donors. In later validation and generalization trials, biosensor mice were exposed to unique and completely novel pairs of urine samples from donor mice. All biosensor mice were housed individually in a separate room from urine donors, but they were maintained under the same environmental and husbandry conditions as urine donor mice. All biosensor mice underwent water restriction for 23 h prior to Y-maze behavioral trials.

A Y-maze apparatus was used to test behavioral odor discrimination of biosensor mice as previously described19,23,41,50 (Supplementary Fig. S1). Air was continuously blown over two 35 mm Petri dishes containing approximately 0.25 mL urine from donor mice. Petri dishes containing urine were randomly assigned and enclosed on either side of the Y-maze “arms” and were inaccessible to physical contact by biosensor mice at all times. Air was blown from the ends of the two arms of the Y-maze through the neck of the maze, and down the central arm of the maze to a gated starting box area, where each biosensor mouse was placed at the start of each new trial. The start box gate was manually raised and lowered in a timed sequence, and two additional gates located at the entrances to the arms of the Y-maze allowed the biosensor mice to be contained after their odor/Y-maze arm choice was made. A positive response to one of the odors in the Y-maze was determined when biosensor mice ceased olfactory scanning activity, or odor tracking, through the Y-maze and displayed characteristic digging and scratching behaviors toward the closed reward box adjacent to the volatilized odor source. Trained biosensor mice often made the decision between the two arms of the Y-maze within 1–3 seconds of opening the start box. During training trials, biosensor mice were rewarded for a correct choice of odor by investigator removal of the metal cover containing the water source (an open tipped conical tube), which allowed the mouse to retrieve a single drop of water. Conical tubes holding a single drop of water were placed on both sides of the Y-maze at all times. In training within the Y-maze, mice can be reinforced to either of the odors presented (i.e., odors of sick or healthy urine donors, in our experimental training paradigm). 
We made the a priori choice to always reinforce biosensor mouse training to the odors of sick mice because we were primarily interested in whether mice could attend to potentially similar odor changes in our generalization trials comparing healthy and exposed urine donor mice (Fig. 1A,B). In experimental extinction, validation, and generalization trials (described in further detail, below) biosensor mice did not receive a water reward, and were instead transferred immediately back to the Y-maze start box after their odor choice was made and recorded. On average, 48 individual trials were run by each biosensor mouse during a single test session, which included rewarded training trials, unrewarded training trials (extinctions), and validation or generalization trials spaced evenly throughout the session. Experimental validation and generalization sessions occurred once per week and were interspersed from week to week. A single session included validation or generalization trials, but never both. A single validation or generalization session typically consisted of 5–6 unrewarded trials with novel donors, 30–40 rewarded training trials, and 5–6 unrewarded training trials. Four different types of experimental trials were used in the Y-maze paradigm, described below.

(a) Rewarded training trials: Biosensor mice received a water reward for a “correct” choice of the odor box containing urine from a sick conspecific. To reduce the possibility that mice were being trained to individual identities rather than donor treatment, unique combinations of training donors were used throughout training. Identity of the samples (sick versus healthy) was known to the operator of the Y-maze in training, so that she could provide an immediate water reward when the correct choice was made. There were two types of training trials of interest. Initial, or primary, rewarded training trials were completed until biosensor mice reached an average of 80% correct responses to sick mouse urine odors in two consecutive blocks of training trials. This average “success threshold” of 80% correct responses was set a priori, and mice were not permitted to continue in further trials if they performed in training below this level. In our study, one of the ten biosensor mice did not successfully perform above the 80% success threshold and was eliminated from further trials. Rewarded experimental training trials were interspersed throughout experimental validation and generalization trials, such that mice were always exposed to rewarded and unrewarded trials within each testing session. The number of rewarded training trials within a validation or generalization session ranged from 24 to 30.

(b) Extinction/unrewarded training trials: Extinction trials were introduced so that biosensor mice could gain exposure to unrewarded trials, which were subsequently used in validation and experimental generalization sessions. Intermittent partial reinforcement also serves to strengthen operant responses68. Unrewarded extinction trials were run with urine from sick and healthy training donors (i.e., urine samples were familiar to biosensor mice because they had been presented at least once before in training). Extinction trials were interspersed within validation and generalization sessions, and ~5 extinction trials for each individual biosensor mouse were run within each testing session.

(c) Validation trials: Experimental Y-maze validation trials served as a type of generalization trial. Validation trials offered an additional level of assurance that biosensor mice were learning about treatment, rather than individual differences. In validation trials, we presented biosensor animals with urine from novel donor mice exposed to the same experimental treatments as training donors (sick versus healthy). Each unique pair was presented only once per validation session, and biosensor mice were typically presented with 5 validation trials per validation session. Investigators were blind to the identity of these samples (a second investigator recorded sample ID but did not run biosensor mice in any trials), and mice were unrewarded for their choice of odor/Y-maze arm. Performance in validation trials was scored at the end of the Y-maze behavioral assay.

(d) Experimental generalization/test trials: Experimental generalization trials allowed us to ask how trained biosensor mice generalized their learned responses to completely new conditions applied to another novel set of urine donors (Fig. 1B). In other words, generalization trials provided a crucial test of our hypothesis. Generalization trials were designed to ask trained biosensor mice to compare the odors of healthy and exposed mice (Fig. 1A,B). As with experimental validation trials, investigators were blind to sample identity, samples were randomly assigned to Y-maze arms, and mice were unrewarded for their choice of odor/Y-maze arm. Performance in generalization/test trials was scored at the end of the Y-maze behavioral assay (i.e., after trials were completed). As with extinction and validation trials, a total of 5 generalization trials were run within each generalization session.

Statistical analysis of biosensor responses in the Y-maze

Statistical procedures applied to the behavioral odor discrimination assays using the Y-maze setup were carried out in R (version 3.3.2)69. Figures were made using R package ‘ggplot2’70. To compare performance in the behavioral odor discrimination Y-maze assay, we used an exact test of binomial proportion. This test examines whether the proportion of biosensor animal responses to one or the other arm of the Y-maze differs significantly from chance, or 50/50 choice of the left or right arm of the Y-maze23. Before carrying out the binomial test, we used a generalized linear model with a Poisson distribution (using the ‘glm’ function in R package ‘lme4’71) to test for main and interactive effects of time (day on which trials were carried out), biosensor mouse ID, and trial type (extinction trials carried out within validation sessions versus extinction trials carried out within generalization sessions) on our outcome of interest (number of correct versus incorrect responses in Y-maze generalization trials). This allowed us to test whether mouse biosensors were responding consistently to unrewarded training samples from sick and healthy urine donors across sessions, and therefore allowed us to rule out the possibility that changes in individual biosensor mouse performance over time were confounded with our question of interest about responses to novel donors in the experimental validation and generalization sessions. We used the log of the total number of trials performed by each biosensor animal as an offset term in this predictive model. Because there were no significant main or interactive effects of time, biosensor mouse ID, or trial type, we pooled biosensor mouse responses across sessions and by trial type. That is, performance for each biosensor animal was averaged across validation sessions (e.g., 15–20 validation trials), across generalization sessions (e.g., 15–20 generalization trials), and across extinction sessions (e.g., 30–40 extinction trials).
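
To illustrate the exact test of binomial proportion against a 50/50 chance level, the following is a minimal standard-library Python sketch. It is not the R code used in the study. The function name and the example counts (16 correct of 20 trials) are hypothetical. The two-sided p-value is assumed to be the sum of the probabilities of all outcomes no more likely than the observed one, which is a common definition for exact binomial tests.

```python
from math import comb


def exact_binomial_two_sided(successes, trials, p_null=0.5):
    """Two-sided exact binomial p-value.

    Sums the null probabilities of every outcome that is no more likely
    than the observed count (small relative tolerance for float ties).
    """
    pmf = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, p_value)


# Hypothetical example: a biosensor mouse chooses the "sick" arm
# in 16 of 20 unrewarded trials.
p_example = exact_binomial_two_sided(16, 20)
print(f"p = {p_example:.4f}")

# A mouse choosing each arm equally often gives no evidence against chance.
p_chance = exact_binomial_two_sided(10, 20)
```

With a symmetric null (p = 0.5), this definition reduces to doubling the one-sided tail probability.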

Cohorts 2 and 3: Statistical predictive modeling based on urine odorants identified by GC-MS

Urine donor care

We examined two additional cohorts of mice (cohort 2, N = 56 mice; cohort 3, N = 20 mice) in the subsequent portion of our study, designed to chemically discriminate urinary volatile profiles. Urine samples from donor mice in cohorts 2 and 3 were subjected to headspace analysis employing gas chromatography-mass spectrometry (GC-MS). As with cohort 1, urine donors in cohorts 2 and 3 were obtained from Jackson Laboratories (Bar Harbor, ME) and were approximately the same age and body size at the time of experiments as urine donors in cohort 1. Furthermore, donor mice providing urine samples for our chemical discrimination assays were treated identically to urine donors for the behavioral discrimination assays in terms of their housing conditions, food and water availability, frequency of cage changes, and handling, as described above.

Urine donor experimental treatment

Experimental treatments randomly applied to individual urine donors in cohorts 2 and 3 were carried out identically to cohort 1. The single difference between urine donors in cohorts 2 and 3 was in housing setup. Urine donor mice in cohort 2 were cohoused (one healthy mouse with another healthy mouse, or one healthy mouse with a sick mouse, similarly to generalization donors in cohort 1) and allowed to freely interact and make contact with each other throughout the cages. Nine cages contained paired healthy mice (n = 18 healthy) and 19 cages contained paired sick and exposed mice (n = 19 sick and n = 19 exposed, Fig. 1A). Urine donor mice in cohort 3 were cohoused in cages of the same size and dimensions as cohort 2, but cages were divided with plastic, semi-permeable/perforated inserts, which permitted olfactory, auditory, and slightly obscured visual communication between sides of the cage (Supplementary Fig. S6). Each cage divider had eight 0.5 mm holes at regularly spaced intervals. Dividers prevented bodily contact between mice and prevented contact with virtually all bedding and waste materials of animals on the adjacent side of the cage. Cage tops were also open to the environment, and therefore permitted air/volatile odorant transfer both above and through the cage partitions. In cohort 3, three cages contained paired healthy mice (n = 6 healthy) and 7 cages contained paired sick and exposed mice (n = 7 sick and n = 7 exposed, Fig. 1A).
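
As a consistency check on the cage counts above, the following minimal Python sketch tallies donors per treatment and compares the totals with the cohort sizes reported for cohorts 2 and 3 (N = 56 and N = 20). The dictionary keys and function name are illustrative only.

```python
# Cage counts and reported cohort sizes, taken from the text.
COHORTS = {
    "cohort 2": {"healthy_pair_cages": 9, "sick_exposed_cages": 19, "reported_n": 56},
    "cohort 3": {"healthy_pair_cages": 3, "sick_exposed_cages": 7, "reported_n": 20},
}


def tally(cages):
    """Count mice per treatment from the number of cages of each type."""
    healthy = 2 * cages["healthy_pair_cages"]  # two healthy mice per cage
    sick = cages["sick_exposed_cages"]         # one sick mouse per cage
    exposed = cages["sick_exposed_cages"]      # one exposed mouse per cage
    return {
        "healthy": healthy,
        "sick": sick,
        "exposed": exposed,
        "total": healthy + sick + exposed,
    }


totals = {name: tally(cages) for name, cages in COHORTS.items()}
for name, counts in totals.items():
    matches = counts["total"] == COHORTS[name]["reported_n"]
    print(name, counts, "matches reported N:", matches)
```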

Urine donor collection procedure

Urine collection procedures for donor mice in cohorts 2 and 3 were identical to those for cohort 1. Urine samples were never pooled for chemical analysis. At least 25 µL of urine was required per sample to run chemical analysis.

Urine analysis procedures with GC-MS

Urine samples (25 µL) from individual donor mice were prepared for chemical analysis by fortification with an internal standard, 800 ng l-carvone. We added 10 µL of an aqueous fortification solution of carvone to mouse urine. Vials containing no urine samples (carvone blanks) were also fortified with 10 µL of the carvone solution and analyzed by GC-MS. Samples were subjected to headspace analysis using an HT3 dynamic headspace analyzer (Teledyne Tekmar, Mason, OH, USA) outfitted with a Supelco Trap K Vocarb 3000 trap (Sigma-Aldrich Co., St. Louis, MO), using splitless injection as described elsewhere24. Briefly, during headspace collection, samples were maintained at 40 °C and swept with helium for 10 minutes (flow rate of 75 mL/min), and the volatiles were collected directly on the thermal desorption trap. Trap contents were desorbed at 260 °C directly into a Thermo Scientific ISQ single-quadrupole gas chromatograph-mass spectrometer (GC-MS; Thermo Scientific) equipped with a 30 m × 0.25 mm id Stabilwax-DA fused-silica capillary column (Restek). The GC oven program consisted of an initial temperature of 40 °C (held for 3 min) followed by a ramp of 7 °C/min to a final temperature of 230 °C (held for 6 min). The mass spectrometer was operated in scan mode from 33 to 400 m/z. Volatile compounds were identified by spectral library matching in the NIST 528 Standard Reference Database (Supplementary Table S1). For known mouse urine volatiles with poor spectral library matches (e.g., <50% match), we examined relative retention times among known mouse urine volatiles on Stabilwax columns to assist with compound identification51,52,53. Baseline correction, noise elimination, and peak alignment of the chromatographic data were performed with Metalign72, and the MSClust tool was used to perform mass spectral extraction for generation of selected ion chromatographic peak responses73.
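
The oven program and fortification figures above imply two derived quantities that are not stated explicitly: the total length of the oven program, and the carvone concentration of the fortification solution. The minimal Python sketch below computes both. It assumes the full 800 ng of l-carvone is delivered in the 10 µL addition, and the function and variable names are illustrative.

```python
def oven_program_minutes(start_c, start_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total oven program time: initial hold + linear ramp + final hold."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return start_hold_min + ramp_min + final_hold_min


# 40 C held 3 min, ramp 7 C/min to 230 C, held 6 min (values from the text).
run_min = oven_program_minutes(40, 3, 7, 230, 6)

# Assumption: 800 ng of l-carvone is contained in the 10 uL fortification volume.
carvone_ng_per_ul = 800 / 10

print(f"oven program: {run_min:.1f} min; carvone solution: {carvone_ng_per_ul:.0f} ng/uL")
```

The oven time excludes the 10-minute headspace sweep and any desorption or cool-down time, which are not specified here.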

We averaged peak responses over two samples for each individual, consisting of urine collected between days 4 and 15 post treatment exposure. For all mice, a single sample within the range of days 4–7 and another sample within the range of days 9–15 were analyzed. In several cases, we did not collect enough urine from a mouse, and these animals were omitted from our analysis (i.e., starting sample size was not identical to sample size in analyses). Urine samples for both cohorts of mice were interspersed throughout our GC-MS runs to avoid the potentially confounding effect of time on instrumentation results.

Statistical predictive modeling of odorants described by GC-MS

We analyzed chemical data from the two cohorts of mice (cohorts 2 and 3) separately, but the analysis steps were identical (and similar to previously published methods24): model/feature selection from an initially larger number of all possible identified chromatographic peaks, followed by linear discriminant analysis and cross-validation performed on the training data set, and subsequent classification-based predictions on a novel test data set. We performed stepwise model selection using the package ‘klaR’74 on the training data set, consisting of healthy and sick mouse samples, using the function ‘greedy.wilks’. We stopped including or excluding peaks in the stepwise selection process when the squared canonical correlation value was at least 0.60 (or 60%). We set this threshold for model selection a priori, as previous experience with similar data sets suggested that a threshold of 0.60 would likely be adequate for the desired classification performance of the model (e.g., to balance the bias-variance tradeoff) in cross-validation and subsequent predictions on the test data set. We applied a linear discriminant analysis with the ‘lda’ function in R package ‘car’75 and used leave-one-out cross-validation to assess the accuracy of our model, fit to the training data set. Finally, we used the ‘predict’ function to perform predictive classification of our test data set, which included exposed mice in cohorts 2 and 3 (analyzed separately). The test data set was not used in initial model building/feature selection. We present results as the number of exposed mice in the test data set classified as sick versus the number of exposed mice classified as healthy. To statistically compare the difference in classification of exposed mouse samples as sick versus healthy, we used a two-tailed non-parametric Wilcoxon rank sum test with continuity correction. 
This statistical test was performed using the posterior probability scores for classifying each individual test observation as sick or healthy. Posterior probability scores represent the degree of certainty (expressed as a percentage) with which each test sample was predictively classified as sick versus healthy. The Wilcoxon test examined the hypothesis that the mean posterior probabilities were equal for sick versus healthy classification. Thus, a significant P value suggests a significant difference in the average probability of exposed mice being classified as sick versus healthy. Although it was not the intent of this study to make pairwise comparisons of individual compounds identified using model selection techniques, we present supplementary results of pairwise two-sample t tests (or a non-parametric alternative), comparing compound levels between sick and healthy mice, in Supplementary Table S2A,B.
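
The comparison of posterior probabilities can be sketched as a two-tailed Wilcoxon rank sum test using the normal approximation with continuity correction. The standard-library Python code below is an illustration under stated assumptions, not the R code used in the study. The posterior probabilities are invented for the example, and the two samples are assumed to be each exposed mouse's posterior probability of being classified "sick" and of being classified "healthy".

```python
from collections import Counter
from math import erfc, sqrt


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (W, two-sided p) using a tie-corrected normal approximation
    with continuity correction. Not defined if every value is identical."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    w = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # rank sum of x minus its minimum
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    diff = w - n1 * n2 / 2
    correction = 0.5 if diff > 0 else (-0.5 if diff < 0 else 0.0)
    z = (diff - correction) / sqrt(variance)
    return w, erfc(abs(z) / sqrt(2))  # erfc gives the two-sided normal tail


# Hypothetical posterior probabilities of "sick" for seven exposed mice.
p_sick = [0.91, 0.84, 0.77, 0.95, 0.66, 0.42, 0.88]
p_healthy = [1 - p for p in p_sick]  # two-class posteriors sum to 1

w_stat, p_value = rank_sum_test(p_sick, p_healthy)
print(f"W = {w_stat:.0f}, p = {p_value:.4f}")
```

For small samples without ties, statistical packages often report an exact p-value instead of this normal approximation, so the two need not agree precisely.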