Study species and sample collection

North Atlantic right whales were sampled in daylight hours (between 06:00 and 18:00) in calm seas (<3 Beaufort sea state) along the eastern Atlantic seaboard, where right whales congregate for seasonal feeding. Blow samples were collected using a sampling device (see below) fastened to the end of a carbon fiber pole (9.75 m long), which was mounted to a cantilevered pivot on the foredeck of an 8-m research vessel4 (Fig. 1). During sample collection, the vessel slowly approached an individual whale at idling speed on a gradually converging course to minimize disturbance to the whale. The sampling device was held ~3–4 m above the water and rotated skyward to avoid seawater contamination until there was a good chance of obtaining a sample. On anticipating surfacing behavior of the whale, the pole was extended and lowered to position the sampling device above the exhaling blowholes (0.2–0.8 m) to catch a portion of the aerosol droplets. To evaluate the efficiency of blow sample collection, we recorded the sampling outcome for every whale that was approached for a blow sample (sample successfully collected or no sample). Date, time, and location (latitude/longitude) of collection were also recorded. Field research on right whales was approved by the New England Aquarium’s Animal Care and Use Committee (IACUC) and carried out under the U.S. National Marine Fisheries Service permit number 14233 and Canada’s Department of Fisheries and Oceans permits under the Species at Risk Act.

Sampling devices

Two different sampling devices were used to examine the practicality of each material for collecting blow from free-swimming whales, and to provide a sampling device comparison. Both devices had passed prior laboratory validations69, but each sampling material had different physical qualities for collecting a volume of sample. The preferred device for optimal analytical precision69 was a sterile polystyrene dish (25 cm × 25 cm; Corning® bioassay dish CLS431111, Sigma-Aldrich, St Louis, MO, USA; ‘dish’ hereafter) (Fig. 1). The second sampling device was a single ply of 110 µm nylon mesh (cut to 30 cm × 30 cm; Nitex nylon, Elko Filtering, Miami, FL, USA; ‘nylon mesh’ hereafter) stretched over a clean plastic framework; nylon mesh has previously been used in published studies as a collection material for cetacean blow4,32,33. Before use, the nylon mesh was thoroughly washed to remove potentially interfering exogenous particles, using separate wash cycles of soapy water, distilled water, and 70% ethanol as previously described69.

Sample quality score

Every sample of whale blow collected was subjectively scored for quality, based on the proximity of the sampling device to the whale’s exhaling blowholes and the amount of visible blow droplets collected. Sample quality scores were: fair = sampler was in the exhaled vapor at >2 m above the blowholes, collecting diffuse fine droplets; good = sampler was 1–2 m above the blowholes, collecting coarse droplets covering <30% of sampling surface; excellent = sampler was <1 m above the blowholes, collecting coarse droplets across >30% of sampling surface. This qualitative score was recorded on the presumption that it characterized the amount of respiratory fluid collected on a sampling device4, with samples scored as excellent likely holding greater sample volume. If the sampling effort was not successful or poorly scored, we redeployed the sampler to collect from the same whale until it dived, recording the number of collection attempts (up to four blows collected for a given sample), and the revised quality score. Immediately after collection, the sampling device was placed in a protective zip-type bag, detached from the pole, and stored on ice packs in a cooler before being frozen at −80 °C upon return to shore (typically within 4 ± 0.2 h of sample collection). Previous testing has confirmed that steroid hormones are stable under these field storage conditions for at least 6 hours69.

Assigning reproductive state

Sampled whales were photographed to enable individual identification, based on unique markings such as callosity patterns and scars, and to obtain life history data using the North Atlantic Right Whale Identification and Sightings Database55. Photo-identification of individual whales was performed after conclusion of fieldwork by expert personnel using well-established protocols54. Whales were categorized as juveniles (1–8 y.o. and never calved) or adults (year before first calving or ≥9 y.o.)70. Pregnant females were confirmed by multiple sightings with a dependent calf in the year after sampling. This method of identifying pregnancy in females would not account for perinatal mortality, spontaneous abortion or undetected embryogenesis; i.e., we cannot rule out the possibility that some females re-sighted without calves may in fact have been pregnant the year prior.

Evaluation of other sampling influences

When an animal perceives a stressor, a typical physiological response involves a measurable increase in circulating glucocorticoids (including cortisol) within 5 minutes65. Blow sampling does not make contact with the whale; however, the use of an extended pole to collect the blow sample does necessitate a close vessel approach (ca. 5–10 m) over a period of time, such that the duration of the sampling event and repeated sampling of an individual may influence the adrenal stress response of a whale and, ultimately, might affect cortisol concentrations in the collected blow sample. Therefore, we recorded the duration of sampling for each whale (number of minutes between initiation of the slow approach towards the whale and collection of the sample), as well as whether or not the identified whale had previously been sampled during the study period (categories of first sample collected, repeat sample on same day, or repeat sample on a different day).

Sample analysis

Sample extraction

Obtaining a true volumetric measurement of respiratory vapor in each sample collected from a large whale was not possible due to various factors (see Introduction), especially unknown seawater contamination4,9 and air-dried sample potentially adhering to the sampling surface69. To process samples, we used methods validated by Burgess et al.69 for extracting hormones from low-volume samples collected on dish and nylon mesh devices. In brief, dish samples were extracted by pouring 50 mL of 100% ethanol (EtOH) onto the dish surface, which was then lidded and gently agitated on a plate-shaker for 30 min. This EtOH rinse was decanted into 25 × 125 mm borosilicate glass tubes and dried under compressed air for 24 h. Nylon mesh samples were extracted by pouring 80 mL of 100% EtOH over each mesh inside a 120-mL polypropylene jar. The jar was vigorously mixed on a plate-shaker for 1 h, after which the liquid was decanted into 25 × 125 mm borosilicate glass tubes. The nylon mesh component was centrifuged at 4000 g for 15 min to separate additional liquid, which was added to the glass tubes. The zip-type bag that held the nylon mesh sample was also rinsed with 20 mL of 100% EtOH. The combined ~100 mL EtOH rinse was dried in glass tubes under compressed air for 24 h. All samples were reconstituted in 1.0 mL of dH2O (= total extract volume; N.B. extract volume does not relate to the original [unknown] sample volume of respiratory droplets collected at sea), and stored frozen at −80 °C until hormone analysis.

Urea analysis

Urea was investigated in whale blow because of its use as a normalization factor in several studies49,50, and to examine the assumption that urea amount reflects the blow concentration of samples collected from large whales. Two other potential biomarkers for evaluating dilution, albumin and creatinine, were investigated in supplementary trials, but neither compound was detectable in blow sample extracts using commercial assay kits; further investigation of these compounds was therefore halted in favor of developing the urea analyses.

Blow sample extracts were analyzed for urea using a colorimetric detection kit (#K024-H1; Arbor Assays, Ann Arbor, MI) designed to quantitatively measure urea nitrogen in various sample types (including saliva, another fluid with low urea concentration). The urea assay is more sensitive to sample turbidity than the hormone assays; therefore, all samples were clarified before urea analysis via centrifugation at 5000 g for 10 min, followed by filtration of a 100 µL aliquot of the resulting supernatant through a 0.22 µm pore membrane unit (#SLGVX13NL, Millex-GV hydrophilic PVDF membrane filter; EMD Millipore, Darmstadt, Germany) using a disposable Luer-Lock™ syringe (1 mL; #14-823-30, BD, NJ). For the urea assay, nine standards (0.04–10.0 mg/dL; assay sensitivity = 0.01 ± 0.02 mg/dL) and clarified samples (undiluted) were loaded in duplicate as 30 µL volumes and mixed with kit reagents in a 96-well microtiter plate. This assay was performed at 60% volume (i.e., all reagent volumes were reduced to 60% of that stated in the manufacturer’s protocol) to minimize the volume required from each blow sample; in-house testing verified that a 60%-volume protocol maintains good assay performance with acceptable accuracy and sensitivity (data not shown). Next, the plate was incubated at room temperature for 30 min before reading the optical density at 450 nm. Urea concentrations were determined using a four-parameter logistic model based on the standard curve. Raw assay results for urea nitrogen were converted to urea by multiplying by 2.14, and expressed as milligrams of urea per deciliter of extract volume (absolute urea mg/dL extract).
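The unit handling in the last step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the factor 2.14 is the one stated above (it is close to the ratio of the molar mass of urea to that of its two nitrogen atoms, about 60.06 / 28.01), and the mg/dL to mg/mL step simply uses 1 dL = 100 mL.

```python
# Hypothetical helper names; only the 2.14 factor comes from the text.
UREA_PER_UREA_N = 2.14   # urea nitrogen -> urea, as stated in the Methods
ML_PER_DL = 100.0        # 1 deciliter = 100 milliliters


def urea_n_to_urea_mg_dl(urea_n_mg_dl: float) -> float:
    """Convert a urea nitrogen reading (mg/dL) to urea (mg/dL)."""
    return urea_n_mg_dl * UREA_PER_UREA_N


def mg_dl_to_mg_ml(value_mg_dl: float) -> float:
    """Re-express a concentration from mg/dL as mg/mL."""
    return value_mg_dl / ML_PER_DL


# Illustrative reading only (not a value from the study).
urea_mg_dl = urea_n_to_urea_mg_dl(0.50)
urea_mg_ml = mg_dl_to_mg_ml(urea_mg_dl)
print(urea_mg_dl, urea_mg_ml)
```

The second helper matters only when urea is later combined with hormone values expressed per milliliter of extract.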

Hormone analysis

Enzyme immunoassay kits (EIA; Arbor Assays, Ann Arbor, MI) were used for the quantification of progesterone (#K025-H1), testosterone (#K032-H1) and cortisol (#ISWE002) in all samples. Assay methods were performed according to manufacturer instructions (see http://www.arborassays.com), except that an additional low standard was included in each standard curve to increase the detection range, i.e., the assay standard curve for progesterone ranged from 0.025 to 3.2 ng/mL (8 standards; assay sensitivity = 0.012 ± 0.008 ng/mL); for testosterone from 0.021 to 10.0 ng/mL (8 standards; assay sensitivity = 0.006 ± 0.006 ng/mL); and for cortisol from 0.0125 to 3.2 ng/mL (9 standards; assay sensitivity = 0.003 ± 0.002 ng/mL). For progesterone and testosterone assays, blow extracts were diluted at 1:4 with assay buffer (#X065; Arbor Assays). Some samples were re-assayed at 1:2 dilution to bring assay results nearer to 50% binding for best assay precision. Seventeen samples for progesterone and one for testosterone were below the limit of assay detection. For cortisol, blow extracts were analyzed undiluted; all had detectable cortisol. Each plate contained a configuration of standards, non-specific binding wells, maximum binding wells and controls run in triplicate (i.e., in duplicate at the beginning of the plate and singular at the end), and samples were assayed in duplicate. Hormone concentrations were determined using a four-parameter logistic model based on the standard curve. Raw assay results were expressed as nanograms of hormone per milliliter of extract volume (absolute hormone ng/mL extract) [N.B. extract volume was always 1.0 mL and absolute hormone values are not relative concentrations, since all samples were dried and reconstituted in 1.0 mL dH2O].
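For readers unfamiliar with the four-parameter logistic (4PL) model, the sketch below shows how a concentration is read back from a measured response once the curve has been fitted. It assumes one common 4PL parameterization; the curve-fitting step itself is not shown, and the parameter values are hypothetical, chosen only to illustrate the shape of a competitive-binding curve.

```python
def four_pl(x, a, b, c, d):
    """4PL response at concentration x.

    a: response at zero concentration, d: response at infinite concentration,
    c: concentration at the curve midpoint, b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration that gives response y (y must lie strictly between d and a)."""
    lo, hi = min(a, d), max(a, d)
    if not lo < y < hi:
        raise ValueError("response lies outside the range of the fitted curve")
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters: response in % binding, concentration in ng/mL.
params = dict(a=100.0, b=1.2, c=0.4, d=0.0)

binding = four_pl(0.2, **params)            # response for a 0.2 ng/mL standard
recovered = inverse_four_pl(binding, **params)
print(round(recovered, 6))                   # round trip returns ~0.2
```

With these parameters the curve midpoint (x = c) sits at 50% binding, the region the text identifies as giving the best assay precision.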

Assay quality control and verification

Since the assay kits used here were not designed for use with whale respiratory vapor samples, the suitability of each assay kit for measuring blow extracts was assessed. Parallelism and accuracy validation tests71 were performed using a pool of sample extract to ensure that antibodies and reagents recognized the targeted analyte in whale blow in a predictable manner and without interference (see Supplementary information). The binding of serial dilutions of blow extract (neat to 1:64) was parallel to the standard curve in immunoassays (progesterone: F(1,9) = 3.06, P = 0.11; testosterone: F(1,8) = 0.16, P = 0.70; cortisol: F(1,13) = 0.63, P = 0.44; see also validations by Hunt et al.4) or colorimetric assay (urea: F(1,12) = 2.31, P = 0.15; see Supplementary Figure 1a), indicating that substances in whale blow extracts do not interfere with antibody binding. All assays exhibited accuracy at their target dilutions (progesterone: slope = 1.14, r² = 0.99; testosterone: slope = 0.83, r² = 0.99; cortisol: slope = 0.72, r² = 0.99; urea: slope = 0.99, r² = 0.99), verifying reliable determination of analyte concentrations in right whale blow samples due to good mathematical accuracy across a range of concentrations from very low to high (see Supplementary Figure 1b; see also validations by Hunt et al.4). Additionally, seawater samples collected during field sampling were analyzed in each assay, with zero detectable hormone or urea measured.

To monitor precision and reproducibility in assays, high (~30% binding) and very low (~90% binding) concentration control samples were run on each plate (n < 6 assays performed for each analyte). All assays were performed by the same person, and any sample with a coefficient of variation (CV) between duplicates of >10% was re-assayed72. For all assay types, the intra-assay CVs between sample duplicates were <8.2% (2.6 ± 0.3%), and the inter-assay CVs were <5.4% (2.9 ± 1.0%) and <10.5% (6.4 ± 1.4%) for high and low concentration controls, respectively. To quantify assay sensitivity, zero-standard replicates (n = 20 wells) were analyzed, with sensitivity calculated as the mean of assay results for zero-standard replicates ±2 standard deviations72.
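The two quality-control calculations described above can be sketched as follows. The text does not state which standard deviation formula was used, so the sample standard deviation is an assumption here, and the function names and example readings are hypothetical.

```python
from statistics import mean, stdev


def duplicate_cv_percent(rep1: float, rep2: float) -> float:
    """Coefficient of variation (%) between two duplicate wells.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    values = [rep1, rep2]
    return 100.0 * stdev(values) / mean(values)


def needs_reassay(rep1: float, rep2: float, threshold: float = 10.0) -> bool:
    """Flag a sample whose duplicate CV exceeds the 10% cut-off from the text."""
    return duplicate_cv_percent(rep1, rep2) > threshold


def sensitivity(zero_standard_results):
    """Return (mean, 2 * SD) of zero-standard replicates, i.e. mean ± 2 SD."""
    return mean(zero_standard_results), 2.0 * stdev(zero_standard_results)


# Illustrative duplicate readings in ng/mL (not data from the study).
print(needs_reassay(1.0, 1.1))   # duplicates agree closely
print(needs_reassay(1.0, 1.3))   # duplicates diverge
```

Returning the mean and the 2 SD term separately mirrors how sensitivities are reported in the Methods (for example, 0.012 ± 0.008 ng/mL).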

Assay interference

Various materials used in the collection and processing of blow samples have been shown to introduce consistent low levels of assay interference69; therefore, we tested for exogenous interference in all assays for both sampling devices. As recommended by Burgess et al.69, blank dish and nylon mesh materials were extracted and processed following the same procedure as a biological sample (n = 20 for each sampler type), and then assayed for urea and all three hormones to estimate background (spurious) measures in these negative control samples. Based on the results (see Supplementary Table 1), sample concentrations for urea, progesterone, testosterone and cortisol were all adjusted for low and consistent background levels by subtracting the mean negative control concentration for that sampling device from the observed concentration. This correction helped to evaluate whether adequate blow sample had been collected for hormone analysis, since only those sample measurements greater than the known assay interference for the sampling device were retained in the dataset (i.e., limit of detection). All analyte data are reported corrected for assay interference.

Statistical analysis

The rate of success for collecting a blow sample from an approached whale during each day of fieldwork was analyzed using Chi-square analysis. Spearman rank correlation was used to investigate whether the quality score of samples improved with the number of days spent sampling. Data on whale identity, sex, and reproductive state were integrated with blow analysis results for all samples. Analyte data were all modeled in a generalized linear model (GLM) framework using a log-link function and gamma distribution, which better accounted for the right-skewed distribution of measured concentrations in blow.

To investigate urea as an indicator of the dilution of collected blow samples, we used a GLM to examine the quality score of samples [fair, good or excellent] and the type of sampling device [dish or nylon mesh] as explanatory variables of absolute urea concentration in blow extracts (mg/mL extract; response variable). An interaction term (quality score × sampling device) was included to consider the possibility that urea concentrations may not exhibit the same changing relationship across quality scores in dish and nylon mesh samplers. Given that urea is typically maintained in a narrow concentration range in the body47, samples with poorer quality scores were predicted to have consistently low levels of absolute urea per mL of extract to reflect lower volumes of respiratory fluid collected. Conversely, samples with better quality scores were predicted to have the highest levels of absolute urea per mL extract. Detecting these trends would indicate that urea content of extracts reflects the amount of respiratory fluid collected from a whale (i.e., urea is a meaningful dilution indicator), which is a fundamental validation before proceeding to test urea as a normalizing factor for the variable dilution of respiratory droplets. Based on the results, hormone assay results were normalized against the amount of urea in each extract, using the formula: blow hormone concentration normalized (ng/mg urea) = absolute hormone (ng/mL extract) / absolute urea (mg/mL extract). Thus, hormone concentrations of blow samples were quantified as nanograms of hormone per milligram of urea (ng/mg urea).
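As a worked illustration of the normalization formula, the sketch below divides an absolute hormone value by an absolute urea value from the same extract. The function name and input values are hypothetical; the only assumption beyond the formula is that urea has already been expressed per milliliter of extract (a value in mg/dL would first be divided by 100).

```python
def normalize_to_urea(hormone_ng_per_ml: float, urea_mg_per_ml: float) -> float:
    """Normalized hormone concentration in ng of hormone per mg of urea.

    Both inputs refer to the same 1.0 mL extract, so the extract volume
    cancels: (ng/mL) / (mg/mL) = ng/mg.
    """
    if urea_mg_per_ml <= 0:
        raise ValueError("urea must be positive to normalize against it")
    return hormone_ng_per_ml / urea_mg_per_ml


# Illustrative values only: 0.8 ng/mL hormone, 0.02 mg/mL urea.
print(normalize_to_urea(0.8, 0.02))   # about 40 ng/mg urea
```

Because the extract volume cancels, two samples with the same hormone-to-urea ratio give the same normalized value regardless of how much respiratory fluid each sampler happened to catch, which is the property the dilution hypothesis relies on.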

Hormone data were analyzed using generalized linear mixed models to allow both fixed and random components to be fitted to a model; in this case, individual whale (i.e., whale identity) was included as a random effect to account for individual-level variability. For models, we incorporated predicted explanatory variables that were known for each right whale and each sampling event, including whale sex and age class [juvenile or adult], the duration of the sampling event [binned as ≤3 min, 4–10 min or >10 min], time of day [hour integers, 6:00–18:00], sample quality score, type of sampling device, and sample occurrence for an individual whale [categories of first sample collected, repeat sample on same day, or repeat sample on a different day]. A set of competing a priori models was generated that tested different ways in which blow hormone concentrations – progesterone and testosterone as sex hormones (6 models), and cortisol as a stress-related hormone (8 models) (see Table 1) – could vary as a function of effects associated with life history traits and stress-related influences (i.e., biologically meaningful measures), or from non-biological variables associated with sampling artifacts (i.e., measures that might indicate sample concentration was not adequately standardized for dilution). N.B. both dish and nylon mesh samples were retained in the dataset because ‘type of sampling device’ provided an additional and useful non-biological variable with which to evaluate the validity of hormone normalization results. We hypothesized that normalized hormone values (ng/mg urea) would demonstrate reliable quantification of hormone concentration in different whale blow samples, exhibiting stronger associations with biologically relevant factors than sampling artifacts.
Finally, we conducted the same model analyses on the dataset of absolute hormone values (ng/mL extract) that were not adjusted for differing amounts of sample collected, and therefore, not standardized for true sample concentration (i.e., not relative data). Inclusion of model results for non-normalized hormone data permits comparison between normalized data outcomes and random expectations.