Significance Emission inventories of major climate-forcing agents like black carbon suffer high uncertainty for the early industrial era, thereby limiting their utility for extracting past climate sensitivity to atmospheric pollutants. We identify bird specimens as incidental records of atmospheric black carbon, filling a major historical sampling gap. We find that prevailing emission inventories underestimate black carbon levels in the United States through the first decades of the 20th century, suggesting that black carbon’s contribution to past climate forcing may also be underestimated. This study builds toward a robust, spatially dynamic inventory of atmospheric black carbon, highlighting the value of natural history collections as a resource for addressing present-day environmental challenges.

Abstract Atmospheric black carbon has long been recognized as a public health and environmental concern. More recently, black carbon has been identified as a major, ongoing contributor to anthropogenic climate change, thus making historical emission inventories of black carbon an essential tool for assessing past climate sensitivity and modeling future climate scenarios. Current estimates of black carbon emissions for the early industrial era have high uncertainty, however, because direct environmental sampling is sparse before the mid-1950s. Using photometric reflectance data of >1,300 bird specimens drawn from natural history collections, we track relative ambient concentrations of atmospheric black carbon between 1880 and 2015 within the US Manufacturing Belt, a region historically reliant on coal and dense with industry. Our data show that black carbon levels within the region peaked during the first decade of the 20th century. Following this peak, black carbon levels were positively correlated with coal consumption through midcentury, after which they decoupled, with black carbon concentrations declining as consumption continued to rise. The precipitous drop in atmospheric black carbon at midcentury reflects policies promoting burning efficiency and fuel transitions rather than regulating emissions alone. Our findings suggest that current emission inventories based on predictive modeling underestimate levels of atmospheric black carbon for the early industrial era, which implies that the contribution of black carbon to past climate forcing may also be underestimated. These findings build toward a spatially dynamic emission inventory of black carbon based on direct environmental sampling.

Black carbon, the light-absorbing component of soot, is a complex carbonaceous aerosol that results from the incomplete combustion of organic matter, such as fossil fuels (1). Starting in the mid-19th century, cities within the US Manufacturing Belt—such as Chicago, Detroit, and Pittsburgh—experienced sharp rises in atmospheric soot due to their reliance on regional supplies of highly volatile soft, bituminous coal for manufacturing, domestic heating, and railway transportation (2). By the late 19th century, the palls of coal smoke hanging over industrial cities galvanized early civic reformers, who fought urban smoke pollution as an unsightly nuisance, an economic inefficiency, and a public health concern tied to respiratory illness and increased mortality (2, 3). These early, city-level efforts to mitigate atmospheric soot laid the groundwork for the modern environmental movement in the United States. While US cities no longer experience levels of atmospheric black carbon comparable to the historic peaks of the early 20th century, particle pollution remains a pressing public health and environmental issue in the United States and globally (4, 5).

Black carbon has more recently become recognized as a major contributor to anthropogenic climate change (4, 6, 7). As such, historical emission inventories are consequential for understanding black carbon’s effect on past climate and accurately modeling future climate scenarios. Estimates of black carbon emissions, however, have high uncertainty for the early industrial era (1), limiting our ability to use past emissions data to extract climate sensitivity. In the United States, efforts to measure concentrations of atmospheric soot were limited to sporadic city-level surveys before the mid-1950s (8), when federal legislation targeting air pollution gave rise to a coordinated national network for atmospheric monitoring (2, 3). As a result, our current understanding of atmospheric black carbon levels before midcentury in the US Manufacturing Belt is limited to anecdotal evidence and piecemeal records. Building accurate emission inventories of climate-forcing agents like black carbon remains a key step toward establishing a more rigorous understanding of how atmospheric pollutants affect climate.

Recent efforts to estimate historical black carbon emissions have used predictive models that combine fuel consumption data with emission factors, a variable that rates the efficiency of burning technologies (9–11). Emission inventories generated by these models have been instrumental in evaluating the contribution of atmospheric black carbon to climate change (12–14), but their power is contingent on the ability of emission factors to accurately capture changes in real-world burning efficiency over time. The robustness of predictive models can be independently evaluated by direct sampling data, such as the Greenland ice-core record (15), which captures free-tropospheric emissions of black carbon from North America and stands as one of the few inventories based on a standardized, direct sampling metric of black carbon that extends back before the 1950s. The emission trends inferred from predictive models [such as the Speciated Pollutant Emissions Wizard (SPEW) database from Bond et al., 2007 (11)] generally mirror the Greenland ice-core record, indicating a rise in atmospheric black carbon in the late 19th and early 20th centuries associated with increased coal consumption, with emissions dropping to near preindustrial levels shortly after midcentury. While these contrasting methods achieve comparable results, there are inconsistencies between them: The ice-core record indicates a peak in black carbon concentrations in the first decade of the 20th century, while predictive models place this peak two decades later. Reconciling this disparity not only strengthens our understanding of environmental history and policy, but also holds important consequences for downstream climate analyses.

Here, we develop an alternative direct-sampling method for estimating historical trends in atmospheric black carbon by analyzing black carbon deposition on bird specimens collected within the US Manufacturing Belt over the past 135 y. In contrast to the Greenland ice-core record, our dataset recovers historical trends in atmospheric black carbon that are geographically localized. Our method therefore bypasses assumptions about the origin of atmospheric pollutants that are necessary to interpret the ice-core samples. As a direct sampling metric, our dataset also bypasses the need to make assumptions about burning efficiency and technology shifts on which predictive models rely, providing an independent means for evaluating such models. By providing a more accurate, localized picture of historical trends in atmospheric black carbon, the results of the study yield a diverse set of implications that advance our understanding of human impacts on the physical and natural world, from assessing the impacts of black carbon on the environment to evaluating historical policies designed to clean up the air in some of the world’s smokiest cities.

SI Materials and Methods Photographing Specimens. Specimens were imaged with a mirrorless interchangeable lens camera (Sony a7R II) paired with a native 55 mm lens (Sonnar T* FE 55 mm F1.8 ZA), positioned at a fixed height of 72 cm over a self-contained light box (MK Digital Direct Photo-e-Box BIO) outfitted with 28-W continuous full-spectrum fluorescent bulbs (6,500 K, 84CRI) run through 120-V AC 60-Hz electronic ballasts. Specimens were illuminated by using top, side, and back bulbs in the light box, omitting the bottom (stage) bulbs and supplemental LED bulbs to ensure an even distribution of diffuse light from a single illuminant type source. At the beginning of each imaging session, the lighting elements were turned on and allowed to warm up for 20 min before shooting. Specimens were oriented so that the target area on the breast was positioned at the center of the camera’s field of view. The light box was fully enclosed during each exposure, except for a rectangular aperture on the top, sized to fit the camera’s field of view. Overhead lighting was turned off in each of the shooting locations, and windows were covered to further reduce ambient light leakage. The images were captured in 14-bit uncompressed raw format and analyzed by using RawDigger software (Version 1.2.11), which provides access to raw data directly recorded by the digital camera’s CMOS sensor. Analyzing the raw sensor data directly enabled us to bypass the linearization step described by Stevens et al., 2007, and McKay, 2013 (32, 33), since the raw values have not been altered by nonlinear gamma encoding algorithms that are introduced when raw sensor data are converted into conventional image formats, such as JPEG or TIFF (34). Before shooting, we tested the linearity of the camera’s CMOS sensor following the procedure outlined in Stevens et al., 2007 (32) and we found that the sensor provided a linear response over the entire dynamic range (Fig. S9). 
Exposure settings (shutter speed, aperture, and ISO) were optimized through a series of trials using reflectance standards. We conducted trials using four types of reflectance standards, including the XRite ColorChecker Passport (8-step), QPcard 101 (3-step), Labsphere Spectralon Diffuse Reflectance Standards (10 reference targets), and Munsell Neutral Value Scale matte finish (31-step). We found that each standard provided comparable results, but we selected the Munsell Neutral Value Scale as our primary standard because it was relatively affordable, provided the largest number of reference points, and included published reflectance percentages printed directly on the cards for easy reference. To determine exposure settings, we analyzed trial images in RawDigger with a goal of maximizing the dynamic range (defined as the distance between minimum and maximum light intensities) without introducing signal clipping on any of the color channels (R-G-B-G2), which occurs when certain clusters of pixels fall outside of the dynamic range due to overexposure (saturation). It is essential to refer to the raw data when assessing whether signal clipping has occurred, since the channel-specific histograms on many digital cameras’ displays incorporate gamma-encoding algorithms that make it difficult to tell whether signal clipping has actually occurred. Exposure settings maximizing dynamic range will often indicate overexposed areas on the camera’s built-in displays, even when no signal clipping in the raw file has taken place. The ISO was set to 100 to ensure a limited amount of digital noise. Based on the trials, an aperture of f/16 was chosen to minimize optical vignetting (light falloff), which is introduced at lower focal ratios, while providing a depth of field that would ensure that the target area appeared in focus for all specimens, which varied in height due to differences in natural size and preparation of the specimens.
With these parameters in place, a shutter speed of 1/25 s was selected to maximize the dynamic range. While the use of a light box ensured relatively even and continuous illumination compared with open studio lighting arrangements, perfectly consistent illumination is difficult to achieve in practice. Some unevenness was discovered in blank reference images, which was determined to have resulted from lens variables (optical vignetting and lens flare) and may have also been influenced by the arrangement of the bulbs in the light box. To account for these factors, the target area for each specimen was confined to a 3- × 3-inch square, which limited variance in illumination to <1%. Under the constant lighting conditions that a light box provides, reflectance standards theoretically only need to be photographed once over the course of shooting to generate calibration regressions. In practice, however, some minor variations in overall illumination were discovered between the three locations, which may have been due to light leakage or slight variations in the voltage supply to the bulbs at each location. This variation, however, was easily accounted for by imaging the Munsell Neutral Value Scale reflectance standards at each location and calculating reflectance values for specimens with location-specific reflectance regressions. Since reflectance is expressed as a percentage, and these percentage values are relative to the standards, no additional adjustments were needed to normalize the color channels or calibrate the values across shooting locations. We photographed each card of the Munsell Neutral Value Scale separately at The Field Museum and Carnegie Museum of Natural History, positioning each card at the center of the field of view in the same area where we measured reflectance from bird feathers. To determine reflectance regressions from these locations, we used all 31 reflectance standards (ranging from 3.1 to 90% reflectance).
At the University of Michigan Museum of Zoology, we photographed the Munsell Neutral Value Scale fanned out in a single photograph. For this sample, we only included 12 reflectance steps (ranging from 9 to 84.2% reflectance) that fell within the target area (Fig. S9).

Determining the Smoothing Function for the GAM. Smoothing parameters for GAMs can be determined in mgcv by using functions such as GCV that minimize residual deviance (goodness of fit) and degrees of freedom (21). With our final dataset, the GAM estimated a smoothing function of k = 10 (this model is plotted in Fig. S10), which recovered a smoother curve than k = 20 (Fig. 2). Oversmoothing, however, can obscure signals in the data (35, 36), which appears to be happening with k = 10 based on our knowledge of likely inflection points (such as the 1929 US stock market crash) that are present in the consumption data and the Greenland ice-core record. For reference, in Fig. S10, we include a variety of smoothing functions from k = 10 to k = 100. Based on the comparison of possible k values, k = 10 appears to apply an overly powerful smoothing operation in the GAM, forcing the first decline of black carbon to begin in the early 1920s rather than at the end of the decade, where we would expect it to appear based on consumption trends; k = 13 through k = 35 recover trends that are effectively identical and that appear to capture important signals in the data that oversmoothing misses; k = 36 and greater generate toothy trends that overrepresent random variations within the sample set. Based on the variation in the shape of different GAMs, we selected a smoothing function of k = 20 to produce a relatively smooth trend line that still maintained a distinctive shape that allowed for comparison against consumption data.

How Sampling Months Were Determined. Beginning in late summer, each species used in the study initiates an annual molt to replace worn and soiled body feathers with fresh plumage.
This molting period can last through the fall months (20). Natural variation in the timing of the molt produces a mix of birds with fresh and soiled plumage among specimens sampled from these months. This annual molt signal was apparent in our sample, with samples from fall months producing shifts in mean reflectance caused by the introduction of freshly molted birds, along with uncharacteristically broad ranges in reflectance values compared with other months (Figs. S4 and S5). Freshly molted individuals do not provide evidence for atmospheric conditions in a given year, warranting their removal from the final dataset. Since freshly molted birds begin to accumulate particulate matter immediately after the molting cycle is complete, rather than selectively evaluating which individuals had recently molted, all of the specimens sampled during these months were removed. We determined the months to exclude for each species based on abrupt shifts in mean reflectance between months, which are indicative of annual molting patterns. For example, in Horned Larks, reflectance values shift abruptly between July and August and then increase again between November and December, indicating that the sample of birds in the months of August–November includes a substantial number of freshly molted individuals (Fig. S4). Following this method, the months of August–November were excluded for Horned Larks and Red-headed Woodpeckers, and the months of September–November were excluded for Field Sparrows, Grasshopper Sparrows, and Eastern Towhees (Fig. S4). We could be confident in these shifts given their seasonal timing, since overall black carbon emissions seasonally trend in the opposite direction for a given year in the Northern Hemisphere, as fuel consumption increases to meet heating needs when average temperatures drop (8, 15).
We limited this inquiry to the years 1880–1950 because after midcentury, birds are substantially cleaner in all months, compromising our ability to detect monthly breakpoints.
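The month-selection rule described above can be illustrated with a short sketch. This is not the procedure used in the study, which judged shifts from the monthly reflectance plots (Fig. S4): the function names and the numeric `threshold` for what counts as an "abrupt" shift are hypothetical, and the rule of taking the window between the first two abrupt shifts is an assumption made for illustration. Records are assumed to be already restricted to one species and to the years 1880–1950.

```python
from collections import defaultdict
from statistics import mean


def monthly_means(records):
    """Mean reflectance per calendar month.

    records: iterable of (month, reflectance_percent) pairs, month in 1..12.
    """
    by_month = defaultdict(list)
    for month, reflectance in records:
        by_month[month].append(reflectance)
    return {month: mean(values) for month, values in sorted(by_month.items())}


def molt_window(means, threshold):
    """Months lying between the first two abrupt month-to-month shifts.

    A shift is 'abrupt' when the absolute change in mean reflectance between
    consecutive sampled months exceeds `threshold` (a hypothetical cutoff).
    The window starts at the month where the first shift lands and ends just
    before the month where the second shift lands.
    """
    months = sorted(means)
    breaks = [
        cur
        for prev, cur in zip(months, months[1:])
        if abs(means[cur] - means[prev]) > threshold
    ]
    if len(breaks) < 2:
        return []  # no clear molt signal under this threshold
    first, second = breaks[0], breaks[1]
    return [m for m in months if first <= m < second]


def drop_months(records, excluded):
    """Remove every record sampled in an excluded month."""
    excluded = set(excluded)
    return [(m, r) for m, r in records if m not in excluded]
```

All specimens in the flagged months are dropped, rather than only those that look freshly molted, which mirrors the rationale given in the text: individuals begin accumulating particulate matter as soon as the molt is complete, so recently molted birds cannot be reliably picked out one by one.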

SI Evidence that Bird Specimens Accumulated Black Carbon from the Environment Before Collection To link reflectance data to black carbon levels for a single year, it had to be established that black carbon accumulation occurred before collection. Multiple lines of evidence indicated that the black carbon accumulated on bird specimens originated from the environment while the birds were alive and not from posthumous soiling or discoloration that occurred while being stored in a collection: (1) Since posthumous soiling would accrete continuously, if soiling had occurred over time in storage, it would not have been possible to observe seasonal differences, and any monthly trends that result from the annual molting cycle would have been erased or vastly diminished, particularly in older specimens. We found that consistent numbers of birds collected during the fall were much cleaner in a given year, indicating freshly molted individuals (Figs. S4 and S5). These patterns were observable even among birds that had been in the same collections as soiled birds, stored together since the time of collection.

(2) We conducted a visual survey of bird specimens collected outside the US Manufacturing Belt from other parts of the United States or from less industrialized countries during our 135-y sampling period. If posthumous soiling had occurred within our sample, we would have expected specimens collected in these nonindustrialized regions to have exhibited comparable levels of soiling to those in our sample, which we did not find. A visual example of this evidence can be seen in Fig. S11, which shows five Horned Larks collected in Illinois and five Horned Larks collected along the western coast of North America. All 10 birds were collected during nonmolting months between 1903 and 1922, a period in which consistently high levels of black carbon deposition were found on bird specimens collected within the US Manufacturing Belt.

(3) If specimens in our sample accumulated black carbon from sitting in museum collections, we would have expected specimens to have soiled ventral sides and cleaner dorsal sides because they generally rest in drawers with their breast and belly facing up. The dorsal side of the specimens would thus have been protected from soot precipitate. We found, however, that both sides of specimens exhibited soiling (Fig. S12).

(4) If substantial posthumous soiling had occurred within our samples, we would have predicted that the oldest specimens would have been the sootiest based on gradual accumulation over time. However, we found a slight increasing trend in black carbon deposition between 1880 and 1910.

Together, these lines of evidence suggest that any posthumous soiling from sitting in museum storage is negligible.

Fig. S11. Ten Horned Larks (E. alpestris pratensis) at The Field Museum, showing that specimens collected in nonindustrial regions do not exhibit comparable levels of soiling to birds collected within the US Manufacturing Belt. The five specimens in Left were collected in Illinois, inside the US Manufacturing Belt. The five specimens in Right were collected along the western coast of North America, outside of the US Manufacturing Belt. All 10 specimens were collected during nonmolting months (January–April) between 1903 and 1922.

Fig. S12. Images of the dorsal side of specimens from Fig. 1 and Fig. S3. These images, paired with Fig. 1 and Fig. S3, show that even soiling appears over the entire bird, indicating that the soiled birds in our sample acquired black carbon from the environment while alive. (A) Field Sparrows (S. pusilla pusilla) from 1906 (Upper) and 1996 (Lower). (B) Grasshopper Sparrows (A. savannarum pratensis) from 1907 (Upper) and 1996 (Lower). (C) Horned Larks (E. alpestris pratensis) from 1904 (Upper) and 1966 (Lower). (D) Eastern Towhees (P. erythrophthalmus erythrophthalmus) from 1906 (Upper) and 2012 (Lower). (E) Red-headed Woodpeckers (M. erythrocephalus) from 1901 (Upper) and 1982 (Lower).

Conclusions This research highlights the unexpected ways in which museum materials can yield insights about the physical and natural world and help address present-day environmental challenges. Natural history collections are powerful resources for tracking environmental pollutants through time (29, 30) because specimens provide durable snapshots of the past environments from which they were drawn. For this study, bird specimens provided an incidental record of atmospheric black carbon from a period before standardized methods and coordinated systems for assessing air quality were in place. We focused on the US Manufacturing Belt because of its historical importance as a polluting region, but our dataset can naturally be expanded to encompass other regions with long industrial histories, such as Western Europe. Natural history collections thus represent a unique resource for exploring past environments and environmental history. For the purpose of this study, we used bird specimens as a direct sampling metric to assess historical concentrations of black carbon, which we used in turn to evaluate past environmental policy. Our study, however, also highlights the impact of environmental pollution on wildlife. Our samples show that black carbon particulate covered the landscape along with its living inhabitants. Black carbon accumulation on birds has potential implications for evolutionary pathways because plumage is fundamental in avian displays and signaling. Birds use their plumage to attract mates, defend territories, and/or camouflage themselves within the landscape to escape detection from predators. What happens when bright, sexually selected plumage patches are coated in soot, obscuring plumage signals that have evolved over hundreds of thousands of years? What are the consequences of black carbon deposition for visual predators when animal prey coloration is homogenized with the surrounding environment? 
How black carbon deposition on feathers has impacted signaling within and among species remains an open question.

Materials and Methods Reflectance has long been used as an efficient and reliable metric in atmospheric sampling (31). For the purposes of this study, we were interested in deriving relative ambient concentrations from black carbon deposition on bird feathers. Since black carbon is defined by its light-absorbing properties, trends of black carbon deposition on specimens can be quantified as a function of the reduction in reflectance relative to unsoiled specimens. We adapted photography methods from Stevens et al., 2007, and McKay, 2013 (32, 33) to quantify the reflectance of each specimen. For complete details of the materials and methods used to photograph specimens, see SI Materials and Methods. To determine the reflectance value for each specimen from a digital image, we used regression equations calculated from reflectance standards to convert raw sensor data to known reflectance values. We calculated R, G, and B channel-specific regressions from Munsell Neutral Value Scale reflectance standards in RawDigger (Version 1.2.11) for each of our three shooting locations: The Field Museum, Chicago; University of Michigan Museum of Zoology, Ann Arbor; and Carnegie Museum of Natural History, Pittsburgh (Fig. S9). Since our camera’s CMOS sensor incorporates an additional G channel (G2), we averaged both G-channel values to produce a single G-channel regression. The equations for each regression line can be found in Fig. S9. We uploaded the digital photograph of each specimen into RawDigger and sampled the uniform white patch on the ventral side of each specimen. We recovered median raw R, G/G2, and B channel sensor values from a sampling area that ranged from 25 to 900 mm2. Since feathers are a textured, heterogeneous surface, median values were used to minimize any effect of outliers. 
For each specimen, the sample area was determined by selecting a large continuous area without conspicuous portions of exposed skin, staining due to residual fat deposits, or other preparation and conservation issues (see Dataset S1 for sample areas). We used the collection-specific regression equations to calculate reflectance values separately for R, G/G2, and B channels for each specimen. We then averaged the three channel-specific reflectance values to obtain a composite reflectance value for each specimen.

Fig. S9. Raw R, G, and B channel-specific regressions based on the Munsell Neutral Value Scale reflectance standards for each shooting location. The regression equations for each channel were used to calculate channel-specific reflectance from raw CMOS sensor data recovered in RawDigger for each specimen.
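The calibration steps described in this section can be sketched as follows. This is an illustrative outline only, not the authors' workflow, which was carried out in RawDigger. It assumes a straight-line regression of known reflectance on raw sensor value for each channel (the fitted equations themselves are given in Fig. S9), and all function names and the data layout are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def calibrate(standards):
    """Fit one regression per channel for a single shooting location.

    standards: list of (reflectance_percent, raw) pairs, where raw is a dict
    of median raw sensor values with keys "R", "G", "G2", and "B".
    The two green channels are averaged before fitting.
    """
    reflectance = [r for r, _ in standards]
    channels = {
        "R": [raw["R"] for _, raw in standards],
        "G": [(raw["G"] + raw["G2"]) / 2 for _, raw in standards],
        "B": [raw["B"] for _, raw in standards],
    }
    return {name: fit_line(values, reflectance) for name, values in channels.items()}


def composite_reflectance(raw, fits):
    """Convert a specimen's median raw values to a composite reflectance (%).

    Each channel is converted with its own regression, and the three
    channel-specific reflectance values are then averaged.
    """
    green = (raw["G"] + raw["G2"]) / 2
    per_channel = []
    for name, value in (("R", raw["R"]), ("G", green), ("B", raw["B"])):
        slope, intercept = fits[name]
        per_channel.append(slope * value + intercept)
    return mean(per_channel)
```

Because a separate set of regressions is fitted for each shooting location, a specimen would be converted with the `fits` obtained from the standards photographed at the museum where it was imaged.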

Acknowledgments We thank contributors past and present to natural history collections. Projects like this would not be possible without the commitment of individuals to collections, specifically: David Willard, Ben Marks, Sushma Reddy, Josh Engel, Shannon Hackett, and John Bates at The Field Museum; Janet Hinshaw and Benjamin Winger at the University of Michigan Museum of Zoology; and Steve Rogers at the Carnegie Museum of Natural History. David Willard and John Bates first introduced us to “sooty” birds in collections. We thank Julie Marie Lemon, Marissa Lee Benedict, and the Arts, Science + Culture Initiative at the University of Chicago, which funded this project. We thank Philipp Heck and Levke Kööp for help with SEM imaging; Iliah Borg for help with RawDigger software; the Statistics Consulting Program at the University of Chicago for advice with analyses; and Bret Hoffman for help with Fig. S3. We thank Hussein Al-Asadi, John Bates, Jonah Bloch-Johnson, João Capurucho, Jacob Cooper, Nick Crouch, Chad Eliason, Ryan Fuller, Shannon Hackett, Daniel Hooper, David Jablonski, Yanzhu Ji, Dallas Krentzel, Elizabeth Moyer, Amy Owen, Daniela Palmer, John Park, Trevor Price, Heather Skeen, Joel Snyder, Tim Sosa, K. Supriya, Alex White, and two anonymous reviewers for thoughtful comments that improved this manuscript.