The dataFIELD database described and applied here is an important step in capturing the breadth of food LCA studies in a form that can be linked to existing individual dietary data. It represents one of the more comprehensive compilations available of GHGE and CED data on food production. Further, organizing the database for straightforward linkage with NHANES data creates opportunities for a wide array of future research inquiries, including direct and indirect policy intervention simulations. The sections below provide further discussion on the database development and interpretation of the diet-level results.

It has become common practice in diet impact studies to assign proxy foods as approximations in the case of missing data, but to our knowledge, this paper is the first attempt to quantify the contributions from those proxy assignments. Table 2 indicates that proxy foods contribute 3% to the average diet GHGE, and 8% to CED. Proxy assignments are made based on foods with similar production characteristics. However, even if we assume that all of our proxy estimates are in error by a factor of 2 (i.e. all proxy impact factors are doubled), the mean diet-level impacts would still only increase by 2.6% for GHGE and 8.2% for CED.
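The sensitivity check above can be sketched as simple arithmetic. If proxy foods supply a fraction p of the mean diet-level impact, multiplying every proxy impact factor by k raises the mean by p(k − 1). The shares used below (2.6% and 8.2%) are an assumption: they are taken to be the unrounded versions of the 3% and 8% reported in Table 2, inferred from the increases quoted in the text. The function name is hypothetical.

```python
def relative_increase(proxy_share: float, error_factor: float) -> float:
    """Relative change in the mean diet-level impact when every proxy
    impact factor is multiplied by `error_factor`.

    `proxy_share` is the fraction of the mean impact contributed by
    proxy foods (assumed values, see the lead-in above).
    """
    return proxy_share * (error_factor - 1.0)


# Assumed unrounded proxy shares (Table 2 reports them as 3% and 8%).
ghge_increase = relative_increase(0.026, 2.0)
ced_increase = relative_increase(0.082, 2.0)

print(f"GHGE: +{ghge_increase:.1%}, CED: +{ced_increase:.1%}")
```

Because the increase scales linearly with the proxy share, the same function can be used to test other error factors, for example `relative_increase(0.026, 3.0)` for a threefold error.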

Figure 2. Cumulative emission intensity of US 1 day diets using average impact factors. Diets are ranked in order of impact from low to high. Areas under the curve are proportional to the total impact, with percentage contributions by each quintile shown above the curve. The green box represents the cumulative emissions of those originally in the 5th quintile if their diets were to shift to diets with average emission intensities.

The literature review that underlies development of dataFIELD found that the number of LCA studies that can be linked to dietary choices has increased significantly in recent years, but data gaps still exist for many food types. This is consistent with other recent reviews (e.g. [22]). A scan of the foods requiring proxy assignments in table S4 (supporting information) offers a sense of current data gaps and a target for LCA practitioners interested in filling such gaps. In addition, many foods important to the evaluation of healthful diets with low impact (nuts, legumes, meat substitutes) are poorly represented in the literature and deserve additional attention. Geographical representation is also biased toward Europe.

As has been customary in the diet-LCA literature, our main estimates for diet-level impacts are based on average LCA values applied to each food consumed. However, unlike other studies, we have addressed variability due to production practice, geography, or LCA method by calculating upper and lower bounds of impacts for each food and carrying these estimates through to diet-level impacts. Because the NHANES dietary recall data do not specify production methods or geographical origin, we cannot be more precise in assigning impacts from LCA studies to foods eaten by NHANES respondents. Geographical specificity becomes increasingly important, however, with other impact categories such as water use, eutrophication, or land use. Although currently available data in these categories are limited, we plan to expand our database to water and land use impacts, specific to the US food market, in a future iteration.

This study demonstrates the disproportionate impacts that can be caused by some types of self-selected diets. Figure 2 displays the cumulative emissions of these diets when ranked in order of GHGE per person per day. GHGE associated with the fifth quintile of diets are nearly eight times that of the first quintile and three times that of the third (middle) quintile. If the top quintile of diets (representing 44.6 million Americans on a given day⁶) shifted such that their associated GHGE were aligned with the mean impact, this would represent a one-day reduction in GHGE of 0.27 million metric tons CO₂ eq. (mmt), equivalent to eliminating 661 million average passenger vehicle miles⁷ on a given day.

Table 5. Comparison of studies estimating impacts of the US diet or self-selected diets in other countries.

Columns: Study; Country; Diet data source (a); Impact factor data source; GHGE in kg CO₂e capita⁻¹ day⁻¹ (consumed, consumed+losses); CED in MJ capita⁻¹ day⁻¹ (consumed+losses).

This study; US; NHANES national survey (SS); Exhaustive lit. review; 3.6; 4.7; 25.2
Heller and Keoleian 2015 [11]; US; USDA (FB); Limited lit. review; 3.6; 5.0
Tom et al 2016 [12]; US; USDA (FB); [11], lit. review; 5.1; 34.5
Hallstrom et al 2017 [15]; US; USDA (FB); Lit. review; 3.8
Vieux et al 2012 [17]; France; INCA2 national survey (SS); Lit. review; 4.2
Meier and Christen 2013 [19]; Germany; German National Nutrition Surveys (SS); Hybrid EIO LCA; 5.6; 37.0
Rugani et al 2013 [43]; UK; National Diet and Nutrition Survey (SS) + FB to estimate waste; Lit. and other (cradle to point of sale); 8.8 (b)
Van Dooren et al 2014 [53]; Netherlands; Dutch National Food Consumption Survey (SS); Agri-footprint data [23]; 4.1
Hendrie et al 2016 [54]; Australia; Australian Health Survey (SS); EIO LCA; 18.7 (b) (male), 13.7 (b) (female)
Bälter et al 2017 [44]; Sweden; LifeGene study (SS); Lit. identified sources; 4.7

(a) (SS) = self-selected diet; (FB) = food balance.
(b) Represents broader boundary conditions than other studies; includes impacts through to the point of purchase.

This shift, which could be achieved by changing foods, reducing calories, or some combination of the two, would be represented graphically in figure 2 by removing the section of the curve above the average emission diet line for the fifth quintile. Current economy-wide US net emissions (based on 2015 data [34]) are 1023 mmt above the target levels in year 2025, as submitted to the U.N. Framework Convention on Climate Change (UNFCCC) [35]. The hypothetical diet shift described above, if implemented every day of the year and met by equivalent shifts in domestic production, would account for 9.6% of the remaining reductions necessary to meet the target (see supporting information for the emission reduction calculations). Even if high emission diets (arbitrarily defined here as >25 kg CO₂ eq. person⁻¹ day⁻¹; the truncated tail extending above the representation in figure 2) are excluded from the estimate based on a presumption that they are either atypical or that such individuals are unlikely to shift diets, moving the remainder of the high quintile (GHGE >6.9 but <25 kg CO₂ eq. person⁻¹ day⁻¹) to the mean GHGE still accomplishes 9% of the reductions necessary for the US to meet the UNFCCC target. See supporting information for a parallel discussion on the cumulative impacts of food losses.

Our estimates of reductions are likely to be somewhat exaggerated because a distribution of 1 day diets is known to be more dispersed than a distribution of usual diets [36]. This is one of the limitations of using NHANES. Because NHANES is based on the 24 hour diet recall tool, it also tends to underestimate total energy intake, although this is true of all self-reported diet instruments [37]. In fact, 24 hour recalls provide more details about foods consumed and tend to be less biased than food frequency questionnaires [38]. Moreover, NHANES provides the only ongoing nationally representative source of information about individuals' diets. Our analysis highlights the importance of looking at individual behaviors rather than just population means, since there is clearly a wide range of impacts being caused by self-selected diets.
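The 9.6% figure quoted above can be reproduced with a short back-of-envelope calculation. This is a minimal sketch using only the numbers stated in the text; the 365-day year and the variable names are assumptions, and the full calculation is in the supporting information.

```python
# Values stated in the text.
DAILY_REDUCTION_MMT = 0.27       # one-day GHGE reduction, million metric tons CO2 eq.
GAP_TO_2025_TARGET_MMT = 1023.0  # US net emissions above the 2025 target level

# Assumption: the shift is sustained on every day of a 365-day year.
DAYS_PER_YEAR = 365

annual_reduction_mmt = DAILY_REDUCTION_MMT * DAYS_PER_YEAR
share_of_gap = annual_reduction_mmt / GAP_TO_2025_TARGET_MMT

print(f"Annual reduction: {annual_reduction_mmt:.1f} mmt")
print(f"Share of remaining reductions: {share_of_gap:.1%}")
```

The result depends linearly on the daily reduction, so the smaller daily reduction implied by excluding diets above 25 kg CO₂ eq. per person per day would scale the share down proportionally.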

Table 5 offers a comparison of the results from this study with other reported estimates of the impacts of the US diet, as well as self-selected diets in other countries. When studies with broader boundary conditions are excluded, there is strong agreement across results, with a coefficient of variation of 3% for GHGE with US diets only, and 7% across all diets. Self-reported diet surveys carry a well-known under-reporting bias [39, 40], whereas food balance based estimates (production + imports − exports − non-food uses ± changes in stock) are often considered to be overestimates [41, 42]. A more refined food type characterization and a more exhaustive literature review were utilized in this study in comparison to that of Heller and Keoleian [11]. While beverages have not always been delineated as such in previous studies of diet impacts [11, 12, 15, 19, 43, 44], we find them to be important contributors. This finding is further strengthened by the fact that packaging and use phases are not included within the boundary conditions of our estimates. Packaging often represents a hotspot in the life cycle impacts of beverages [45–49], and use phase activities (heating water, brewing coffee) can be important for hot beverages [50, 51].
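For readers unfamiliar with the statistic, the coefficient of variation is the standard deviation divided by the mean. The sketch below illustrates the calculation for the US-only case. Which table entries enter the comparison is an assumption here: the three US GHGE values of 3.6, 3.6 and 3.8 kg CO₂e per capita per day from Table 5 are used, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Assumption: these are the US diet GHGE estimates being compared
# (this study, [11], and [15] in Table 5).
us_ghge = [3.6, 3.6, 3.8]

cv = coefficient_of_variation(us_ghge)
print(f"Coefficient of variation: {cv:.1%}")
```

Under these assumptions the result is consistent with the roughly 3% reported for US diets; the 7% figure for all diets depends on which non-US studies are included and is not reproduced here.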

The boundary conditions for the current study are cradle to farm gate for most food commodities, and include processing for those FCID foods that are minimally processed ingredients (flours, oils, juices, etc.). As such, our reported values should be considered underestimates of actual impacts associated with food consumption in the US, as they include the production impacts of processed food ingredients but not the impacts of processing itself. Using the US Environmentally Extended Input Output model developed by US EPA [52] and an approach detailed in supporting information, we estimate that food processing not captured in our bottom-up estimates amounts to 15% of the total cradle to processor gate (including agricultural production sectors) GHGE. Packaging materials represent an additional 6%. Inclusion of these missing food processing and packaging contributions would raise our estimates by ~27%, although it is important to note that these input-output based approximations are made for the food and agricultural sectors in aggregate, and will not apply evenly across different food types or for specific diets (i.e. they apply only at the mean).
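The ~27% uplift follows from the two shares above if both are expressed relative to the same total. A minimal sketch, assuming that the 15% and 6% are fractions of one common total and that our bottom-up estimates correspond to the remaining 79% (the detailed approach is in the supporting information; variable names are illustrative):

```python
# Shares stated in the text.
PROCESSING_SHARE = 0.15  # food processing not captured in bottom-up estimates
PACKAGING_SHARE = 0.06   # packaging materials

# Assumption: both shares refer to the same total, so the bottom-up
# estimate covers whatever is left.
captured_share = 1.0 - PROCESSING_SHARE - PACKAGING_SHARE

# Relative increase needed to scale the captured portion up to the total.
uplift = (PROCESSING_SHARE + PACKAGING_SHARE) / captured_share

print(f"Captured share: {captured_share:.0%}")
print(f"Uplift on bottom-up estimate: {uplift:.1%}")
```

The uplift exceeds the 21% sum of the two shares because it is expressed relative to the smaller, captured portion rather than to the total.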