Overview

We adjusted reported BMI data from the BRFSS to align the data with objectively measured BMI distributions from the National Health and Nutrition Examination Survey (NHANES), a nationally representative survey in which measured data on height and weight are collected with the use of standardized examination procedures.17 We estimated trends in the prevalence of BMI categories according to subgroup in each state and made projections through 2030. The first author designed the study, gathered and analyzed the data, and vouches for the accuracy and completeness of the data. All the authors critically revised the manuscript and made the decision to submit the manuscript for publication.

Data

We obtained BRFSS data from 1993 through 1994 and 1999 through 2016, periods during which annual data were collected for all 50 states and Washington, D.C. (except for Wyoming in 1993, Rhode Island in 1994, and Hawaii in 2004). We obtained nationally representative NHANES data from 1991 through 1994 (phase 2 of NHANES III) and from 1999 through 2016 (continuous NHANES). Data from pre-1999 BRFSS surveys were restricted to 1993 and 1994 to coincide with phase 2 of NHANES III. (Before 1993, not all states were included in the BRFSS.) We cleaned each data set to ensure that the variables of interest were not missing and ensured that reported height and weight in the BRFSS were biologically plausible. Our final BRFSS data set included 6,264,226 adults (18 years of age or older), and our NHANES data set included 57,131 adults. (Exclusion criteria and respondent characteristics are provided in Section 1 in the Supplementary Appendix, available with the full text of this article at NEJM.org.)

Adjustment for Self-Reporting Bias

We adjusted reported BMI data from the BRFSS so that their distribution was similar to that of measured BMI in NHANES. Because both the BRFSS and NHANES are designed to be nationally representative surveys, data from NHANES can be used to adjust participant-reported body measures in the BRFSS. By adjusting the entire distribution of reported BMI to be consistent with measured BMI in NHANES, we adjusted for self-reporting bias while preserving the relative position of each person’s BMI.8 Specifically, we estimated the difference between participant-reported BMI and measured BMI according to quantile and then fit cubic splines to smoothly estimate self-reporting bias across the entire BMI distribution. Each person’s BMI was then adjusted by the bias estimated at his or her BMI quantile. We adjusted BMI distributions separately according to sex and time period (1993–1994, 1999–2004, 2005–2010, and 2011–2016) to control for potential time trends in self-reporting bias and in the composition of demographic subgroups. (Additional details are provided in Section 2 in the Supplementary Appendix.)
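
The quantile-based adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: it ignores survey weights, it substitutes piecewise-linear interpolation between percentiles for the fitted cubic splines, and the function names (quantile_bias, adjust_bmi) are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_bias(reported, measured, n=100):
    """Return reported-BMI cut points and the bias (measured - reported) at each.

    Assumption: unweighted samples; real survey data would need survey weights.
    """
    q_reported = quantiles(reported, n=n, method="inclusive")
    q_measured = quantiles(measured, n=n, method="inclusive")
    bias = [m - r for r, m in zip(q_reported, q_measured)]
    return q_reported, bias


def adjust_bmi(bmi, q_reported, bias):
    """Shift one reported BMI by the bias interpolated at its quantile position.

    Assumption: linear interpolation stands in for the cubic splines in the text.
    Values beyond the outermost cut points receive the nearest cut point's bias.
    """
    if bmi <= q_reported[0]:
        return bmi + bias[0]
    if bmi >= q_reported[-1]:
        return bmi + bias[-1]
    # Locate the pair of cut points that bracket this value.
    i = bisect_right(q_reported, bmi) - 1
    w = (bmi - q_reported[i]) / (q_reported[i + 1] - q_reported[i])
    return bmi + (1 - w) * bias[i] + w * bias[i + 1]
```

In this sketch the mapping from reported to adjusted BMI is nondecreasing, so the ordering of respondents is preserved. In keeping with the text, the two functions would be applied separately within each sex and time period.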

State-Specific Trends and Projections

BMI categories were defined according to the Centers for Disease Control and Prevention (CDC) guidelines: underweight or normal weight (BMI [the weight in kilograms divided by the square of the height in meters], <25), overweight (25 to <30), moderate obesity (30 to <35), and severe obesity (≥35).18 We used multinomial (renormalized logistic) regressions to predict the prevalence of each BMI category as a function of time. This method ensures that the prevalence of all categories sums to 100% in each year and allows estimation of nonlinear trends in the prevalence of BMI categories. Our reduced covariate model (i.e., with year as the independent variable) implicitly accounts for trends in the composition of demographic subgroups (e.g., age distribution and composition of race or ethnic group categories) within each state, since the relative contributions of these various factors (and their potential changing effect over time) are already reflected in the prevalence estimates. Such an approach also implicitly controls for trends in other variables that may affect BMI, such as smoking or illness. Explicit control for these variables is important when estimating the effect of BMI on related health outcomes. However, because our outcome of interest was the prevalence of BMI categories over time, such control was not necessary: the effect of these variables was already reflected in the observed prevalence estimates used to fit the models. (Additional details and a discussion of previous approaches are provided in Sections 3.1 and 3.2 in the Supplementary Appendix.)
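
The category definitions and the renormalization step can be illustrated with a short sketch. It reflects one plausible reading of "renormalized logistic" (a separate logistic curve in time for each category, divided by the sum across categories) and is not a description of the fitted models; the coefficients, the base year used to center time, and all function names are hypothetical.

```python
import math


def bmi(weight_kg, height_m):
    """BMI: weight in kilograms divided by the square of the height in meters."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Assign one of the four BMI categories defined in the text."""
    if value < 25:
        return "underweight or normal weight"
    if value < 30:
        return "overweight"
    if value < 35:
        return "moderate obesity"
    return "severe obesity"


def predict_prevalence(year, coefs, base_year=1990):
    """Renormalized logistic prediction of category prevalence for one year.

    coefs maps category -> (intercept, slope) on the logit scale. Each category
    gets its own logistic curve; dividing by the sum makes the shares total 1.
    Assumption: base_year is only a centering choice made for this sketch.
    """
    t = year - base_year
    raw = {k: 1 / (1 + math.exp(-(a + b * t))) for k, (a, b) in coefs.items()}
    total = sum(raw.values())
    return {k: p / total for k, p in raw.items()}


# Hypothetical coefficients, chosen only to show the mechanics.
example_coefs = {
    "underweight or normal weight": (0.0, -0.02),
    "overweight": (-0.6, 0.00),
    "moderate obesity": (-2.0, 0.02),
    "severe obesity": (-3.0, 0.04),
}
shares_2030 = predict_prevalence(2030, example_coefs)
```

Whatever coefficients are supplied, the predicted shares sum to 1 in every year, which is the property the text attributes to the method.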

Regressions were performed nationally and for each state independently, while taking the complex survey structure of the BRFSS into account. We estimated historical trends and projections of the prevalence of each BMI category from 1990 through 2030, as well as the prevalence of overall obesity (BMI, ≥30). We also made projections for demographic subgroups to examine trends and explore the effect of geography (i.e., state of residence) on obesity trends within subgroups. We estimated trends according to sex (male or female), race or ethnic group (non-Hispanic white, non-Hispanic black, Hispanic, or non-Hispanic other), annual household income (<$20,000, $20,000 to <$50,000, or ≥$50,000), education (less than high-school graduate, high-school graduate to some college, or college graduate), and age group (18 to 39, 40 to 64, or ≥65 years) (Section 3.3 in the Supplementary Appendix). Because of the small sample sizes and changing BRFSS categories of race or ethnic group over time, we combined five groups (“American Indian or Alaskan Native,” “Asian,” “Native Hawaiian or Pacific Islander,” “other,” and “multiracial”) into one “non-Hispanic other” category.

In accordance with the CDC guidelines that consider BRFSS estimates unreliable if they are based on a sample of fewer than 50 people,19 we suppressed state-level estimates from subgroups with fewer than 1000 respondents in total; given our data set of 20 rounds of BRFSS surveys, this threshold corresponds to an average of fewer than 50 respondents per year in a state. Thus, estimates for the following subgroups were suppressed: non-Hispanic black adults in 12 states (Alaska, Hawaii, Idaho, Maine, Montana, New Hampshire, North Dakota, Oregon, South Dakota, Utah, Vermont, and Wyoming) and Hispanic adults in 2 states (North Dakota and West Virginia).
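
The arithmetic behind the suppression threshold can be made explicit with a small sketch; the constant and function names are hypothetical, and the survey years are those listed in the Data section.

```python
# The 20 BRFSS rounds used: 1993, 1994, and 1999 through 2016.
SURVEY_YEARS = [1993, 1994, *range(1999, 2017)]
MIN_RESPONDENTS_PER_YEAR = 50  # CDC reliability guideline cited in the text
THRESHOLD = MIN_RESPONDENTS_PER_YEAR * len(SURVEY_YEARS)  # 50 * 20 = 1000


def is_suppressed(total_respondents):
    """True if a state-level subgroup falls below the pooled threshold."""
    return total_respondents < THRESHOLD
```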

To account for uncertainty, we bootstrapped both data sets (NHANES and BRFSS) 1000 times, taking into account the complex structure of each survey (Section 3.4 in the Supplementary Appendix), and repeated all analyses (i.e., adjustment for self-reporting bias and state-specific projections) on each bootstrapped sample. We report the mean and 95% confidence interval (calculated as the 2.5th and 97.5th percentiles of the bootstrapped values) for all estimates.
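
The percentile interval can be sketched for a single statistic as follows. This example makes a strong simplifying assumption: it resamples individuals with replacement and ignores the strata, clusters, and weights that a survey-aware bootstrap would respect. The function name and the choice of the sample mean as the statistic are hypothetical.

```python
import random
from statistics import fmean, quantiles


def bootstrap_mean_ci(values, n_boot=1000, seed=0):
    """Mean and 95% percentile interval of the sample mean across resamples.

    Assumption: simple resampling with replacement, with no survey design.
    """
    rng = random.Random(seed)
    estimates = [
        fmean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    ]
    # n=40 yields cut points in steps of 2.5%, so the first and last
    # cut points are the 2.5th and 97.5th percentiles.
    cuts = quantiles(estimates, n=40, method="inclusive")
    return fmean(estimates), cuts[0], cuts[-1]
```

Passing a list of 0/1 indicators (for example, 1 for BMI ≥30) returns a prevalence estimate with its interval. In the study, the full pipeline was rerun on each bootstrapped data set rather than a single statistic.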

Assessment of Predictive Accuracy and Sensitivity Analyses

To evaluate the accuracy of our approach, we restricted our data sets (NHANES and BRFSS) to include only data from 1999 through 2010. We then repeated our analyses with this subset of data and predicted the prevalence of each BMI category in 2016 (i.e., 6 years after the last observed year in our truncated data). We compared our predictions with the observed prevalence (corrected for self-reporting bias) in 2016. This exercise allowed us to evaluate the accuracy of our approach in predicting future values and to assess how the 2011 change in the BRFSS sample design (the inclusion of cell-phone interviews) might affect our estimation of trends. For our predictions, we calculated the coverage probability (i.e., the percentage of observed estimates that fell within our 95% confidence intervals), the percentage of our mean predictions that fell within a certain distance (e.g., 10% relative error) of the observed estimate, and the mean absolute error.
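
The three accuracy measures can be computed as in the following sketch. The function name, argument layout, and returned keys are hypothetical; values are returned as proportions rather than percentages.

```python
def accuracy_metrics(observed, predicted, lower, upper, tolerance=0.10):
    """Coverage, share of predictions within a relative error, and MAE.

    observed, predicted, lower, and upper are equal-length sequences, with
    one entry per estimate (e.g., one state and BMI category).
    """
    n = len(observed)
    # Coverage: observed value lies inside the predicted 95% interval.
    covered = sum(lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper))
    abs_err = [abs(p - o) for o, p in zip(observed, predicted)]
    # Relative error is measured against the observed estimate.
    within = sum(e <= tolerance * abs(o) for e, o in zip(abs_err, observed))
    return {
        "coverage": covered / n,
        "within_tolerance": within / n,
        "mean_absolute_error": sum(abs_err) / n,
    }
```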

In a sensitivity analysis, we also made projections based on self-reported body measures (i.e., no adjustment for self-reporting bias). Statistical analyses were performed with the use of R software, version 3.2.5 (R Foundation for Statistical Computing), with BRFSS bootstrapping performed in Java for computational efficiency.