We documented 16,162 incident T2D cases during 4,102,369 person-years of follow-up. In pooled multivariable-adjusted analysis, both PDI and hPDI were inversely associated with T2D (PDI: hazard ratio [HR] for extreme deciles 0.51, 95% CI 0.47–0.55, p trend < 0.001; hPDI: HR for extreme deciles 0.55, 95% CI 0.51–0.59, p trend < 0.001). The association of T2D with PDI was considerably attenuated when we additionally adjusted for body mass index (BMI) categories (HR 0.80, 95% CI 0.74–0.87, p trend < 0.001), while that with hPDI remained largely unchanged (HR 0.66, 95% CI 0.61–0.72, p trend < 0.001). uPDI was positively associated with T2D even after BMI adjustment (HR for extreme deciles 1.16, 95% CI 1.08–1.25, p trend < 0.001). Limitations of the study include self-reported diet assessment, with the possibility of measurement error, and the potential for residual or unmeasured confounding given the observational nature of the study design.

We included 69,949 women from the Nurses’ Health Study (1984–2012), 90,239 women from the Nurses’ Health Study 2 (1991–2011), and 40,539 men from the Health Professionals Follow-Up Study (1986–2010), free of chronic diseases at baseline. Dietary data were collected every 2–4 y using a semi-quantitative food frequency questionnaire. Using these data, we created an overall plant-based diet index (PDI), where plant foods received positive scores, while animal foods (animal fats, dairy, eggs, fish/seafood, poultry/red meat, miscellaneous animal-based foods) received reverse scores. We also created a healthful plant-based diet index (hPDI), where healthy plant foods (whole grains, fruits, vegetables, nuts, legumes, vegetable oils, tea/coffee) received positive scores, while less healthy plant foods (fruit juices, sweetened beverages, refined grains, potatoes, sweets/desserts) and animal foods received reverse scores. Lastly, we created an unhealthful plant-based diet index (uPDI) by assigning positive scores to less healthy plant foods and reverse scores to healthy plant foods and animal foods.

Plant-based diets have been recommended to reduce the risk of type 2 diabetes (T2D). However, not all plant foods are necessarily beneficial. We examined the association of an overall plant-based diet and hypothesized healthful and unhealthful versions of a plant-based diet with T2D incidence in three prospective cohort studies in the US.

We thus conceptualized a graded dietary pattern that positively weighs plant foods and negatively weighs animal foods, similar to the approach used by Martínez-González et al. [ 13 ]. We examined the association of this overall plant-based diet, and of healthful and unhealthful versions of a plant-based diet defined a priori, with T2D incidence in three large prospective cohort studies in the US. We hypothesized that these plant-based diets would be inversely associated with T2D risk.

Prior studies on plant-based diets and T2D [ 7 – 9 ] have defined plant-based diets as “vegetarian” diets, categorizing study populations dichotomously into participants who do or do not consume some or all animal foods. An important question from clinical and public health standpoints, however, is whether gradually moving towards a plant-rich diet by progressively decreasing animal food intake lowers T2D risk. If so, public health recommendations could suggest incremental dietary changes. Existing studies of vegetarian diets and T2D are also limited by a lack of differentiation among plant foods with divergent effects on T2D, because less nutrient-dense plant foods, such as refined grains, potatoes, and sugar-sweetened beverages, are associated with higher T2D risk [ 10 – 12 ].

Type 2 diabetes (T2D) is associated with increased morbidity, mortality, and healthcare costs in the US [ 1 ]. Several plant foods, such as whole grains, fruits, and vegetables, are associated with a lower risk of T2D [ 2 – 4 ], while certain animal foods, such as red and processed meats, are positively associated with T2D risk [ 5 ]. Additionally, the recently released 2015 Dietary Guidelines Advisory Committee report recommends shifting away from intake of certain animal foods and moving towards a plant-rich diet [ 6 ]. Thus, we evaluated the hypothesis that a plant-based diet is protective against T2D.

All dietary variables were cumulatively updated, i.e., were averaged, over the entire follow-up duration to better capture long-term diet. Updating was stopped when major outcomes (CVD or cancer) developed, as diagnosis with these conditions could change an individual’s diet. Values of non-dietary covariates were updated every 2 y to account for changes in these variables over time. We conducted tests for linear trend by assigning the median value to each decile and entering this as a continuous variable in the model, examined associations per 10-unit increase in the indices, and used restricted cubic splines to examine potential nonlinear associations. We tested for effect modification by age, physical activity, family history of diabetes, and BMI by including cross-product terms. The analysis was carried out separately for each cohort, and the cohort-specific HRs were combined using a fixed-effects model; the Cochran Q statistic [ 26 ], the I² statistic [ 27 ], and the between-study coefficient of variation [ 28 , 29 ] were used to assess heterogeneity among the cohorts. All statistical tests were two-sided (α = 0.05). All analyses were performed using SAS version 9.4 for UNIX (SAS Institute).
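As an illustration of the pooling step, the following is a minimal sketch of inverse-variance fixed-effects pooling of cohort-specific HRs on the log scale, together with the Cochran Q and I² heterogeneity statistics. It is not the SAS code used in the analysis. The function name `pool_fixed_effects` is hypothetical, the standard errors are back-calculated from 95% CIs under a normal approximation on the log scale, and any input values passed to it are placeholders rather than the cohort-specific estimates from this study.

```python
import math

Z_95 = 1.96  # normal quantile assumed for two-sided 95% CIs


def pool_fixed_effects(estimates):
    """Pool (hr, ci_low, ci_high) tuples with inverse-variance weights.

    Returns the pooled HR, its 95% CI, Cochran's Q, and I^2 (percent).
    """
    log_hrs = [math.log(hr) for hr, _, _ in estimates]
    # Standard error of log(HR), recovered from the width of the CI.
    ses = [(math.log(hi) - math.log(lo)) / (2 * Z_95) for _, lo, hi in estimates]
    weights = [1.0 / se ** 2 for se in ses]

    total_w = sum(weights)
    pooled_log = sum(w * b for w, b in zip(weights, log_hrs)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled_log) ** 2 for w, b in zip(weights, log_hrs))
    df = len(estimates) - 1
    # I^2: share of total variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "hr": math.exp(pooled_log),
        "ci_low": math.exp(pooled_log - Z_95 * pooled_se),
        "ci_high": math.exp(pooled_log + Z_95 * pooled_se),
        "q": q,
        "i2": i2,
    }
```

Calling `pool_fixed_effects` with one `(hr, ci_low, ci_high)` tuple per cohort returns a pooled estimate whose CI is narrower than any single cohort's. A large Q relative to its degrees of freedom, or a high I², would signal heterogeneity among the cohorts.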

We calculated person-time for each participant from questionnaire return date until T2D diagnosis, death, censoring, or end of follow-up (30 June 2012 in NHS, 30 June 2011 in NHS2, and 1 January 2010 in HPFS). For the primary analysis, we categorized the indices into deciles, so as to not make assumptions about linearity and to limit the influence of outlying observations. We used Cox proportional-hazards regression to evaluate the associations between deciles of each index and T2D incidence. Age (years) was used as the timescale, with stratification by calendar time (2-y intervals). We adjusted for smoking status, alcohol intake, physical activity, family history of diabetes, multivitamin use, margarine intake, energy intake, baseline hypertension and hypercholesterolemia, body mass index (BMI) categories, menopausal status and postmenopausal hormone use (women), and oral contraceptive use (NHS2). Continuous covariates were included in the model as categories for the reasons cited above for categorizing the indices.
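The person-time definition above can be sketched as follows. This is a simplified illustration, not the analysis code: the function name `person_years` and its arguments are hypothetical, censoring for reasons other than diagnosis or death is omitted, and a year is approximated as 365.25 days.

```python
from datetime import date

# End-of-follow-up dates as stated in the text.
END_OF_FOLLOW_UP = {
    "NHS": date(2012, 6, 30),
    "NHS2": date(2011, 6, 30),
    "HPFS": date(2010, 1, 1),
}


def person_years(cohort, returned, diagnosed=None, died=None):
    """Person-years from questionnaire return to the earliest stopping event."""
    candidates = [diagnosed, died, END_OF_FOLLOW_UP[cohort]]
    stop = min(d for d in candidates if d is not None)
    # Guard against a stop date that precedes questionnaire return.
    days = max((stop - returned).days, 0)
    return days / 365.25
```

Summing this quantity over participants gives the denominator for incidence rates and determines each participant's contribution to the risk sets in the Cox model.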

Participants who self-reported physician-diagnosed diabetes were sent a supplementary questionnaire with established validity to confirm diagnosis [ 21 , 22 ]. Only confirmed cases that met ≥1 of the following criteria were included (as per the National Diabetes Data Group) [ 23 ]: (a) ≥1 classic symptoms plus fasting blood glucose ≥ 140 mg/dl (≥7.8 mmol/l) or random blood glucose ≥ 200 mg/dl (≥11.1 mmol/l); (b) no symptoms, but raised blood glucose levels (i.e., fasting blood glucose ≥ 140 mg/dl or random blood glucose ≥ 200 mg/dl or 2-h blood glucose after oral glucose tolerance testing ≥ 200 mg/dl) on two different occasions; (c) treatment with hypoglycemic drugs. The threshold for fasting plasma glucose was changed to ≥126 mg/dl (≥7.0 mmol/l) starting in 1998 [ 24 ]. HbA1c ≥ 6.5% was further added to the diagnostic criteria starting in 2010 [ 25 ].
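The case-confirmation rules above can be written as a small decision function. This is only a sketch under stated assumptions: the function and argument names are hypothetical, each "occasion" is represented as a dictionary of glucose readings in mg/dl, the 1998 fasting threshold is assumed to apply to both criteria (a) and (b), and HbA1c ≥ 6.5% is assumed to be sufficient on its own from 2010 onward, which the text does not state explicitly.

```python
def fasting_cutoff(year):
    """Fasting glucose cutoff in mg/dl: 140 before 1998, 126 from 1998 on."""
    return 126 if year >= 1998 else 140


def is_confirmed_case(year, n_symptoms, occasions,
                      on_hypoglycemic_drugs=False, hba1c=None):
    """Apply criteria (a)-(c) plus the HbA1c rule to one participant.

    occasions: list of dicts with optional keys 'fasting', 'random',
    'ogtt_2h' (mg/dl), one dict per separate measurement occasion.
    """
    cutoff = fasting_cutoff(year)

    def raised(reading, include_ogtt):
        limits = [("fasting", cutoff), ("random", 200)]
        if include_ogtt:
            limits.append(("ogtt_2h", 200))
        return any(reading.get(key) is not None and reading[key] >= limit
                   for key, limit in limits)

    # (c) treatment with hypoglycemic drugs
    if on_hypoglycemic_drugs:
        return True
    # (a) at least one classic symptom plus raised fasting or random glucose
    if n_symptoms >= 1 and any(raised(r, False) for r in occasions):
        return True
    # (b) no symptoms, raised glucose on two different occasions
    if n_symptoms == 0 and sum(raised(r, True) for r in occasions) >= 2:
        return True
    # HbA1c rule from 2010 (assumed sufficient on its own)
    return year >= 2010 and hba1c is not None and hba1c >= 6.5
```

For example, a symptomatic participant with a fasting glucose of 130 mg/dl would not qualify under the pre-1998 threshold but would qualify from 1998 onward.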

S1 Table details examples of foods constituting the food groups. The 18 food groups were divided into quintiles of consumption, and each quintile was assigned a score between 1 and 5. For PDI, participants received a score of 5 for each plant food group for which they were above the highest quintile of consumption, a score of 4 for each plant food group for which they were above the second highest quintile but below the highest quintile, and so on, with a score of 1 for consumption below the lowest quintile (positive scores). On the other hand, participants received a score of 1 for each animal food group for which they were above the highest quintile of consumption, a score of 2 for each animal food group for which they were between the highest and second highest quintiles, and so on, with a score of 5 for consumption below the lowest quintile (reverse scores). For hPDI, positive scores were given to healthy plant food groups, and reverse scores to less healthy plant food groups and animal food groups. Finally, for uPDI, positive scores were given to less healthy plant food groups, and reverse scores to healthy plant food groups and animal food groups. The 18 food group scores for an individual were summed to obtain the indices, with a theoretical range of 18 (lowest possible score) to 90 (highest possible score). The observed ranges at baseline were 24–85 (PDI), 28–86 (hPDI), and 27–90 (uPDI) across the cohorts. The indices were analyzed as deciles, with energy intake adjusted at the analysis stage.
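The scoring scheme can be illustrated with a short sketch. It uses three hypothetical food groups in place of the 18 used in the study, and it computes quintile cut points with Python's `statistics.quantiles`, whose handling of ties and interpolation may differ from the procedure used in the original analysis. The function names and the group-to-category assignments in the example are illustrative only.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical, reduced category map; the study used 18 food groups.
HEALTHY_PLANT = {"whole_grains"}
LESS_HEALTHY_PLANT = {"refined_grains"}
ANIMAL = {"red_meat"}


def quintile_scores(servings):
    """Positive scores: 1 (lowest quintile of intake) to 5 (highest)."""
    cuts = quantiles(servings, n=5)  # four cut points between quintiles
    return [bisect_right(cuts, s) + 1 for s in servings]


def diet_indices(intakes):
    """intakes: {food_group: [servings/day per participant]}.

    Returns one dict per participant with PDI, hPDI, and uPDI.
    """
    n = len(next(iter(intakes.values())))
    totals = [{"PDI": 0, "hPDI": 0, "uPDI": 0} for _ in range(n)]
    for group, servings in intakes.items():
        for person, pos in zip(totals, quintile_scores(servings)):
            rev = 6 - pos  # reverse score: 5 for lowest intake, 1 for highest
            if group in HEALTHY_PLANT:
                person["PDI"] += pos
                person["hPDI"] += pos
                person["uPDI"] += rev
            elif group in LESS_HEALTHY_PLANT:
                person["PDI"] += pos
                person["hPDI"] += rev
                person["uPDI"] += pos
            elif group in ANIMAL:
                # Animal foods are reverse-scored in all three indices.
                person["PDI"] += rev
                person["hPDI"] += rev
                person["uPDI"] += rev
            else:
                raise ValueError(f"unclassified food group: {group}")
    return totals
```

With 18 groups each contributing 1 to 5 points, the same logic yields the theoretical range of 18 to 90 described above. With the three groups in this sketch, the range is 3 to 15.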

We created an overall plant-based diet index (PDI), a healthful plant-based diet index (hPDI), and an unhealthful plant-based diet index (uPDI). The procedure we used to create these indices is similar to the one used by Martínez-González et al. [ 13 ]; their “provegetarian food pattern” is similar in composition to our PDI. Frequencies of consumption of each food were converted into servings consumed per day. The servings of all foods belonging to each of 18 food groups were then summed. The 18 food groups were created on the basis of nutrient and culinary similarities, within larger categories of animal foods and healthy and less healthy plant foods. We distinguished between healthy and less healthy plant foods using existing knowledge of associations of the foods with T2D, other outcomes (CVD, certain cancers), and intermediate conditions (obesity, hypertension, lipids, inflammation). Plant foods not clearly associated in one direction with several health outcomes, specifically alcoholic beverages, were not included in the indices. We also excluded margarine from the indices, as its fatty acid composition has changed over time from high trans fat to high unsaturated fat. We controlled for alcoholic beverages and margarine consumption in the analysis.

The Nurses’ Health Study (NHS) started in 1976 with 121,701 female nurses (aged 30–55 y) [ 14 ], the Nurses’ Health Study 2 (NHS2) started in 1989 with 116,430 female nurses (aged 25–42 y) [ 15 ], and the Health Professionals Follow-Up Study (HPFS) started in 1986 with 51,529 male health professionals (aged 40–75 y) [ 16 ]; all three studies recruited participants from across the US. In all three studies, follow-up questionnaires collect information on lifestyle and medical history biennially, with a response rate of ~90% per cycle. In the current analysis, the 1984, 1991, and 1986 cycles were the baselines for NHS, NHS2, and HPFS, respectively, because these are the cycles in which data on most covariates of interest were first comprehensively measured. Participants with diabetes, cancer (except nonmelanoma skin cancer), cardiovascular disease (CVD), reported energy intake levels outside predefined limits (<600 or >3,500 kcal/d for women and <800 or >4,200 kcal/d for men), or incomplete dietary data at baseline were excluded. The final analysis included 69,949 women in NHS, 90,239 women in NHS2, and 40,539 men in HPFS at baseline.
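The baseline exclusion rules can be expressed as a simple filter. This is a sketch with hypothetical names: `sex` is coded as "F" or "M", and `has_cancer` is assumed to already exclude nonmelanoma skin cancer, as in the text.

```python
# Plausible energy intake limits (kcal/d) from the text, by sex.
ENERGY_LIMITS = {"F": (600, 3500), "M": (800, 4200)}


def eligible_at_baseline(sex, kcal_per_day, has_diabetes, has_cancer,
                         has_cvd, complete_diet_data):
    """Return True if a participant passes the baseline exclusions."""
    if has_diabetes or has_cancer or has_cvd:
        return False
    if not complete_diet_data:
        return False
    low, high = ENERGY_LIMITS[sex]
    # Intakes below `low` or above `high` are treated as implausible.
    return low <= kcal_per_day <= high
```

Applying such a filter to each cohort at its baseline cycle would yield the analytic samples reported above, subject to the exact handling of missing data in the original analysis.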

Previous analyses in these cohorts have found other dietary patterns such as the Mediterranean diet, the Alternate Healthy Eating Index (aHEI), and Dietary Approaches to Stop Hypertension (DASH) to be inversely associated with T2D [ 34 – 36 ]. Thus, in order to examine the independent associations of PDI and hPDI with T2D incidence, we individually controlled for these patterns ( S6 and S7 Tables). Pooled HRs for both PDI and hPDI remained largely unchanged when the Mediterranean diet was controlled for, and were only slightly attenuated with aHEI or DASH in the same model.

To examine the effect of consuming a healthful plant-based diet that is also high in intake of some animal foods known to be associated with reduced risk of several health outcomes (e.g., fish and yogurt [ 30 – 33 ]), we created two variations of hPDI. When we modified hPDI to score fish/seafood intake positively, the pooled HRs were slightly attenuated (HR for extreme deciles 0.73, 95% CI 0.68–0.79; HR per 10-unit increase 0.87, 95% CI 0.85–0.89, p trend < 0.001). Results for a modified hPDI with yogurt scored positively were not substantially different (HR for extreme deciles 0.65, 95% CI 0.60–0.71; HR per 10-unit increase 0.83, 95% CI 0.81–0.85, p trend < 0.001).

To examine the individual contributions of healthy plant foods, less healthy plant foods, and animal foods to T2D risk, we included variables for all three food types simultaneously in the fully adjusted model; this allowed for mutual adjustment of the food types for one another, and hence an evaluation of their independent associations with T2D incidence. Healthy plant foods were inversely associated with T2D, while animal foods were positively associated, and less healthy plant foods were not associated, with risk ( S3 Fig ).

Results were pooled across the three cohorts using a fixed-effects model. Adjusted for age (years), smoking status (never, past, current [1–14, 15–24, or ≥25 cigarettes/day]), physical activity (<3, 3–8.9, 9–17.9, 18–26.9, or ≥27 MET-h/wk), alcohol intake (0, 0.1–4.9, 5–9.9, 10–14.9, or ≥15 g/d), multivitamin use (yes or no), family history of diabetes (yes or no), margarine intake (quintiles), energy intake (quintiles), baseline hypertension (yes or no), baseline hypercholesterolemia (yes or no), and BMI (<21, 21–22.9, 23–24.9, 25–26.9, 27–29.9, 30–32.9, 33–34.9, 35–39.9, or ≥40 kg/m²). Also adjusted for menopausal status and postmenopausal hormone use in NHS and NHS2 (premenopausal or, if postmenopausal, current, past, or never postmenopausal hormone use) and for oral contraceptive use in NHS2 (never, past, or current use). p trend < 0.001 for both indices across all strata. p-Value obtained by assigning the median value to each decile and entering this as a continuous variable in the model.

Our findings remained robust in several sensitivity analyses. In restricted cubic spline analysis, we did not find evidence for a nonlinear association of either PDI or hPDI with T2D incidence. Thus, both indices had significant linear associations with T2D incidence, with a stronger dose-response relationship for hPDI ( S1 Fig ). Similar inverse associations were observed in strata defined by physical activity and family history of diabetes ( Fig 2 ). The inverse association of PDI with T2D incidence was stronger in non-obese than in obese participants (p interaction < 0.001), and the inverse associations of both PDI and hPDI were stronger in older participants (p interaction = 0.02) ( S3 Table ). The associations of both PDI and hPDI with T2D were virtually unchanged upon further adjustment for ethnicity, marital status, recent physical exam, diet beverage intake, and indicators of socioeconomic status ( S4 Table ). Results were also similar when the analysis was restricted to participants with fasting plasma glucose screening in the previous 2 y (PDI: HR for extreme deciles 0.78, 95% CI 0.71–0.85, p trend < 0.001; hPDI: HR for extreme deciles 0.65, 95% CI 0.59–0.71, p trend < 0.001). Continuously updating PDI and hPDI throughout follow-up did not change results ( S5 Table ). When we used baseline intakes of PDI and hPDI, associations were modestly attenuated but remained significant (PDI: HR for extreme deciles 0.86, 95% CI 0.80–0.93, p trend < 0.001; hPDI: HR for extreme deciles 0.70, 95% CI 0.64–0.75, p trend < 0.001). Associations were also modestly attenuated when we used the most recent scores prior to diagnosis of T2D (PDI: HR for extreme deciles 0.84, 95% CI 0.78–0.91, p trend < 0.001; hPDI: HR for extreme deciles 0.74, 95% CI 0.69–0.80, p trend < 0.001). Stratified analysis showed no significant effect modification by ethnicity for the diet indices (p interaction was 0.92 for PDI, 0.14 for hPDI, and 0.94 for uPDI; S2 Fig ).

Results were pooled across the three cohorts using a fixed-effects model. Adjusted for age (years), smoking status (never, past, current [1–14, 15–24, or ≥25 cigarettes/day]), physical activity (<3, 3–8.9, 9–17.9, 18–26.9, or ≥27 MET-h/wk), alcohol intake (0, 0.1–4.9, 5–9.9, 10–14.9, or ≥15 g/d), multivitamin use (yes or no), family history of diabetes (yes or no), margarine intake (quintiles), energy intake (quintiles), baseline hypertension (yes or no), baseline hypercholesterolemia (yes or no), and BMI (<21, 21–22.9, 23–24.9, 25–26.9, 27–29.9, 30–32.9, 33–34.9, 35–39.9, or ≥40 kg/m²). Also adjusted for menopausal status and postmenopausal hormone use in NHS and NHS2 (premenopausal or, if postmenopausal, current, past, or never postmenopausal hormone use) and for oral contraceptive use in NHS2 (never, past, or current use). p trend < 0.001 for all indices. p-Value obtained by assigning the median value to each decile and entering this as a continuous variable in the model.

The distributions of age-adjusted baseline characteristics according to PDI and hPDI are shown in S2 Table and Table 1, respectively. Participants with higher scores on PDI or hPDI were older, more active, leaner, and less likely to smoke than participants with lower scores. They also consumed a lower percentage of calories from saturated and monounsaturated fats, a higher percentage of calories from polyunsaturated fats and carbohydrates, and higher levels of fiber and folate.

Discussion

We found significant linear inverse associations of plant-based diets, especially a healthier version (captured by hPDI), with T2D incidence in three prospective cohorts in the US. In contrast, a less healthy version of a plant-based diet (captured by uPDI) was associated with increased T2D risk. These associations were independent of BMI and other diabetes risk factors.

There are several mechanisms through which a healthful plant-based diet could lower the risk of T2D [37,38]. Such a diet would be rich in dietary fiber, antioxidants, unsaturated fatty acids, and micronutrients such as magnesium, and low in saturated fat. Randomized clinical trials have shown beneficial effects of diets high in viscous and soluble fiber on improving postprandial glucose as well as long-term glucose metabolism [39]. In addition, several prospective studies have shown dietary fiber to be associated with reduced levels of inflammatory markers [40,41]. Animal studies and epidemiologic studies among humans have shown antioxidants such as polyphenols to have beneficial effects on glucose metabolism, probably through reduced oxidative stress and improved endothelial function [42]. High unsaturated fatty acid and low saturated fat contents in diets have also been shown to have anti-inflammatory properties [43], while specific micronutrients such as magnesium are known to play a key role in glucose metabolism [44]. Thus, a healthful plant-based diet could enhance glycemic control, improve insulin sensitivity, and decrease chronic inflammation, thereby reducing T2D risk. In addition, the high fiber and low calorie contents of many plant foods could further reduce T2D risk by promoting weight loss/maintenance [37,38]. Another less well understood mechanism could be through the gut microbiome. A healthful plant-based diet could promote a gut microbial environment that facilitates the metabolism of fiber and polyphenols and discourages the metabolism of bile acids, choline and L-carnitine, and amino acids, further reducing T2D risk [45]. An unhealthful plant-based diet, on the other hand, would have high glycemic index and load, reduced fiber, lower micronutrient content, and higher calorie content, which could adversely affect the above-mentioned pathways, resulting in increased T2D risk [2,10,12]. Such a diet would also have a high level of added sugar, which has been shown to be strongly associated with increased weight gain and T2D risk [12,46].

Given that BMI represents a pathway through which plant-based diets may affect T2D risk, controlling for it would have resulted in an underestimation of these diets’ true effects. Results from the final model controlling for BMI characterize plant-based diet associations that are independent of their potential beneficial effects on body weight. The association of PDI with decreased T2D incidence was also significantly stronger for non-obese individuals than for obese individuals, which could represent a true biological interaction of PDI with BMI (e.g., due to differential mediation by BMI in obese and non-obese individuals) or could be a methodological artifact (e.g., as a result of differential confounding or measurement error in the two strata).

Only a few prospective studies have examined the association of plant-based diets with T2D. The Adventist Health Studies found significantly higher T2D mortality (odds ratio 1.9, 95% CI 1.2–3.1) and incidence (odds ratio 1.38, 95% CI 1.06–1.80) among non-vegetarians than vegetarians [7,8]. They also found consumption of vegan, lacto-ovo vegetarian, and semi-vegetarian diets to be associated with lower T2D risk relative to non-vegetarian diets [9]. All of these studies were carried out among Seventh-day Adventists, a religious group that encourages a lacto-ovo vegetarian diet. Because the prevalence of vegetarianism is low in the US (~3% [47]), it is difficult to study the relationship between vegetarianism and health outcomes in the general US population. Defining a plant-based diet in terms of a continuous gradation of adherence to a diet high in plant and low in animal foods has allowed us to study the association of plant-based diets with T2D in more than 200,000 participants, utilizing detailed dietary data collected at multiple time points over more than two decades.

Our study highlights the varying risk profiles associated with different versions of plant-based diets, emphasizing the importance of considering the quality of plant foods consumed. Participants in the highest decile of uPDI consumed half the amount of healthy plant foods and almost double the amount of less healthy plant foods consumed by participants in the highest decile of hPDI. The healthier version of a plant-based diet proposed in this study may inform future public health recommendations regarding plant-based diets. We also found that even a modest lowering in animal food consumption was associated with substantially lower T2D incidence. For instance, in the highest decile of hPDI, participants consumed ~4 servings/day of animal foods, relative to 5–6 servings/day in the lowest decile. This has important public health implications, as plant-based diets need not completely exclude animal foods. Numerous studies have previously documented null or inverse associations of several animal foods (e.g., low-fat dairy, lean poultry, and fish and seafood) with T2D and other diseases, and consistent positive associations of certain animal foods (e.g., red and processed meats) with such diseases. Additionally, in our analysis the association of hPDI with T2D changed only slightly upon positively scoring fish and yogurt intake. Thus, the gradual reduction in animal food intake suggested here can be achieved largely through reducing intake of low-quality animal foods.

Our findings provide support for the 2015 Dietary Guidelines Advisory Committee conclusion that diets rich in healthy plant foods and lower in certain animal foods such as red and processed meats are beneficial for the prevention of chronic diseases [6]. Another rationale for shifting towards a plant-based diet is to improve food sustainability because food systems that rely heavily on animal foods require more natural resources than those more reliant on plant foods [48]. Thus, dietary guidelines that recommend a healthful plant-based diet would be compatible with the health of humans as well as our ecosystem. The hPDI was only moderately correlated with other commonly considered dietary patterns such as the Mediterranean diet, aHEI, and DASH, reflecting that this is a novel diet index that captures unique aspects of a healthful plant-based diet. This, coupled with the strong inverse association of the hPDI with T2D independent of these other dietary patterns, highlights the importance of focusing on a healthful plant-based diet for a potentially environmentally sustainable approach to T2D prevention.

Our study has several limitations. Because diet was self-reported, measurement errors are inevitable. However, the use of cumulative measures of diet over time not only reduces these errors but also represents long-term dietary habits [18]. We also made assumptions about the healthfulness of different plant foods, which, although based on prior evidence, have an element of subjectivity, and hence our findings need to be replicated in future studies. While we controlled for several potential confounders, given the observational nature of these studies, residual or unmeasured confounding cannot be ruled out. However, several randomized controlled trials have found vegetarian diets to positively impact intermediate endpoints, such as body weight, blood pressure, lipid profile, and insulin sensitivity, in those who were free of T2D [49–51] and in patients with the disease [52–56]. The socioeconomic homogeneity of the study population also enhances internal validity due to implicit control of confounders. Given that we found similar associations between the plant-based diet indices and T2D among different ethnic groups, it is likely that these findings are generalizable to diverse racial/ethnic groups. Nevertheless, these studies were carried out among health professionals in the US, and hence it would be important to replicate these findings in other populations representing diverse countries and occupational groups before translating these findings to other populations.