Analytical diagram

The datasets utilized in our study are visualized in a diagram (Supplementary Fig. 4) with the data preparation procedures. Detailed steps in the diagram are illustrated in the following subsections.

Study population

Our study population was drawn from the CFPS, an ongoing national survey on demographic and socioeconomic factors in China. The CFPS drew a representative sample of Chinese population using a multi-stage probability strategy with stratification, for multiple study purposes23. The CFPS surveyed > 30,000 adults and ~9,000 children from 25 provincial regions of China from 2010. Data on personal characteristics (e.g., age), socioeconomic status (e.g., education and income), behavior patterns (e.g., physical activity), lifestyle (e.g., diet type), mental health status, and so on were collected by trained interviewers using standard questionnaires. The study has been approved by the institutional review board at Peking University (Approval IRB00001052-14010). Although the CFPS collected the personal characteristics longitudinally, the surveyed variables slightly varied between years23. For instance, the surveys utilized the same mental health questionnaire in 2010 and 2014, but different ones in other years, which makes this study not qualified as a prospective cohort study.

In 2010, baseline mental health status was measured by a brief questionnaire based on the Center for Epidemiologic Studies Depression Scale test, consisting of six questions related to the following domains: feeling depressed and incapability to cheer up no matter what you are doing (Q 1 ), feeling nervous (Q 2 ), feeling upset (Q 3 ), feeling hopeless about the future (Q 4 ), feeling that everything is difficult (Q 5 ), and thinking life is meaningless (Q 6 ). Respondents were asked to rate the frequency with which they experienced these feelings on a scale ranging from 1 to 5 (1: almost every day, 2: 2–3 times a week, 3: 2–3 times a month, 4: once a month, and 5: never). Therefore, higher scores reflect better mental health. According to the CFPS user manual, the total score for the six questions constitutes an index of mental health status. In 2014, the mental health of subjects was examined using the same questionnaire. In total, 25,618 of the 33,600 adults surveyed in 2010 and 37,147 adults surveyed in 2014, participated in both evaluations. After excluding surveys with (1) incomplete answers to the mental health questionnaire or (2) a failure of geocoding (which will be described in following sections), the data obtained from 21,543 adults from 25 provinces (as shown in Fig. 1) during the first and second surveys were included in the final analysis. The characteristics of the involved samples were also compared with those of the total surveyed subjects in 2010 (Supplementary Fig. 5). The comparison showed that the data exclusion did not considerably changed the structures of the CFPS population, a representative sample of Chinese adults.

Air quality

To examine the effects of the environmental factors that affect mental health, this study obtained data on air quality, residential greenness, and ambient temperature. To evaluate air quality, from a well-established product29, we obtained monthly maps of PM 2.5 in China from 2000 to 2016, which had a spatial resolution of ~10 km × 10 km (in a regular grid of 0.1° × 0.1°). The gridded PM 2.5 maps were estimated based on historical satellite measurements of aerosol optical depth and simulations of the Community Multiscale Air Quality Model based on historical emission inventories, using a machine learning model. The estimates have complete spatiotemporal coverage and were shown to be in good agreement with the independent in-situ PM 2.5 values, based on the cross-validation (CV) results (R2 = 0.71; root mean square error [RMSE] = 17.8 μg m−3) on a monthly scale. In the CV, all observations of PM 2.5 within a calendar year were used as the test data to validate the estimates from a model trained by the rest of the data and then the procedure was iterated (both retrospectively and prospectively) for all the PM 2.5 observations during 2013–2016.

Greenness

To assess residential greenness, we obtained a monthly product (MOD13A3, version 6) of the NDVI for China for 2009–2016, which had a spatial scale of 1 km × 1 km. As environmental exposures were evaluated at county level (as described below) due to the limited geographic information of the CFPS subjects, we did not obtain NDVI at a finer scale for computing efficiency. Satellite NDVI is a general index (varying from −1 to 1), which indicates the richness of green vegetation over the surface of the Earth; it has been widely used to measure long-term exposure to residential greenness. The NDVI data used in this study were also obtained from the moderate resolution imaging spectroradiometer (MODIS) products, which are freely distributed by the Application for Extracting and Exploring Analysis Ready Samples (EEARS): https://lpdaacsvc.cr.usgs.gov/appeears/ (accessed at May 2018).

Temperature

To evaluate exposure to temperature, we obtained daily maps with a spatial resolution of ~10 km × 10 km (in a regular grid of 0.1° × 0.1°) from a data assimilation product for China from 2000 to 2016.

The surface temperature of the Earth can be obtained from multiple sources including routine climate monitors, satellite remote-sensing measurements, and climate model simulations such as the weather research forecast (WRF) model. Monitoring data are usually considered the gold standard, but is limited in spatial coverage, particularly in China. Although numerical outputs of climate models have a complete spatiotemporal coverage, they are less accurate. The temperature products of Earth-observing satellites, which scan the whole planetary surface within a 1–2 day time period, offer moderate coverage of spatiotemporal dimensions and have been utilized in health-related studies34. Recently, data assimilation products of monitoring and satellite-retrieved measurements have been derived to reduce errors in exposure assessment of ambient temperatures35. Inspired by such studies, we used the universal kriging36 approach to combine monitoring temperatures (T m ), WRF-simulated temperatures (T w ), and satellite temperatures (T s ) to produce an optimal predictor of daily temperatures (T optimal ) over China. Before universal kriging, we first prepared a product of satellite-based temperatures with complete spatiotemporal coverage (T sc = [T s , T s *]), where missing values (T s *) of satellite measurements for each day were interpolated using the following equation: T s * = T w * + IDW(T s − T w ). In the equation, T w * denotes the WRF output at the coordinates where satellite-retrieved temperatures do not exist and IDW(•) denotes an inverse-distance weighted average36 of the difference between the two measurements in the neighboring coordinates. Universal kriging is a two-stage model. In stage 1, we regressed T m with T sc and auxiliary variables, including satellite nightlight, altitude, and the monthly average of NDVI. In stage 2, we interpolated the residuals in stage 1 using kriging appraoch. The final estimates (T optimal ) were obtained by adding the fitted values of regression in stage 1 and the interpolated residuals in stage 2. Due to computational complexities, universal kriging analyses were performed for each separate day. Therefore, empirical variograms were calculated and Matérn covariance function were fitted, by days.

In-situ observations of daily mean temperature (T m ) during 2000–2016 were obtained from 225 monitors across China, from the global historical climatology network distributed by the National Centers for Environmental Information of the United States National Oceanic and Atmospheric Administration. Satellite-retrieved land surface temperatures (T s ) were collected from level 3 (MOD11C1, version 6) MODIS products, which have a spatial resolution of 0.05° and generated valid data after 24 February 2000. Altitude data with a spatial resolution of 1 km were obtained from GTOPO30, which is a global digital elevation model developed from a US geological survey (https://lta.cr.usgs.gov/GTOPO30). Nightlight data in 2013 at a spatial resolution of 1 km were produced from the visible and infrared sensors of the Defense Meteorological Satellite Program (https://ngdc.noaa.gov/eog/dmsp.html). We also simulated daily maps of temperature (T w ) during the study period using a well-developed WRF model (ver. 3.5.1) in China37.

The accuracy of the estimated temperature was evaluated using the tenfold CV approach, in which the in-situ observations were randomly divided and subjected to ten iterations of the validation procedure. According to the tenfold CV, the estimates were in excellent agreement with the daily in-situ observations (R2 = 0.96; RMSE = 2.46 °C), as shown in Supplementary Fig. 6.

Exposure assessments

To protect their privacy, the detailed addresses of respondents were redacted from the open-access CFPS data. We obtained the six-digit administrative code (each code identifies a county-level geographic unit) for each subjects and then geocoded all CFPS samples into a map of county-level administrative boundaries in 2010, through matching their administrative codes. Finally, we identified 162 counties (Supplementary Fig. 1) that were consistent with those identified in the official reports of CFPS. In this county-level exposure assessments, we assumed that all residents of a county lived and commuted within the corresponding administrative boundaries. Therefore, the within-county variations in the long-term environmental exposures can be much smaller than the between-county variations. However, this presumption may not be valid for those residents who lived far from the county center. Considering that, we further validated the geographic information by comparing their reported distances from the provincial capital cities according to the community-level questionnaire with the calculated distances based on the geocoded county center. When the relative difference was >10%, all records from the community were excluded.

Due to the limitations of the geographic information, exposure levels were assessed at the county level. To evaluate long-term exposure to air pollution, we first averaged the gridded maps of PM 2.5 into monthly county-level averages. All subjects from the same county were assigned to the same PM 2.5 time series. Then, we calculated the annual average for each subject based on the PM 2.5 value for the surveyed month and the values of the 11 preceding months (i.e., the 12-months moving average of PM 2.5 ). When evaluating the long-term level of residential greenness, we processed the NDVI data in the same way as we processed the PM 2.5 data, except that we calculated the population-density-weighted average instead of the direct average in each county. The 1 km × 1 km map of the population density in China was extracted from the 2010 Gridded Population of the World, which was also obtained from EEARS. The use of population-density weights reduces the misclassifications caused by the non-residential greenness of the NDVI, such as croplands and forests. Considering the complexity underlying the association between temperature and mental health, we prepared county-level time-series data on temperature in the same way as we prepared the PM 2.5 data, but calculated the μ T and σ T during the year before the survey time to measure the long-term level of, and variability in, the temperature, respectively.

Study design

We designed a difference-in-difference study to link changes in mental health to changes in long-term exposure to environmental factors. The difference-in-difference design has been widely used to explore the health effects of risk factors, such as ambient pollutants25, and is considered to yield results that are more relevant to causal relationships than those of cross-sectional studies. As the outcomes and exposure levels of a single subject are associated with each other in difference-in-difference studies, some confounders (e.g., genetic factors) that do not change with time are inherently controlled by the design.

In this study, we first derived the changes in the MHSs from 2010 to 2014, the long-term exposure levels to PM 2.5 , the NDVI, the μ T and σ T , and the socioeconomic data (i.e., alcohol consumption, education, migration, obesity, physical activity, and smoking). Under the assumption that the various subgroups might show different mental health trends over time, we also used the baseline values of some socioeconomic variables (e.g., age, alcohol consumption, education, diet type, gender, income level, marital status, nationality, physical activity status, obesity status, area of residence (urban or rural), and smoking status) obtained in 2010 as additional covariates. To control the spatial autocorrelations in the outcomes, we first parameterized the coordinates of residential counties as a two-dimensional thin-plate spline function and further involved the term into the regression models. The optimal degrees of freedom for the spline term were automatically determined by the penalized method38. Such approach has been utilized in previous studies to examine the health effects of environmental factors and difference-in-difference analyses25.

Statistical analyses

In purpose of good interpretability, the major analysis used a logistic model to examine the relationship between changes in total MHS and changes in each environmental variable after adjustment for multiple covariates, using the following equation:

$${\mathrm{Logit}}\left( {y_j} \right)\sim x_j\beta + {\mathbf{z}}_j{\mathbf{b}} + f\left( {{\mathbf{s}}_j} \right);\; \ldots \ldots$$ (1)

y j = 1, for \(Q_{i,\;j,\;2014} \ge Q_{i,\;j\;,\;2010}\),0, for \(Q_{i,\;j,\;2014} < Q_{i,\;j,\;2010}\);or y j = 1, for \(\mathop {\sum}

olimits_i {Q_{i,\;j,\;2014}} \ge \mathop {\sum}

olimits_i {Q_{i,\;j,\;2010}}\),0, for \(\mathop {\sum}

olimits_i Q _{i,\;j,\;2014} < \mathop {\sum}

olimits_i Q _{i,\;j,\;2010}\).

In the regression model, i or j denotes the index for mental health questionnaire or CFPS subject, respectively; Q i,j denotes the score of the ith question for the jth subject; y j denotes a binary variable to indicate the mental health change from 2010 to 2014; x j denotes the corresponding change in an environmental factor (PM 2.5 , NDVI, μ T , or σ T ); z j denotes the individual-level covariates as described above; f(s j ) denotes the spline function of spatial coordinates (s j ); β and b denote the regression coefficients. Using the PM 2.5 models as an example, the x j was calculated as ΔPM 2.5, j = PM 2.5, j, 2014 − PM 2.5, j, 2010 , where PM 2.5, j, t denotes the long-term exposure level for the jth subject at the t year. The regression coefficient (β) for an environmental variable, x, can be interpreted as a logarithmic scale of odds ratios (ORs) for per-unit increments in x. An OR < 1 indicates that increment of x is associated to a lowered score (i.e., worse mental health).

Besides the models adjusted by different combinations of covariates (Supplementary Table 3), the major results also present (1) the nonlinear associations between mental health changes and environmental changes (Fig. 1), and (2) the double-exposure models (Fig. 3), based on modified versions of Eq. 1. To conduct the nonlinear analyses, we replaced the linear terms of the environmental variables with the thin-spline terms in the regression models. In addition, because the environmental variables were pairwise-correlated (Supplementary Table 2), they could act as confounders for each other. We used double-exposure models to explore these confounding effects. A double-exposure model simultaneously linked the health outcome with two environmental variables39. A comparison between a single-exposure model (e.g., a model of PM 2.5 ) and the corresponding double-exposure model (e.g., a model of PM 2.5 + NDVI) can reveal whether the estimated effect of the target variable (i.e., PM 2.5 ) is sensitive to extra-adjustment of another variable (i.e., NDVI). A robust association suggests that the effect on mental health is more likely attributable to the target variable rather than its correlated variables.

In the sensitivity analyses, we first explored variations in the associations between total MHS and environmental factors using an indicator variable for three geographic regions and indicators for different demographic characteristics, including age, alcohol consumption, education, gender, income, obesity status, physical activity status, smoking status, and urban/rural residence (Supplementary Fig. 2). The variations were examined using interaction terms between the indicators and the environmental variables. Next, we examined alternative time windows for exposure to PM 2.5 or σ T (Supplementary Fig. 3), which had been estimated to be robustly linked with mental health in previous analyses. Finally, we modeled the MHS as alternative types of variable (Supplementary Table 4). In the major results, the changes in MHS were categorized into binary outcomes, to increase the interpretability of statistical analyses. However, the binned outcome might be insufficient to characterize the variations in mental health. Using modified version of Eq. 1, we also modeled the change in MHS as (1) a continuous outcome (ΔQ ∈ [−24, 24]) using a linear regression or (2) an ordinal outcome (ΔQ ∈ |−24, −23, …, 23, 24|) using an ordinal logistic regression (also known as the proportional odds model). Furthermore, we also directly associated MHS to environmental variables using the linear mixed-effect model, an alternative approach for the difference-in-difference design (Supplementary Table 4). The details of these alternative models are documented in Supplementary Table 4.

All statistical analyses were performed using R software (ver. 3.4.1; R Development Core Team, Vienna, Austria). The associations were presented by point-estimates with 95% CIs and their significances were evaluated by two-sided Wald’s tests.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.