Overview

Building from our previous study of CGF in Africa9, we used Bayesian model-based geostatistics14—which leveraged geo-referenced survey data and environmental and socioeconomic covariates, and the assumption that points with similar covariate patterns and that are closer to one another in space and time would be expected to have similar patterns of CGF—to produce high-spatial-resolution estimates of the prevalence of stunting, wasting, and underweight among children under five across LMICs. Stunting, wasting, and underweight were defined as z-scores that were two or more standard deviations below the WHO healthy population reference median for length/height-for-age, weight-for-length/height, and weight-for-age, respectively, for age- and sex-specific curves6. Using an ensemble modelling framework that feeds into a Bayesian generalized linear model with a correlated space–time error, and 1,000 draws from the fitted posterior distribution, we generated estimates of annual prevalence for each indicator of CGF on a 5 × 5-km grid over 105 LMICs for each year from 2000 to 2017 and mapped results at administrative levels to provide relevant subnational information for policy planning and public health action. For this analysis, we compiled an extensive geo-positioned dataset, using data from 460 household surveys and reports representing 4.6 million children. To ensure comparability with national estimates and to facilitate benchmarking, these local-level estimates were calibrated to those produced by the Global Burden of Disease (GBD) Study 20171, and were subsequently aggregated to the first administrative level (for example, states or provinces) and second administrative level (for example, districts or departments) in each LMIC. We also predict CGF prevalence for 2025 based on 2000–2017 trajectories and estimate the AROC required to meet the WHO GNTs by 2025. In addition, we estimate the 2017 absolute numbers of children under five affected by each CGF indicator in LMICs based on our prevalence estimates and the size of the populations of children under five15,16. Furthermore, we provide figures that demonstrate subnational disparities between each country’s second administrative-level units with the highest and lowest estimated prevalence for 2000 and 2017 (Extended Data Figs. 2, 4, 6). We re-estimate CGF prevalence for the 51 African countries included in our previous analysis9 using 28 additional surveys, and extend time trends to model each year from 2000 to 2017. Owing to these improvements in data availability and methodology, the estimates provided here supersede our previous modelling efforts.

Countries were selected for inclusion in this study using the socio-demographic index (SDI)—a summary measure of development that combines education, fertility, and poverty, published in the GBD study1. The analyses reported here include countries in the low, low-middle, and middle SDI quintiles, with several exceptions (Supplementary Table 3). China, Iran, Libya, and Malaysia were included despite high-middle SDI status in order to create better geographical continuity. Albania and Moldova were excluded owing to geographical discontinuity with other included countries and lack of available survey data. We did not estimate for the island nations of American Samoa, Federated States of Micronesia, Fiji, Kiribati, Marshall Islands, North Korea, Samoa, Solomon Islands, or Tonga, where no available survey data could be sourced. The flowchart of our modelling process is provided in Extended Data Fig. 9.

Surveys and child anthropometry data

We extracted individual-level height, weight, and age data for children under five from household survey series including the Demographic and Health Surveys (DHS), Multiple Indicator Cluster Surveys (MICS), Living Standards Measurement Study (LSMS), and Core Welfare Indicators Questionnaire (CWIQ), among other country-specific child health and nutrition surveys7,17,18,19 (Supplementary Tables 4, 5). Included in our models were 460 geo-referenced household surveys and reports from 105 countries representing approximately 4.6 million children under five. Each individual child record was associated with a cluster, a group of neighbouring households or a ‘village’ that acts as a primary sampling unit. Some surveys included geographical coordinates or precise place names for each cluster within that survey (138,938 clusters for stunting, 144,460 for wasting, and 147,624 for underweight). In the absence of geographical coordinates for each cluster, we assigned data to the smallest available administrative areal unit in the survey (termed a ‘polygon’) while correcting for the survey sample design (16,554 polygons for stunting, 18,833 for wasting, and 19,564 for underweight). Boundary information for these administrative units was obtained as shapefiles either directly from the surveys or by matching to shapefiles in the Global Administrative Unit Layers (GAUL)20 or the Database of Global Administrative Areas (GADM)21. In select cases, shapefiles provided by the survey administrator were used, or custom shapefiles were created based on survey documentation. These areal data were resampled to point locations using a population-weighted sampling approach over the relevant areal unit with the number of locations set proportionally to the number of grid cells in the area and the total weights of all the resampled points summing to one16.

Select data sources were excluded for the following reasons: missing survey weights for areal data, missing sex variable, insufficient age granularity (in months) for calculations of length/height-for-age z-scores and weight-for-age z-scores in children ages 0–2 years, incomplete sampling (for example, only children ages 0–3 years measured), or untrustworthy data (as determined by the survey administrator or by inspection). We excluded data for children for whom we could not compute age in both months and weeks. Children with height values ≤0 cm or ≥180 cm, and/or with weight values ≤0 kg or ≥45 kg were also excluded from the study. We also excluded data that were considered outliers according to the 2006 WHO Child Growth Standards recommended range values, which were values <−6 or >6 length/height-for-age z-score for stunting, <−5 or >5 weight-for-length/height z-score for wasting, and <−6 or >5 weight-for-age z-score for underweight3,4. Details on the survey data excluded for each country are provided in Supplementary Table 6. Data availability plots for all the CGF indicators by country, type, and year are included in Supplementary Figs. 2–16.

Child anthropometry

Using the height, weight, age, and sex data for each individual, height-for-age, weight-for-height, and weight-for-age z-scores were calculated using the age-, sex-, and indicator-specific LMS (lambda-mu-sigma) values from the 2006 WHO Child Growth Standards3,4. The LMS methodology allows for Gaussian z-score calculations and comparisons to be applied to skewed, non-Gaussian distributions22. We classified stunting, wasting, or underweight if the height/length-for-age, weight-for-height/length, or weight-for-age, respectively, was more than two standard deviations (z-scores) below the WHO growth reference population6. These individual-level data observations were then collapsed to cluster-level totals for the number of children sampled and total number of children under five affected by stunting, wasting, or underweight.

Temporal resolution

We estimated the prevalence of stunting, wasting, and underweight annually from 2000 to 2017 using a model that allows us to account for data points measured across survey years. As such, the model would also allow us to predict at monthly or finer temporal resolutions; however, we are limited both computationally and by the temporal resolution of the covariates.

Seasonality adjustment

Owing to the acute nature of wasting and its relative temporal transience, wasting data were pre-processed to account for seasonality within each year of observation. Across LMICs, large proportions of the population live in rural areas and have livelihoods that rely on agriculture and livestock. Seasonality affects the availability of and access to food, sometimes owing to natural disasters or climate events (for example, floods, monsoons, or droughts) that vary by season. Generalized additive models were fit to wasting data across time using the month of interview and a country-level fixed effect as the explanatory variables, and the wasting z-score as the response. A 12-month periodic spline for the interview month was used, as well as a spline that smoothed across the whole duration of the dataset. Once the models were fit, individual weight-for-height/length z-score observations were adjusted so that each measurement was consistent with a day that represented a mean day in the periodic spline. The seasonality adjustment had relatively little effect on the raw data9.

Spatial covariates

To leverage strength from locations with observations to the entire spatiotemporal domain, we compiled several 5 × 5-km raster layers of possible socioeconomic and environmental correlates of CGF in the 105 LMICs (Supplementary Table 7, Supplementary Fig. 17). Covariates were selected based on their potential to be predictive for the set of CGF indicators, after reviewing literature on evidence and plausible hypotheses as to their influence. Acquisition of temporally dynamic datasets, where possible, was prioritized to best match our observations and thus predict the changing dynamics of the CGF indicators. Of the twelve covariates included, eight were temporally dynamic and were reformatted as a synoptic mean over each estimation period or as a mid-period year estimate: these covariates included average daily mean rainfall (precipitation), average daily mean temperature, enhanced vegetation index, fertility, malaria incidence, educational attainment in women of reproductive age (15–49 years old), population, and urbanicity. The remaining four covariate layers were static throughout the study period and were applied uniformly across all modelling years; growing season length, irrigation, nutritional yield for vitamin A, and travel time to nearest settlement of >50,000 inhabitants.

To select covariates and capture possible nonlinear effects and complex interactions between them, an ensemble covariate modelling method was implemented23. For each region, three sub-models were fit to our dataset using all of our covariate data as explanatory predictors; these sub-models were: generalized additive models, boosted regression trees, and lasso regression. Each sub-model was fit using fivefold cross-validation to avoid overfitting, and the out-of-sample predictions from across the five holdouts were compiled into a single comprehensive set of predictions from that model. In addition, the same sub-models were run using 100% of the data, and a full set of in-sample predictions were created. The three sets of out-of-sample sub-model predictions were fed into the full geostatistical model14 as the explanatory covariates when performing the model fit. The in-sample predictions from the sub-models were used as the covariates when generating predictions using the fitted full geostatistical model. A recent study demonstrated that this ensemble approach can improve predictive validity by up to 25% over an individual model23.

Geostatistical model analysis

Binomial count data were modelled within a Bayesian hierarchical modelling framework using a logit link function and a spatially and temporally explicit hierarchical generalized linear regression model to fit prevalence of each of our indicators in 14 regions24 of LMICs (North Africa, western SSA, central SSA, eastern SSA, southern SSA, Middle East, Central Asia, East Asia, South Asia, Southeast Asia, Oceania, Central America and the Caribbean, Andean South America, and Tropical South America; see Extended Data Fig. 10). For each region, we explicitly wrote the hierarchy that defines our Bayesian model.

For each binomial CGF indicator, we modelled the average number of children with stunting, wasting, or who were underweight in each survey cluster, d. Survey clusters are precisely located by their GPS coordinates and year of observation, which we map to a spatial raster location, i, at time, t. We observed the number of children reported to be stunted, wasted, or underweight, respectively, as binomial count data, C d , among an observed sample size, N d . As we may have observed several data clusters within a given location, i, at time, t, we refer to the probability of stunting, wasting, or underweight, p, within a given cluster, d, by its indexed location, i, and time, t, as p i(d),t(d) .

$$\begin{array}{l}{C}_{d}|{p}_{i(d),t(d)},\,{N}_{d}\sim {\rm{B}}{\rm{i}}{\rm{n}}{\rm{o}}{\rm{m}}{\rm{i}}{\rm{a}}{\rm{l}}({p}_{i(d),t(d)},\,{N}_{d})\,{\rm{\forall }}\,{\rm{o}}{\rm{b}}{\rm{s}}{\rm{e}}{\rm{r}}{\rm{v}}{\rm{e}}{\rm{d}}\,{\rm{c}}{\rm{l}}{\rm{u}}{\rm{s}}{\rm{t}}{\rm{e}}{\rm{r}}{\rm{s}}\,d\end{array}$$

$$\begin{array}{l}{\rm{l}}{\rm{o}}{\rm{g}}{\rm{i}}{\rm{t}}({p}_{i,t})=\,{\beta }_{0}+{{\bf{X}}}_{i,t}{\boldsymbol{\beta }}+{Z}_{i,t}+{{\epsilon }}_{{\rm{c}}{\rm{t}}{\rm{r}}(i)}+{{\epsilon }}_{i,t}+{Z}_{i,t}\,{\rm{\forall }}\\ i\in {\rm{s}}{\rm{p}}{\rm{a}}{\rm{t}}{\rm{i}}{\rm{a}}{\rm{l}}\,{\rm{d}}{\rm{o}}{\rm{m}}{\rm{a}}{\rm{i}}{\rm{n}}\,{\rm{\forall }}\,t\in {\rm{t}}{\rm{i}}{\rm{m}}{\rm{e}}\,{\rm{d}}{\rm{o}}{\rm{m}}{\rm{a}}{\rm{i}}{\rm{n}}\end{array}$$

$$\begin{array}{l}\mathop{\sum }\limits_{h=1}^{3}{\beta }_{h}\,=1\\ {{\epsilon }}_{{\rm{c}}{\rm{t}}{\rm{r}}}\sim {\rm{i}}{\rm{i}}{\rm{d}}\,{\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(0,\,{\gamma }^{2})\\ {{\epsilon }}_{i,t}\sim {\rm{i}}{\rm{i}}{\rm{d}}\,{\rm{N}}{\rm{o}}{\rm{r}}{\rm{m}}{\rm{a}}{\rm{l}}(0,\,{\sigma }^{2})\\ {\bf{Z}}\sim {\rm{G}}{\rm{P}}(0,{\Sigma }^{{\rm{s}}{\rm{p}}{\rm{a}}{\rm{c}}{\rm{e}}}\otimes {\Sigma }^{{\rm{t}}{\rm{i}}{\rm{m}}{\rm{e}}})\\ {\Sigma }^{{\rm{s}}{\rm{p}}{\rm{a}}{\rm{c}}{\rm{e}}}=\,\frac{{\omega }^{2}}{\Gamma (

u ){2}^{v-1}}\times {(\kappa D)}^{

u }\times {{\rm K}}_{

u }(\kappa D)\\ {\Sigma }_{j,\,k}^{{\rm{t}}{\rm{i}}{\rm{m}}{\rm{e}}\,}={\rho }^{|k-j|}\end{array}$$

For indices d, i, and t, *(index) is the value of * at that index. The probabilities, p i,t , represent both the annual prevalence at the space–time location and the probability that an individual child was afflicted with the risk factor given that they lived at that particular location. The annual prevalence, p i,t , of each indicator was modelled as a linear combination of the three sub-models (generalized additive model, boosted regression trees, and lasso regression), rasterized covariate values, X i,t , a correlated spatiotemporal error term, Z i,t , and country random effects, ϵ ctr(i) , with one unstructured country random effect fit for each country in the modelling region and all ϵ ctr sharing a common variance parameter, γ2, and an independent nugget effect, ϵ i,t , with variance parameter, σ2. Coefficients in β h in the three sub-models h = 1, 2, 3 represent their respective predictive weighting in the mean logit link, while the joint error term, Z i,t , accounts for residual spatiotemporal autocorrelation between individual data points that remains after accounting for the predictive effect of the sub-model covariates, the country-level random effect, ϵ ctr(i) , and the nugget independent error term, ϵ i,t . The residuals, Z i,t , are modelled as a three-dimensional Gaussian process (GP) in space–time centred at zero and with a covariance matrix constructed from a Kronecker product of spatial and temporal covariance kernels. The spatial covariance, Σspace, is modelled using an isotropic and stationary Matérn function25, and temporal covariance, Σtime, as an annual autoregressive (AR1) function over the 18 years represented in the model. In the stationary Matérn function, Γ is the gamma function, Κ v is the modified Bessel function of order v > 0, κ > 0 is a scaling parameter, D denotes the Euclidean distance, and ω2 is the marginal variance. The scaling parameter, κ, is defined to be \(\kappa =\sqrt{8v}/\delta \) in which δ is a range parameter (which is about the distance where the covariance function approaches 0.1) and v is a scaling constant, which is set to 2 rather than fit from the data26,27. This parameter is difficult to reliably fit, as documented by many other analyses26,28,29 that set this to 2. The number of rows and the number of columns of the spatial Matérn covariance matrix are both equal to the number of spatial mesh points for a given modelling region. In the AR1 function, ρ is the autocorrelation function (ACF), and k and j are points in the time series where |k − j| defines the lag. The number of rows and the number of columns of the AR1 covariance matrix are both equal to the number of temporal mesh points (18). The number of rows and the number of columns of the space–time covariance matrix, Σspace ⊗ Σtime, for a given modelling region are both equal to: (the number of spatial mesh points × the number of temporal mesh points).

This approach leveraged the residual correlation structure of the data to more accurately predict prevalence estimates for locations with no data, while also propagating the dependence in the data through to uncertainty estimates14. The posterior distributions were fit using computationally efficient and accurate approximations in R-INLA30,31 (integrated nested Laplace approximation) with the stochastic partial differential equations (SPDE)27 approximation to the Gaussian process residuals using R project v.3.5.1. The SPDE approach using INLA has been demonstrated elsewhere, including the estimation of health indicators, particulate air matter, and population age structure9,32,33,34,35. Uncertainty intervals were generated from 1,000 draws (that is, statistically plausible candidate maps)36 created from the posterior-estimated distributions of modelled parameters. Further details on model and estimation processes are provided in the Supplementary Information.

Post estimation

To leverage national-level data included in the 2017 GBD study1 that were not within the scope of our current geospatial modelling framework, and to ensure alignment between these estimates and GBD national-level and subnational estimates, we performed a post hoc calibration to the mean of the 1,000 draws. We calculated population-weighted aggregations to the GBD estimate level, which was either at the national or first administrative level, and compared these estimates to our corresponding year estimates from 2000 to 2017. We defined the calibration factor to be the ratio between the GBD estimates and our current estimates for each year from 2000 to 2017. For some selected countries where GBD estimates were at the first administrative level, the calibration factors were also calculated at the lowest available subnational level. These countries included Brazil, China, Ethiopia, India, Indonesia, Iran, Mexico, and South Africa. Finally, we multiplied each of our estimates in a country-year (or first-administrative-year) by its associated factor. This ensures consistency between our geospatial estimates and those of the 2017 GBD1, while preserving our estimated within-country geospatial and temporal variation. To transform grid-cell-level estimates into a range of information useful to a wide constituency of potential users, these estimates were aggregated at first and second administrative-level units specific to each country and at national levels using conditional simulation37.

Although the models can predict all locations covered by available raster covariates, all final model outputs for which land cover was classified as ‘barren or sparsely vegetated’ on the basis of the most recently available Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data (2013) were masked38. Areas where the total population density was less than ten individuals per 1 × 1-km grid cell were also masked in the final outputs.

Model validation

We assessed the predictive performance of the models using fivefold out-of-sample cross-validation strategies and found that our prevalence estimates closely matched the survey data. To offer a more stringent analysis by respecting some of the spatial correlation in the data, holdout sets were created by combining sets of data at different spatial resolutions (for example, first administrative level). Validation was performed by calculating bias (mean error), variance (root mean square error), 95% data coverage within prediction intervals, and correlation between observed data and predictions. All validation metrics were calculated on the out-of-sample predictions from the fivefold cross-validation. Furthermore, measures of spatial and temporal autocorrelation pre- and post-modelling were examined to verify correct recognition, fitting, and accounting for the complex spatiotemporal correlation structure in the data. All validation procedures and corresponding results are included in Supplementary Tables 14–22 and Supplementary Figs. 24–41.

Projections

To compare our estimated rates of improvement in CGF prevalence over the last 18 years with the improvements needed between 2017 and 2025 to meet WHO GNTs, we performed a simple projection using estimated annualized rates of change (AROC) applied to the final year of our estimates.

For each CGF indicator, u, we calculated AROC at each grid cell, m, by calculating the AROC between each pair of adjacent years, t:

$${{\rm{AROC}}}_{u,m,t}={\rm{logit}}\left(\frac{{p}_{u,m,t}}{{p}_{u,m,t-1}}\right)$$

We then calculated a weighted AROC for each indicator by taking a weighted average across the years, where more recent AROCs were given more weight in the average. We defined the weights to be:

$${W}_{t}={(t-2000+1)}^{\gamma }$$

in which γ may be chosen to give varying amounts of weight across the years. For any indicator, we then calculated the average AROC to be:

$${{\rm{AROC}}}_{u,m}={\rm{logit}}(\mathop{\sum }\limits_{2001}^{2017}{W}_{t}\times {{\rm{AROC}}}_{u,m,t})\,$$

Finally, we calculated the projections, Proj, by applying the AROC in our 2017 mean prevalence estimates to produce estimates in 8 years from 2017 to 2025. For this set of projections, we selected γ = 1.7 for stunting, γ = 1.9 for wasting, and γ = 1.8 for underweight1.

$${{\rm{P}}{\rm{r}}{\rm{o}}{\rm{j}}}_{u,m,2025}={{\rm{l}}{\rm{o}}{\rm{g}}{\rm{i}}{\rm{t}}}^{-1}({\rm{l}}{\rm{o}}{\rm{g}}{\rm{i}}{\rm{t}}({p}_{u,m,2017})+{{\rm{A}}{\rm{R}}{\rm{O}}{\rm{C}}}_{u,m}\times 8)$$

This projection scheme is analogous to the methods used in the 2017 GBD measurement of progress and projected attainment of health-related Sustainable Development Goals1. Our projections are based on the assumption that areas will sustain the current AROC, and the precision is dependent on the level of uncertainty emanating from the estimation of annual prevalence.

Although the WHO GNT for wasting was to reduce prevalence to less than 5%, the WHO GNT for stunting was a 40% relative reduction in prevalence. For our analyses, we defined the WHO GNT for stunting and underweight (for which no WHO GNT was established) to be 40% reduction relative to 2010, the year the World Health Assembly requested the development of the WHO GNTs39.

Limitations

The accuracy of our models depends on the volume, representativeness, quality, and validity of surveys available for analysis (Supplementary Tables 4, 5, Supplementary Figs. 2–16). Persistent data gaps in national surveys include a lack of CGF data or household-level characteristics, such as hygiene and sanitation practices. The associated uncertainties of our estimates are higher in areas where data are either missing or less reliable (Figs. 1d, 2d, Extended Data Fig. 5d), and rely more heavily on covariates and borrowing from neighbouring areas for their modelling (Supplementary Table 7, Supplementary Fig. 17). Investments in improvements of health surveillance systems and including child anthropometrics as part of routine data collection for profiling population characteristics could improve the certainty of our estimates and better monitor progress towards international goals. In addition, measurement error in collecting anthropometric information, including the child’s age, height, and weight, could have introduced bias or error in the data across different survey types. The accuracy of age data may be affected by differences in sampling approaches and self-reporting bias, such as long recall period or selective recall. Weight and height measurements may be inaccurate owing to improper calibration of equipment, device inaccuracy, different measurement methods, or human error. We did not include a survey random effect to account for between-survey variability in data accuracy; given that most surveys represent a country-year, it would be difficult to distinguish these biases from temporal effects. Our calibration approach in the post-estimation process used only a ratio estimator and did not account for an additive effect, which may have introduced bias. Owing to the complexity of the boosted regression tree sub-model, we were unable to account for the uncertainty of our three sub-models in our final estimates (see Supplementary Information section 3.2.2 for more detail). It is worth noting that our analyses are descriptive and do not support causal inferences on their own. Future research is required to determine the causal pathways for each CGF indicator across and within LMICs.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.