Forest inventory and climate data

We assembled forest inventory data from the United States Forest Inventory and Analysis (FIA) program and the Canadian Permanent Sample Plots (PSP) program in six provinces (British Columbia, Alberta, Saskatchewan, Manitoba, Ontario, Quebec). The FIA program applies a nationally standardized sampling protocol with a sampling intensity of one plot per 2428 ha31. FIA inventory plots in forested areas consist of four 7.2 m fixed-radius subplots spaced 36.6 m apart in a triangular arrangement with one subplot in the center. All trees (standing live and dead), with a diameter at breast height (DBH) of at least 12.7 cm, are inventoried in each subplot. Within each subplot, a 2.07 m radius microplot offset 3.66 m from subplot center is further established where only live trees with a DBH between 2.5 and 12.7 cm are inventoried. All stems are measured and identified to species. For each plot, the age is determined by coring three dominant or co-dominant trees that represent a plurality of non-overtopped trees. The stand age is estimated as the average of these three trees31. The PSP program applies slightly different field protocols with FIA, with varying plot size and minimum measured DBH in six provinces32. Each PSP plot was designed as one square or rectangular plot in shape, or consist of four squared subplots. The average PSP plot size is 0.07 ha, ranging from 0.04 to 0.81 ha. All live trees with DBH > 1 cm were measured for most PSP plots, and the live trees with DBH > 5 cm were measured for a small group of PSP plots.

In this analysis, the FIA and PSP data were extracted from two periods: 2000–2016 (current) and 1990–1999 (past). We used the current period to develop a forest biomass–age relationship and the past period to independently validate this relationship. We excluded plots that reported any natural or human-caused disturbances, such as fire, logging. For the current period, we collected 67,065 FIA plots and 10,342 PSP plots. For the past period, we collected 54,127 FIA plots and 8733 PSP plots. The stand age was directly reported from plot estimates of coring trees. Our assumption was not that forest stands across North America are even-aged; rather we assumed that the age of the dominant or co-dominant trees represents the age of the forest ecosystem. For each plot, the aboveground biomass was estimated using allometric equations from DBH measurements33 and was summed across all live trees to obtain a live-tree plot biomass34. We classified these plots into 23 common forest types in North America. For the FIA plots, the forest types were included in the raw data, as derived from the dominant species to reflect the main species composition35. For the PSP plots, we classified the forest types following a similar approach by assigning the first 1–3 species with the highest percentage in aboveground biomass as the dominant species for each plot, and then we matched the forest types with the FIA classification.

Climate data in this study were extracted from ClimateNA version 5.1036. For forest inventory plots in both the past and current periods, we extracted the mean annual temperature and precipitation resampled at 1 km resolution using PRISM data37 from ClimateNA. For the future period, we extracted the downscaled (1 km) and calibrated (bias corrected) mean annual temperature and precipitation using the Coupled Model Intercomparison Project phase 5 (CMIP5) database corresponding to the 5th IPCC Assessment Report38. We used the ensemble projections averaged across the 15 CMIP5 models, under the Representative Concentration Pathway (RCP) 4.5 (low emission) and RCP8.5 (high emission) scenarios, for three future periods: the 2020s, 2050s, and 2080s. We acknowledged that the FIA plot coordinates have been perturbed in an unbiased direction not exceeding 1.67 km, and typically within a 0.8 km radius of the actual plot location, so as to facilitate study repeatability without introducing bias39. However, this perturbation is similar to the spatial resolution of our climate data (1 km). We therefore used the publicly available perturbed plot coordinates to match the forest inventory with climate data.

Growth model selection

Prior to formal modeling, we performed an exploratory data analysis to select the most appropriate growth model. A growth model quantifies how forest biomass changes with stand age,

$$\begin{array}{*{20}{c}} {y_i = f\left( {x_i} \right) + {\it{\epsilon }}_i} \end{array}$$ (1)

where for plot i, y i is the aboveground biomass, x i is the stand age, and ε i is the error term. In theory, a growth model has to go through the origin for the biomass–age relationship, i.e., \(f\left( 0 \right) = 0\). Thus, we tested the following four commonly used growth models. Linear growth model, where the biomass increases with the age linearly.

$$\begin{array}{*{20}{c}} {f\left( {x_i} \right) = \beta x_i} \end{array}$$ (2)

Exponential growth model, where the biomass is a function of the natural-logarithm-transformed age9.

$$\begin{array}{*{20}{c}} {f\left( {x_i} \right) = \beta \log \left( {x_i + 1} \right)} \end{array}$$ (3)

Chapman–Richards growth model, where the biomass increases with the age and reaches an asymptote40.

$$\begin{array}{*{20}{c}} {f\left( {x_i} \right) = \mu \left( {1 - \exp \left( { - kx_i} \right)} \right)} \end{array}$$ (4)

Monod (Michaelis–Menten) growth model, where the biomass also increases with the age and reaches an asymptote4.

$$\begin{array}{*{20}{c}} {f\left( {x_i} \right) = \mu \frac{{x_i}}{{k + x_i}}} \end{array}$$ (5)

We fitted these four growth models to the data by different forest types (Supplementary Figure 1) and calculated their Akaike information criterion (AIC) scores. AIC measures the relative quality of models for a given set of data, by estimating the information lost when the model represents the process that generates the data. Practically, AIC rewards goodness of fit (likelihood) and penalizes overfitting (number of parameters). A lower AIC indicates a preferred model. Supplementary Figure 2 shows that, by AIC scores, the non-saturating models (linear and exponential) are outcompeted by the saturating models (Chapman–Richards and Monod). The Chapman–Richards model and Monod (Michaelis–Menten) model have similar AIC scores. Empirically, the Monod model growth trajectory is determined by two parameters with simple and clear definitions—the asymptote for the saturated biomass (μ) and the age when the plot reaches the half-saturation (k). We therefore chose the Monod model for similar performance and simplicity.

Hierarchical Bayesian growth model

Our analysis quantifies the forest biomass recovery with stand age, with considerations of climate, in the past, current, and future forests across North America, in a hierarchical Bayesian (HB) framework. Theoretically, the HB framework was motivated by the transient dynamics of terrestrial carbon storage10. In a steady state environment, a growth function mathematically describes forest recovery in an autonomous system. When the environmental conditions are undergoing changes, the carbon storage capacity is no longer a constant but becomes a moving attractor, toward which the ecosystem carbon storage trajectory chases. In this non-steady state environment, a hierarchical growth model describes modified forest recovery in a non-autonomous system. A similar HB framework was recently developed to quantify climatic controls of postfire plant regeneration in South Africa41.

Our road map can be divided into three steps. First, we developed a HB growth model of the forest biomass, stand age, and climate relationship based on forest inventory and climate data in the current period (2000–2016). For each plot, the biomass recovery with stand age was effectively described by a Monod function, where the biomass gradually increases with the age but eventually saturates4. The Monod recovery trajectory is further determined by forest types and modified by climate conditions10. Second, we validated our model by using the current period to independently hindcast (backward predict) forest biomass in the past period (1990–1999). The model parameters obtained from the current period were used to hindcast the past biomass based on the past stand age and climate. The comparison of hindcast biomass vs. observed biomass provided an additional test of model performance and assumptions. Third, we used the validated model, future stand age, and future climate to predict forest biomass in the future periods of the 2020s, 2050s, and 2080s. We assumed no major disturbances as the best-case scenario for biomass recovery. Our final step was to compare the current biomass with the predicted future biomass. With this road map, we describe our analysis for the current, past, and future periods.

Current period—For the forest plot i in forest-type j, we modeled the aboveground biomass (y ij ) and stand age (x ij ) using the Monod function,

$$\begin{array}{*{20}{c}} {y_{ij} = \mu _{ij}\frac{{x_{ij}}}{{k_{ij} + x_{ij}}} + {\it{\epsilon }}_{ij},{\it{\epsilon }}_{ij}\sim N\left( {0,\sigma _j^2} \right)} \end{array}$$ (6)

where μ ij is the asymptote for the saturated biomass that a plot can achieve, k ij is the age when the plot reaches the half-saturation (μ ij /2), and ε ij is the normal error term with a forest-type-level variance. The Monod function assumes that the forests recover by increasing biomass with stand age, but eventually reach an asymptote of the biological capacity, as suggested by both forest succession theory11 and empirical global meta-analysis42. The Monod recovery trajectory is governed by the two critical parameters (μ ij , k ij ), which are assumed to further depend on the climate covariates of temperature (T ij ) and precipitation (P ij ) in each plot, motivated by the model of climate dependence in biomass accumulation rates43. The climate covariates (T ij , P ij ) were centered to facilitate interpretation.

$$\begin{array}{*{20}{c}} {\mu _{ij} = \beta _{0j} + \beta _{1j}\left( {T_{ij} - \bar T_j} \right) + \beta _{2j}\left( {P_{ij} - \bar P_j} \right)} \end{array}$$ (7)

$$\begin{array}{*{20}{c}} {k_{ij} = \gamma _{0j} + \gamma _{1j}\left( {T_{ij} - \bar T_j} \right) + \gamma _{2j}\left( {P_{ij} - \bar P_j} \right)} \end{array}$$ (8)

where β 0j quantifies the asymptotic saturated biomass on an average climate condition, β 1j quantifies the saturated biomass change per 1 °C change in temperature, with \(\bar T_j\) being the average temperature in forest-type j, and β 2j quantifies the saturated biomass change per 1 mm change in precipitation, with \(\bar P_j\) being the average precipitation in forest-type j; γ 0j quantifies the half-saturation stand age on an average climate condition, γ 1j quantifies the half-saturation age change per 1 °C change in temperature, and γ 2j quantifies the half-saturation age change per 1 mm change in precipitation. All parameters were assigned priors sufficiently noninformative, so that the posterior estimates were driven by the observed data.

$$\begin{array}{*{20}{c}} {\beta _{0j},\beta _{1j},\beta _{2j},\gamma _{0j},\gamma _{1j},\gamma _{2j}\sim U\left( { - 10^5,10^5} \right)} \end{array}$$ (9)

$$\begin{array}{*{20}{c}} {\sigma _j^2\sim \mathrm{IG}\left( {10^{ - 5},10^{ - 5}} \right)} \end{array}$$ (10)

where U is the uniform distribution, and IG is the inverse gamma distribution. We also tested weakly informative priors on \(\beta _{0j},\gamma _{0j}\sim U(0,10^5)\), implying the positive asymptotic saturated biomass and half-saturation stand age on an average climate condition, and we obtained almost identical posterior estimates. The model performance was checked using in-sample predictions by composite sampling. As another way of model checking, we performed a spatial cross-validation by randomly selecting 75% of the plots as the training dataset and 25% of the plots as the testing dataset. We fitted the model using the training plots, predicted biomass in the testing plots, and compared against the observed biomass. A good agreement between the out-of-sample predicted and observed biomass would verify the model assumptions and performance.

Past period—The model fitted using the current data assumes that the recovery trajectory in each plot differs by the spatial variation in climate, which can substitute the temporal variation in climate over the recovery e.g., ref.9 To validate this space-for-time assumption, we conducted an independent hindcast of the past biomass based on the stand age, climate, and fitted parameters from the current model. Plot distributions of both the past and current periods also show the dynamics of forest succession (Supplementary Figure 3). To conduct the test, for each plot i in forest-type j, the observed stand age in the past period (x ij ), we obtained out-of-sample predictions of biomass (\(\hat y_{ij}\)), which is independent of the observed biomass in the past period.

$$\begin{array}{*{20}{c}} {\hat y_{ij} = \hat \mu _{ij}\frac{{x_{ij}}}{{\hat k_{ij} + x_{ij}}}} \end{array}$$ (11)

$$\begin{array}{*{20}{c}} {\hat \mu _{ij} = \hat \beta _{0j} + \hat \beta _{1j}\left( {T_{ij} - \bar T_j} \right) + \hat \beta _{2j}\left( {P_{ij} - \bar P_j} \right)} \end{array}$$ (12)

$$\begin{array}{*{20}{c}} {\hat k_{ij} = \hat \gamma _{0j} + \hat \gamma _{1j}\left( {T_{ij} - \bar T_j} \right) + \hat \gamma _{2j}\left( {P_{ij} - \bar P_j} \right)} \end{array}$$ (13)

where \(\hat \beta\)’s and \(\hat \gamma\)’s are fitted parameters from the current model. The model hindcast biomass (\(\hat y_{ij}\)) was first calculated from Eqs. (11–13) and then compared against the observed biomass. A good agreement between the independent hindcast and the observed biomass would suggest the current model performed well under the assumptions.

Future period—We assumed that the future North American forests continue to recover without major disturbances (e.g., fire, pest). This simplified assumption led to the best-case scenario of forest recovery, and it was likely to over-predict the future biomass. Our goal was not to accurately project the future biomass; rather we aimed to quantify how much the forest biomass can grow (biomass potential) given no further disturbances as the best-case scenario. With the validated model, we predicted the future biomass (\(\tilde y_{ij}\)) based on the future climate and stand age (\(\tilde x_{ij}\)).

$$\begin{array}{*{20}{c}} {\tilde y_{ij} = \tilde \mu _{ij}\frac{{\tilde x_{ij}}}{{\tilde k_{ij} + \tilde x_{ij}}}} \end{array}$$ (14)

$$\begin{array}{*{20}{c}} {\tilde \mu _{ij} = \hat \beta _{0j} + \hat \beta _{1j}\left( {\tilde T_{ij} - \bar T_j} \right) + \hat \beta _{2j}\left( {\tilde P_{ij} - \bar P_j} \right)} \end{array}$$ (15)

$$\begin{array}{*{20}{c}} {\tilde k_{ij} = \hat \gamma _{0j} + \hat \gamma _{1j}\left( {\tilde T_{ij} - \bar T_j} \right) + \hat \gamma _{2j}\left( {\tilde P_{ij} - \bar P_j} \right)} \end{array}$$ (16)

where \(\tilde T_{ij}\) and \(\tilde P_{ij}\) are the projected mean annual temperature and precipitation in the 2020s, 2050s, or 2080s. With no further disturbances, the future stand age was extrapolated from the current stand age extending to the future. The predicted future biomass \(\left( {\tilde y_{ij}} \right)\) was compared with the current biomass \(\left( {y_{ij}} \right)\). A ratio of current vs. future biomass \(\left( {y_{ij}/\tilde y_{ij}} \right)\) is defined to quantify the forest recovery in reference to the best-case biomass potential. A ratio close to one would suggest that the forests have limited remaining growth potential. Note that the actual ratio is likely to be higher given further disturbances would result in lower future forest biomass.

For the HB growth model, posterior distributions were simulated using Markov chain Monte Carlo (MCMC), and convergence was checked by both visually assessing trace plots and Geweke diagnostics after 100,000 iterations for five Markov chains. All the analyses were performed in R version 3.4.344 and JAGS version 4.3.045.

Data availability

The United States forest inventory data are available at https://www.fia.fs.fed.us/. The Canadian forest inventory data are available from forestry sectors in each province. The climate data are available at https://adaptwest.databasin.org/pages/adaptwest-climatena.

Code availability

The code used in this study is available from the corresponding author on request.