Vegetation plot data

Tree community data consisting of measurements of all trees ≥10 cm diameter at breast height (DBH) were obtained from 58 vegetation plots with a total sampled area of 55 ha from 10 sites spanning the humid tropics (Supplementary Table 4). Data were obtained from three sources: (a) 1 ha plot data available from the Tropical Ecology Assessment & Monitoring Network (TEAM) open data portal (http://www.teamnetwork.org/), (b) 1 ha plot data from an open access data set published by Ramesh et al.40 and (c) 0.5 ha plot data from an open access data set published by Bradford et al.41. For multi-temporal data sets containing repeated measures, only data from single surveys conducted during 2011–2013 were considered. Species names were checked and standardized in accordance with currently accepted taxonomies42. Tree DBH values were cross-checked against measurements from previous survey periods, when available, to detect and correct obvious errors. Entries for conifers (0.03% of all individuals recorded in plots) and woody vegetation other than trees, such as lianas (0.09%) and palms (5%), were excluded because the general allometric equations used for biomass estimation (see below) are not applicable to these groups. However, because palms are quite common in some tropical forests, we compared simulations run with palms excluded against simulations run with palms included (using the same biomass equation for palms as for trees43), and found that excluding palms had little influence on overall carbon responses (Supplementary Fig. 6). The data set comprised 26,512 trees, including 23,822 trees identified to the species level and 1,660 identified only to the genus level. The remaining 1,030 trees were not identified to the genus level and were excluded from subsequent analyses.

Functional traits data

Data were collected on seed dispersal mode, seed length, wood density and maximum attainable height for the species in the vegetation plot data sets (key secondary data sources listed in Supplementary References 45–77). First, species were classified based on seed dispersal mode into four categories: (a) animal (vertebrates and invertebrates), (b) wind, (c) unassisted (explosive dehiscence, gravity and water) and (d) multiple (modes combining a with b and/or c). Groups a and d were then aggregated under the category of animal-dispersed species, while groups b and c were grouped as abiotically dispersed species. Data on seed dispersal modes were primarily obtained from the Seed Information Database of Kew Gardens44, as well as from floras and scientific studies focusing on seed dispersal in tropical forests. In the absence of species-level classifications, dispersal mode was assigned based on data from congeners or using resources that provide genus-level classifications45,46.
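The aggregation of the four dispersal categories into two groups, with the genus-level fallback, can be sketched as follows. This is a minimal illustration in Python (the original simulations were written in R); the labels and the function name `dispersal_group` are hypothetical and not taken from the authors' code.

```python
# Hypothetical labels for the two aggregated groups described in the text.
ANIMAL = "animal-dispersed"
ABIOTIC = "abiotically dispersed"

# Categories (a)-(d) mapped onto the two groups used downstream.
DISPERSAL_GROUP = {
    "animal": ANIMAL,       # (a) vertebrates and invertebrates
    "wind": ABIOTIC,        # (b)
    "unassisted": ABIOTIC,  # (c) explosive dehiscence, gravity, water
    "multiple": ANIMAL,     # (d) animal combined with an abiotic mode
}


def dispersal_group(species_mode, genus_mode=None):
    """Return the aggregated group, using the genus-level mode only
    when no species-level classification is available."""
    mode = species_mode if species_mode is not None else genus_mode
    if mode is None:
        return None  # no information at either level
    return DISPERSAL_GROUP[mode]
```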

Species' seed sizes were estimated as the average length (cm) of the longest seed axis. This metric was selected to index seed sizes in this study because (a) it is the most widely reported measure of seed size, and (b) it shares generally consistent relationships with dimensions of other seed axes13,47. Comparison of 295 tropical tree species in our database showed that seed lengths were closely correlated with seed dry weights (Spearman correlation R_s = 0.87, P < 0.0001), another trait widely used to index seed size in studies of animal seed dispersal. Seed lengths were obtained from direct measurements, as well as from regional floras accessed through online databases, the Biodiversity Heritage Library (http://www.biodiversitylibrary.org/), journal articles and digital images of specimens. In the case of images, seed lengths were estimated using the reference scales provided. Large-seeded animal-dispersed species were classified as those with seed lengths above the 75th percentile amongst all animal-dispersed species within each community (Supplementary Data 1). This approach to defining large-seeded species was used because although size distributions of both seeds and their dispersers are known to vary considerably across tropical regions46, guideline values for defining large-seeded species are not available for all sites. However, where such values were available from other sites in the Americas (1.5 cm) (refs 48, 49), Africa (1.8 cm) (ref. 13) and the Orient (1.5–2.0 cm; refs 15, 50), we found them to be similar to the 75th percentile cutoffs used in this study. When species-level seed lengths were not available for defining large-seeded species, proxies derived from genus-level information were used to classify species for the simulations. These values were obtained from the following sources in decreasing order of preference:

1. Seed length as a function of fruit length: data on fruit lengths are generally reported alongside seed lengths in most flora resources, and are often available even when seed lengths are not reported. We developed genus-wise linear models of seed length as a function of fruit length, and observed a strong fit amongst models (mean R² = 0.92) based on an assessment of 982 species across 139 genera. Thereafter, in genera that had at least three species contributing to the model, these models were used to predict seed lengths for congeners which only had data on fruit lengths.

2. Genus-level seed length: genus-level seed lengths were obtained from Dennis et al.45. Based on a comparison of 1,125 species across 325 genera, we observed that seed lengths at the species and genus levels are strongly positively correlated (R_p = 0.69, P < 0.0001).
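The order of preference above can be illustrated with a short Python sketch. It assumes an ordinary least-squares fit per genus; the text does not state how the linear models were fitted. The function names and the `congeners` data layout are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_seed_length(fruit_length, congeners, genus_seed_length=None,
                       min_species=3):
    """Proxy seed length (cm) for a species lacking a measured value.

    congeners: list of (fruit_length, seed_length) pairs, one per species
    of the same genus with both measurements.
    Preference 1: genus-wise linear model, used only when at least
    `min_species` species contribute to it.
    Preference 2: the genus-level seed length.
    """
    if fruit_length is not None and len(congeners) >= min_species:
        xs, ys = zip(*congeners)
        if len(set(xs)) > 1:  # a slope needs variation in fruit length
            slope, intercept = fit_line(xs, ys)
            return slope * fruit_length + intercept
    return genus_seed_length
```

For example, with three congeners whose seeds are half as long as their fruits, a species with a 5 cm fruit is assigned a 2.5 cm seed; with only two congeners the genus-level value is used instead.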

Wood density (g cm−3) data were obtained from primary sources and from the Global Wood Density Database51,52. When using secondary sources, only data that were collected in the same continent as the location of the target species were used, that is, for example, only records from Africa contributed to wood density estimates for African species, and so on. In cases where species-level wood densities were not available, average values across members of the same genus or family were assigned. These genus- and family-level estimates were used only in the simulations, and not for assessing relationships with other species traits.
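The species, genus, family fallback for wood density can be sketched as below. This is an illustrative Python sketch only: the record layout and the function name `assign_wood_density` are hypothetical, and the records are assumed to have already been restricted to the continent of the target species.

```python
from statistics import mean


def assign_wood_density(species, genus, family, records):
    """Return (wood_density, level) for a target species.

    records: list of dicts with keys "species", "genus", "family" and
    "wd" (g cm^-3), assumed to come from the same continent as the
    target species. The most specific taxonomic level with data wins;
    genus- and family-level values are averages across matching records.
    """
    for level, value in (("species", species),
                         ("genus", genus),
                         ("family", family)):
        values = [r["wd"] for r in records if r[level] == value]
        if values:
            return mean(values), level
    return None, None
```

Returning the level alongside the value makes it straightforward to keep genus- and family-level estimates out of trait-relationship analyses while still using them in the simulations.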

Maximum attainable heights (m) were obtained from regional floras accessed through online databases and the Biodiversity Heritage Library (http://www.biodiversitylibrary.org/). Among species for which data on maximum height and maximum diameter were available from floras, there was a strong and positive correlation between these two indices of adult size (R_s = 0.75, P < 0.0001, N = 813).

Carbon storage estimation

Carbon stored by individual trees was estimated using the following equation developed by Chave et al.22 for moist forests:

where W is wood density in g cm−3, and D is DBH in cm.

This equation was selected because (a) it was developed from a large pan-tropical data set of tree measurements, and (b) it does not require height measurements for carbon estimation (height-diameter relationships are implicit), which is crucial because the plot data sets used here did not contain information on tree heights. Moreover, the structure of equation 1 permits ready deconstruction into its wood density (W, term 1) and volume (term 2 comprising the DBH-related terms within the exponent) components. Stand-level carbon stocks and volume estimates were obtained by summing over all individuals within each site, and dividing by the total sampled area for per-hectare estimates. Stand-level wood density was estimated as the average wood density across species, weighted by basal area25.
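The decomposition into wood density and volume terms, and the stand-level summaries, can be sketched in Python as follows. Two assumptions are made here that the text does not confirm: that the equation referred to is the diameter-only moist-forest model of Chave et al. (2005), AGB = W × exp(−1.499 + 2.148 ln D + 0.207 (ln D)² − 0.0281 (ln D)³) with AGB in kg, and that carbon is taken as a fixed fraction (0.5) of biomass. All function and key names are hypothetical.

```python
import math


def volume_term(dbh_cm):
    """DBH-only term (term 2). ASSUMPTION: coefficients follow the
    Chave et al. (2005) moist-forest model without height."""
    ln_d = math.log(dbh_cm)
    return math.exp(-1.499 + 2.148 * ln_d
                    + 0.207 * ln_d ** 2 - 0.0281 * ln_d ** 3)


def tree_biomass_kg(wood_density, dbh_cm):
    """Aboveground biomass = wood density (term 1) x volume term."""
    return wood_density * volume_term(dbh_cm)


def basal_area_m2(dbh_cm):
    """Cross-sectional stem area at breast height, in m^2."""
    return math.pi * (dbh_cm / 200.0) ** 2


def stand_summary(trees, area_ha, carbon_fraction=0.5):
    """Per-hectare carbon and volume, and basal-area-weighted wood density.

    trees: list of dicts with keys "wd" (g cm^-3) and "dbh" (cm).
    carbon_fraction is an ASSUMED biomass-to-carbon conversion.
    Weighting trees by their own basal area equals weighting species by
    their total basal area when wood density is assigned per species.
    """
    biomass_kg = sum(tree_biomass_kg(t["wd"], t["dbh"]) for t in trees)
    volume = sum(volume_term(t["dbh"]) for t in trees)
    total_ba = sum(basal_area_m2(t["dbh"]) for t in trees)
    weighted_wd = sum(t["wd"] * basal_area_m2(t["dbh"])
                      for t in trees) / total_ba
    return {
        "carbon_Mg_per_ha": carbon_fraction * biomass_kg / 1000.0 / area_ha,
        "volume_per_ha": volume / area_ha,
        "wood_density": weighted_wd,
    }
```

Because all responses are later expressed as percentage changes from the original community, a constant carbon fraction cancels out and does not affect the results.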

Although the allometric equation used here is known to overestimate aboveground biomass53, we make no between-site comparisons of aboveground biomass and thus do not expect our choice of equation to bias our results.

Simulations

For the simulations, data from plots within sites were pooled to create single community data sets per site, ranging in sampled area from 3 to 6 ha (Supplementary Table 4). Shifts in tree community composition and aboveground carbon storage were simulated using a two-step procedure comprising a removal step followed by a recovery step. In the removal step, a certain number of individuals (N_Rem) were removed from tree communities following a set of rules (see below), which resulted in a reduction of total basal area (BA_Loss), as well as incidental reductions in species richness (S_Loss). In the recovery step, BA_Loss was recovered (with a target accuracy of ±1%) by repopulating the community with individuals (N_Rec) selected through a random draw, with replacement, from the remaining pool of individuals21. As the target was to recover lost basal area, and not tree densities, N_Rem and N_Rec could differ in value.
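The two-step procedure can be sketched in Python as below (the authors' own implementation is the R code in their repository). How the ±1% target was enforced is not described, so redrawing any individual that would overshoot the upper bound is an assumption of this sketch. The names `recover`, `remove_and_recover` and the `max_draws` guard are hypothetical.

```python
import math
import random


def basal_area_m2(dbh_cm):
    """Cross-sectional stem area at breast height, in m^2."""
    return math.pi * (dbh_cm / 200.0) ** 2


def recover(remaining, ba_loss, rng, tolerance=0.01, max_draws=100_000):
    """Draw individuals with replacement from `remaining` until the lost
    basal area is recovered to within +/- tolerance.
    ASSUMPTION: a draw that would overshoot the upper bound is rejected
    and redrawn."""
    lower = ba_loss * (1 - tolerance)
    upper = ba_loss * (1 + tolerance)
    recruits, recovered = [], 0.0
    for _ in range(max_draws):
        if recovered >= lower:
            break
        tree = rng.choice(remaining)
        ba = basal_area_m2(tree["dbh"])
        if recovered + ba <= upper:
            recruits.append(tree)
            recovered += ba
    if recovered < lower:
        raise RuntimeError("could not recover basal area within tolerance")
    return recruits


def remove_and_recover(community, remove_ids, rng, tolerance=0.01):
    """Removal step followed by recovery step.

    community: list of dicts with at least a "dbh" key (cm).
    remove_ids: set of list indices to remove (N_Rem = len(remove_ids)).
    Returns the simulated community; N_Rec may differ from N_Rem.
    """
    removed = [t for i, t in enumerate(community) if i in remove_ids]
    remaining = [t for i, t in enumerate(community) if i not in remove_ids]
    ba_loss = sum(basal_area_m2(t["dbh"]) for t in removed)
    return remaining + recover(remaining, ba_loss, rng, tolerance)
```

Passing a seeded `random.Random` instance as `rng` keeps individual iterations reproducible.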

Three sets of scenarios were simulated for each community, namely (1) a defaunation scenario in which declines of large-seeded animal-dispersed species were simulated, and two control scenarios to distinguish defaunation effects from (2) simulation artefacts resulting from removing and replacing individuals, and (3) the effects of species loss per se.

In the defaunation scenario, declines of species dispersed by large animals were simulated by randomly removing individuals belonging to large-seeded animal-dispersed (>75th percentile seed length) tree species and repopulating the community with individuals drawn at random from the remaining pool, to recover original total basal areas. Four levels of defaunation-driven losses were simulated, with N_Rem equalling 25, 50, 75 and 100% of individuals belonging to large-seeded animal-dispersed species, respectively. One thousand iterations were simulated for each of the four levels at each site.
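Selecting the individuals to remove in the defaunation scenario can be sketched as follows. This Python sketch makes two assumptions not stated in the text: the 75th percentile is computed with linear interpolation between order statistics (the `inclusive` method of `statistics.quantiles`), and N_Rem is rounded to the nearest whole individual. Function names and the data layout are hypothetical.

```python
import random
from statistics import quantiles


def large_seeded_species(species_traits):
    """Species above the community's 75th percentile seed length.

    species_traits: dict mapping species name to (group, seed_length_cm),
    where group is "animal" or "abiotic". The cutoff is computed across
    animal-dispersed species only.
    """
    lengths = [length for group, length in species_traits.values()
               if group == "animal"]
    cutoff = quantiles(lengths, n=4, method="inclusive")[2]
    return {sp for sp, (group, length) in species_traits.items()
            if group == "animal" and length > cutoff}


def defaunation_removals(community, large_species, fraction, rng):
    """Indices of individuals to remove at a given defaunation level.

    community: list of dicts with a "species" key.
    fraction: 0.25, 0.5, 0.75 or 1.0 of the individuals belonging to
    large-seeded animal-dispersed species.
    """
    candidates = [i for i, t in enumerate(community)
                  if t["species"] in large_species]
    n_rem = round(fraction * len(candidates))
    return set(rng.sample(candidates, n_rem))
```

The returned index set is what a removal-and-recovery routine would take as input, once per iteration and defaunation level.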

The first control scenario was simulated to distinguish the effects of defaunation on aboveground carbon stocks from numerical effects arising solely from the removal and replacement of individuals in the simulations. One control scenario was simulated for every iteration of the defaunation scenario, with N_Rem values matching those of the corresponding defaunation scenario, but with removal of individuals through a random draw from the overall pool, irrespective of seed dispersal category. Recovery of lost basal areas was simulated through random draws of individuals from the remaining community.

The second control scenario was simulated to control for the effects of species loss per se, which could accompany the removal of individuals in the defaunation scenarios, and is bound to occur in the 100% removal scenario. One control scenario was simulated for every iteration of the defaunation scenario, with S_Loss values matching those of corresponding defaunation scenarios, but with removal of species through a random draw from the overall pool. Recovery of lost basal area was simulated through random draws of individuals from the remaining community.
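The removal rules of the two control scenarios can be sketched in Python as below. The second function reflects one reading of the text, namely that S_Loss species are drawn at random and all of their individuals are removed; this is an assumption, and both function names are hypothetical.

```python
import random


def control_random_individuals(community, n_rem, rng):
    """Control 1: remove N_Rem individuals drawn at random from the whole
    community, irrespective of seed dispersal category."""
    return set(rng.sample(range(len(community)), n_rem))


def control_random_species(community, s_loss, rng):
    """Control 2: draw S_Loss species at random from the whole community
    and remove every individual of those species (ASSUMED reading).

    community: list of dicts with a "species" key.
    """
    species = sorted({t["species"] for t in community})
    lost = set(rng.sample(species, s_loss))
    return {i for i, t in enumerate(community) if t["species"] in lost}
```

In both cases `n_rem` or `s_loss` would be taken from the matching iteration of the defaunation scenario, and the returned indices passed through the same recovery step.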

The R code used to run the simulations has been uploaded to a GitHub repository and is available here: https://github.com/aosuri/defaunation_carbon_project.

Analysis

Before analysis, each response variable was recalculated as the percentage change from the corresponding value for the original community. The distributions of percentage change in carbon stocks, stand volumes, basal area-weighted wood densities and relative abundances of large trees (≥70 cm DBH, following Slik et al.25) in the defaunation and control scenario simulations are graphically represented using boxplots. We considered cases where the entire inter-quantile range of a response in any given scenario did not overlap with zero to indicate consistent effects of species/individual extirpations. The percentage of simulation runs in each scenario that showed declines in aboveground carbon stocks was also recorded. Defaunation effects were estimated as the difference in median percentage change between the defaunation and control scenarios for each response variable. We present these effect sizes without conducting statistical significance tests to estimate P values, as recommended by recent papers on the inappropriateness of significance testing in simulation studies54,55.
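The summary statistics described above reduce to a few lines of arithmetic, sketched here in Python. The function names and the returned keys are hypothetical.

```python
from statistics import median


def percent_change(simulated, original):
    """Percentage change of a response relative to the original community."""
    return 100.0 * (simulated - original) / original


def summarise(defaunation_values, control_values, original):
    """Effect size and share of declining runs for one response variable.

    defaunation_values, control_values: simulated values of the response
    (for example carbon stock) across iterations of each scenario.
    """
    d = [percent_change(v, original) for v in defaunation_values]
    c = [percent_change(v, original) for v in control_values]
    return {
        # difference in median percentage change between scenarios
        "defaunation_effect": median(d) - median(c),
        # percentage of defaunation runs showing a decline
        "pct_runs_declining": 100.0 * sum(x < 0 for x in d) / len(d),
    }
```

For instance, defaunation runs of 90, 95, 100 and 80 against an original value of 100, with control runs of 99, 101, 100 and 100, give a median change of −7.5% versus 0%, so the defaunation effect is −7.5 percentage points with 75% of runs declining.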

The R statistical environment version 3.0.2 (ref. 56) was used for running simulations, analyses and preparation of figures.