Study site and design

Data were collected at the Stability of Altered Forest Ecosystems Project in Sabah, Malaysia15. Logged forests had been through one round of selective logging (removing 113 m³ ha⁻¹) in the 1970s and a second round of salvage logging16 (removing 66 m³ ha⁻¹) that occurred between 2000 and 2008. Data collection took place between 2010 and 2012, ∼5–10 years after logging ended.

All data collection had a nested structure with sample sites clustered at up to four spatial scales based on a fractal sampling pattern15,33 (Supplementary Fig. 1). At the finest scale, sites were separated by 10^1.75 m (first order), with those clusters of sites separated by 10^2.25 m (second order) and again at 10^2.75 m (third order). All third-order sites were nested within blocks separated by >1 km and the average area of a block was 71 ha. There are 17 sampling blocks at the Stability of Altered Forest Ecosystems Project, which vary in the level of historical disturbance15. For these analyses, we used data collected from eight blocks: two blocks located in unmodified primary forest (blocks OG1 and OG2) and six blocks located in salvage logged forest (blocks A, B, C, D, E and F). We used larger sample sizes in the logged forest because data on leaf area index, an index of forest structure, demonstrated the habitat there was more variable than in the primary forest, with higher variance (0.51 versus 0.39) and a larger range (0.83–5.56 versus 3.12–5.48) of values. Not all datasets were collected at all points, so sample sizes differ among the datasets used in analyses.
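The separation distances of the fractal design are powers of ten with exponents 1.75, 2.25 and 2.75. A minimal sketch of the conversion to metres follows; the dictionary and variable names are illustrative and not part of the original design documentation.

```python
# Separation distances of the fractal sampling design, given as powers of ten.
# The order labels follow the text; the variable names are hypothetical.
exponents = {"first order": 1.75, "second order": 2.25, "third order": 2.75}

# Convert each exponent to a distance in metres.
spacing_m = {order: 10 ** exponent for order, exponent in exponents.items()}

for order, metres in spacing_m.items():
    print(f"{order}: {metres:.0f} m")
```

Each step up the hierarchy multiplies the spacing by 10^0.5 (about 3.16), so the three scales are evenly spaced on a logarithmic axis.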

Ecosystem process experiments

Litter decomposition: naturally senesced whole leaves of Macaranga sp. were collected within 24 h of falling. We used a morphospecies (M. cf. pearsonii) that is a common early successional tree species, is highly abundant within logged forest and is also common along riverine margins and in treefall gaps within primary forest. This standardization of the litter does, however, mean our experiment was unable to control for any potential biases in invertebrate preference behaviour between the two habitats. Leaves were cut into roughly 2 cm² pieces, discarding large veins and petioles, and dried to constant weight. Litter pieces were separated into 4-g units and placed in nylon bags with a 1-mm mesh. To allow invertebrates access to the litter, five 1-cm² perforations were made on each side of the bag. Fungi were excluded by treating the filled litter bags with broad-spectrum fungicide containing 40% chlorothalonil34. Data were collected at second-order sites with nine primary forest (block OG2) and 16 logged forest (block E) sites. At each site we placed nine litter bags: three each for the control, invertebrate exclosure and fungal exclosure treatments. One bag from each treatment was collected after 14, 27 and 40 days (±2 days), respectively, and on each day, five or six additional bags were carried to and from the field to calculate the amount of litter mass lost through handling. Although a short period, this was long enough to detect invertebrate impacts on decomposition that are consistent with results reported elsewhere in the tropics34,35,36. Collected bags were dried to constant weight. Decomposition rate for each site × treatment combination was quantified as the slope of a linear regression modelling litter mass as a function of the log_e-transformed number of days in the field, using handling loss as an offset in the model.
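The decomposition-rate calculation can be sketched as follows for a single site × treatment combination. This is an illustration under stated assumptions, not the authors' code. It treats the offset as a term with its coefficient fixed at 1, so the handling loss measured on each collection day is added back to the recorded mass before fitting. The function names and the example masses are hypothetical.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def decomposition_rate(days, mass_g, handling_loss_g):
    """Slope of litter mass against ln(days), with handling loss as an offset.

    Assumption: the offset enters with a coefficient of 1, which is equivalent
    to adding the handling loss back onto the observed mass before fitting.
    """
    log_days = [math.log(d) for d in days]
    adjusted_mass = [m + h for m, h in zip(mass_g, handling_loss_g)]
    return ols_slope(log_days, adjusted_mass)


# Hypothetical bag masses (g) at the three collection days named in the text.
rate = decomposition_rate(
    days=[14, 27, 40],
    mass_g=[3.4, 3.1, 2.9],
    handling_loss_g=[0.05, 0.04, 0.06],
)
print(rate)  # negative values indicate mass loss over time
```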

Seed disturbance: to compare seed disturbance rates between vertebrates and invertebrates, we required an intermediate seed size and type that was attractive to both taxa. We conducted pilot trials that indicated large seeds such as peanuts were not disturbed by invertebrates, and small seeds such as sesame were ignored by vertebrates. We found that pumpkin seeds were disturbed by both vertebrates and invertebrates, and thus were chosen for the experiments.

Data were collected at first-order sites with 48 primary forest (block OG2) and 146 logged forest (blocks D and F) sites. At each site we placed 20 pumpkin seeds on each of three standardized brown plastic ‘leaves’ (80 × 120 mm) left sitting on the forest floor. One of the three leaves had ground-moving invertebrates excluded by surrounding the leaf with a wooden frame coated with insect-trapping glue, a second had vertebrate predators excluded by placing the artificial leaf inside a 30 × 30 × 30 cm wire cage with a 1-cm mesh, and the third was left as a control. We defined the seed disturbance rate as the proportion of seeds that were either removed from the leaf, or partially eaten but left in place on the leaf, over a 24-h period.
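The seed disturbance rate defined above is a simple proportion, sketched here for one artificial leaf. The function and argument names are illustrative; the default of 20 seeds per leaf comes from the text.

```python
def seed_disturbance_rate(removed, partially_eaten, seeds_placed=20):
    """Proportion of seeds removed from the leaf or partially eaten in place."""
    disturbed = removed + partially_eaten
    if not 0 <= disturbed <= seeds_placed:
        raise ValueError("disturbed seeds must lie between 0 and seeds_placed")
    return disturbed / seeds_placed


# Hypothetical leaf: 3 seeds removed and 2 partially eaten after 24 h.
print(seed_disturbance_rate(removed=3, partially_eaten=2))
```

Keeping the count of disturbed seeds alongside the number placed (rather than only the proportion) is what allows the response to be modelled with a binomial error distribution.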

Invertebrate predation: data were collected at first-order sites with 21 primary forest (block OG2) and 30 logged forest (block E) sites. At each site we attached one live larval mealworm (Tenebrio sp.) to each of three standardized green plastic ‘leaves’ (80 × 120 mm) using fine cotton thread and clear tape. Enough cotton thread was provided to allow the mealworm to crawl around the leaf, and artificial leaves were attached to saplings 1.5 m above ground. One of the three leaves had ground-moving invertebrate predators excluded by applying insect-trapping glue to the sapling stem 20 cm above and below the artificial leaf, a second had vertebrate predators excluded by placing the artificial leaf inside a 30 × 30 × 30 cm wire cage with a 1-cm mesh, and the third was left as a control. Predation of mealworms was recorded over a 24-h period.

Functional composition of the rainforest fauna

Invertebrate collections: leaf litter beetles were collected at 54 first-order sites in primary forest (blocks OG1 and OG2) and 144 first-order sites in logged forest (blocks A, C and E) over 3 days during the wet season. We used modified flight intercept traps dug into the ground to simultaneously act as a pitfall trap (23 cm diameter × 60 cm high). Traps were part-filled with 70% ethanol to act as a killing agent. Traps also collected frogs as by-catch, which allowed us to analyse the presence or absence of frogs in relation to habitat. The total invertebrate biomass of samples was estimated by placing all invertebrates with body length >5 mm on blotting paper and weighing the blotted invertebrates. We excluded small invertebrates despite their high abundance because their small body size means they are likely to contribute less to total community biomass37,38, and also to community-level energy fluxes37,39, than the less abundant, but larger, organisms. We calibrated sample-level wet weight biomass of invertebrates with sample-level dry weight biomass using 42 samples that were first wet weighed and then oven dried to constant weight. Wet weight was strongly and linearly correlated with dry weight (linear regression through the origin, F1,41 = 1,578, P < 0.001, R² = 0.97, slope = 0.217 ± 0.005 s.e.). Canopy invertebrates were collected by fogging at 12 second-order sites in primary forest (block OG2) and 95 second-order sites in logged forest (blocks A, B, C, D, E and F). At each site, four trays of 1-m diameter were laid out with collecting pots filled with 95% ethanol attached. The fog formulation was a synthetic pyrethrum insecticide (active compound: alphacypermethrin with synergist 2.27%) diluted in diesel at a ratio of 15:1. Fogging started at 07:00 and lasted for 4 min at each site. Arthropods were collected after a 2-h period and identified to order.
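The wet-to-dry weight calibration is a regression through the origin, for which the least-squares slope is Σxy / Σx². A minimal sketch follows. It assumes that the reported slope of 0.217 converts wet weight to dry weight (dry = 0.217 × wet); the text does not state the direction explicitly. The function names are illustrative.

```python
def slope_through_origin(x, y):
    """Least-squares slope for the model y = b * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Slope reported in the text; assumed here to map wet weight onto dry weight.
DRY_PER_WET = 0.217


def wet_to_dry(wet_weight_g, slope=DRY_PER_WET):
    """Estimate dry biomass from blotted wet biomass using the calibration slope."""
    return slope * wet_weight_g


# Hypothetical sample with 10 g of blotted invertebrates.
print(wet_to_dry(10.0))
```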

Termites: we hand-collected termites from 16 soil pits (12 cm diameter × 10 cm deep) at each of nine second-order primary sites (block OG2) and 32 second-order logged forest sites (blocks C and F), and identified them to genus40. Earthworms were collected from four soil monoliths per site (three monoliths of 50 × 50 cm wide × 30 cm deep and a fourth, smaller monolith of 25 × 25 cm wide × 30 cm deep) at each of nine second-order primary sites (block OG2) and 18 second-order logged forest sites (blocks B and F). Foraging ant abundance and species richness were quantified at 18 first-order primary forest and 192 first-order logged forest sites by counting and identifying the ants entering a 12 × 14 cm plastic card, laid flat in the leaf litter and baited with 30 compressed dried earthworm pellets, over a 40-min period. All observations were conducted between 10:00 and 15:00.

Invertebrate functional traits: beetles were identified to family and families were classified according to whether they contained predominantly predatory species or not41. All foraging ants were identified to genus, split into morphospecies and assigned species names where possible. Termites were also identified to genus and grouped according to feeding position along a four-step humification gradient, with the second group representing those that feed on grass, dead wood and leaf litter42. We measured Weber’s length, a common proxy for body size in ants, for each of the 192 species of foraging ant, averaging measurements from between one and five minor workers per species. Ant genera were also classified according to whether they belonged to the specialist predator functional group or not43. We used the abundance-weighted mean body size (log10-transformed) of all ants, and the total abundance of specialist predators, visiting each site as the response variable in analyses. Arthropods from canopy fogging samples belonging to the Orthoptera, Phasmida, Homoptera and Heteroptera were classified as herbivores.
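An abundance-weighted mean body size can be computed per site as sketched below. The text does not say whether the log10 transformation was applied before or after averaging. This sketch assumes it was applied to each species' Weber's length first, and the function name and example values are hypothetical.

```python
import math


def abundance_weighted_mean_log10(sizes_mm, abundances):
    """Mean of log10(body size), weighting each species by its abundance.

    Assumption: sizes are log10-transformed before averaging.
    """
    total = sum(abundances)
    if total == 0:
        raise ValueError("no ants recorded at this site")
    weighted = sum(a * math.log10(s) for s, a in zip(sizes_mm, abundances))
    return weighted / total


# Hypothetical site: three ant species with Weber's lengths in mm and
# the number of individuals of each that entered the bait card.
print(abundance_weighted_mean_log10([0.6, 1.2, 2.5], [40, 10, 2]))
```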

Small mammals: we trapped small mammals within 1.75 ha grids overlaying the nested sampling design used in other data collection. Each grid consisted of a 4 × 12 rectangular arrangement of points separated by 23 m. Two locally made small mammal traps (280 × 140 × 140 mm), baited with oil palm fruit, were placed at or near ground level (≤1,500 mm) within 10 m of each grid point, making 96 traps in total per grid. For this analysis, we included captures only from those traps that lay within 30 m of a first-order sampling point, including 265 traps in primary forest (at 27 first-order primary sites in blocks OG1 and OG2) and 983 traps in logged forest (at 103 first-order sites in blocks D, E and F). Trapping sessions ran for seven consecutive days and we used the capture rate of all species combined (captures per seven days) in the analyses.
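The trapping-grid dimensions can be checked with a short calculation. It takes the grid area to be the rectangle bounded by the outermost points; this is an assumption, although it reproduces the stated 1.75 ha. Variable names are illustrative.

```python
# Grid layout from the text: 4 x 12 points, 23 m apart, two traps per point.
rows, cols = 4, 12
spacing_m = 23
traps_per_point = 2

# Side lengths of the rectangle spanned by the outermost grid points.
width_m = (rows - 1) * spacing_m
length_m = (cols - 1) * spacing_m

area_ha = width_m * length_m / 10_000  # 1 ha = 10,000 m^2
traps_per_grid = rows * cols * traps_per_point

print(f"{width_m} m x {length_m} m = {area_ha:.2f} ha, {traps_per_grid} traps")
```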

Bats: four-bank harp traps were set across trails and logging skids at nine second-order sites in primary forest (block OG2) and 81 second-order sites in logged forest (blocks A, B, C, D, E and F) to target insectivorous bats foraging in the forest understory22. Up to seven traps were set each night, 50–150 m apart, and moved to a new position the following day. All bats were marked and released at the capture point.

Birds: we sampled birds using 15-minute point counts of 50 m radius at 18 second-order primary forest sites (blocks OG1 and OG2) and 96 second-order logged forest sites (blocks A, B, C, D, E and F). Birds were identified by a single experienced observer (David P. Edwards) and any unknown vocalizations were recorded using a Sennheiser ME-66 directional microphone and Edirol R09-HR digital recorder, and subsequently identified against reference collections available from http://www.xeno-canto.org/. Swifts (Apodidae) and swallows (Hirundinidae) were not recorded because they are difficult to observe in closed-canopy forest. We recorded the combined abundance of all bird species classified as belonging to insectivore (including both obligate and generalist species) and granivore guilds44.

Environmental variables

Leaf area index was calculated from 13 hemispherical photographs per plot at 18 second-order primary forest sites (blocks OG1 and OG2) and 80 second-order logged forest sites (blocks A, B, C, D, E and F), and processed following the methods of Pfeifer et al.45. The presence of flowers or fruits in 25 × 25 m vegetation plots was recorded at 18 second-order primary forest sites (blocks OG1 and OG2) and 96 second-order logged forest sites (blocks A, B, C, D, E and F). Within each vegetation plot we counted the number of clumps of litter-trapping fungi belonging to the genus Marasmius, which forms abundant and easily recognizable networks of brown rhizomorphs that trap leaf litter above the ground21. We placed iButton DS1923-F5 dataloggers (Dallas Semiconductor) to record air temperature and relative humidity 1 m above ground every 3 h, from which we determined the average maximum daily air temperature and minimum daily humidity in the dry season (March and April)46, when the forest experiences the most extreme microclimatic conditions, at 34 first-order primary forest sites (blocks OG1 and OG2) and 127 first-order logged forest sites (blocks A, B, C, D, E and F).
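The microclimate summaries (average maximum daily temperature, average minimum daily humidity) amount to taking a daily extreme from the 3-hourly logger readings and then averaging across days. A minimal sketch follows, with a hypothetical function name and made-up readings; it is not the processing pipeline used in the study.

```python
from collections import defaultdict
from datetime import datetime


def mean_daily_extreme(readings, extreme=max):
    """Average of the daily extreme (max or min) over all days in the record.

    readings: iterable of (timestamp, value) pairs from one datalogger.
    """
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp.date()].append(value)
    daily = [extreme(values) for values in by_day.values()]
    return sum(daily) / len(daily)


# Hypothetical temperature record (degrees C) covering two dry-season days.
temperature = [
    (datetime(2011, 3, 1, 6), 24.0),
    (datetime(2011, 3, 1, 15), 30.0),
    (datetime(2011, 3, 2, 6), 25.0),
    (datetime(2011, 3, 2, 15), 32.0),
]
print(mean_daily_extreme(temperature, extreme=max))
```

Passing `extreme=min` gives the corresponding summary for minimum daily humidity.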

Statistical analyses

We fitted generalized linear mixed effect models to the data with random effects reflecting the nested structure of data collection (multiple observations within first-order sites within second-order sites within third-order sites within blocks). Error distributions for the models were selected according to the nature of the response variables and are recorded in Supplementary Tables 1–6. Count data, which were used for all estimates of animal abundance and species richness, were modelled with a Poisson error distribution. Presence–absence and occurrence data, along with binary response variables such as invertebrate predation and seed disturbance, were modelled with a binomial error distribution. All other variables were modelled using Gaussian errors, with response variables log10-transformed where this improved normality. All models were fitted using the lme4 package47 of the R statistical computing environment48. We used likelihood ratio tests to determine parameter significance by comparing models with habitat (primary versus logged forest) to a null model with no predictor. The proportion of variance explained by fixed effects was calculated and is reported in Supplementary Tables 1–6.
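The arithmetic behind the likelihood ratio test can be sketched as follows. The models themselves were fitted in R with lme4; this Python fragment covers only the final comparison and assumes the two log-likelihoods have already been extracted from the fitted models. Because habitat is a two-level factor, the habitat model has one more parameter than the null model, so the statistic is referred to a chi-square distribution with 1 degree of freedom. The function name and example log-likelihoods are hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_null, loglik_habitat):
    """Likelihood ratio statistic and P value for one extra parameter.

    For 1 degree of freedom the chi-square upper tail has the closed form
    P(X > x) = erfc(sqrt(x / 2)), so no statistics library is needed.
    """
    statistic = max(2.0 * (loglik_habitat - loglik_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods for the null and habitat models.
statistic, p_value = likelihood_ratio_test_1df(-412.6, -408.1)
print(f"chi-square = {statistic:.2f}, P = {p_value:.4f}")
```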

Post hoc significance tests using the glht function in the multcomp package49 were used to compare treatment effects and their interactions. We determined the proportion of variance explained by fixed effects and by the model as a whole, including variance explained by the random effects50. Residuals of all models were tested for spatial autocorrelation to estimate the spatial dependence of model residuals as a continuous function of distance51. In no case did we detect significant spatial autocorrelation (Supplementary Tables 1–6), so this is not discussed further. Full results of the partitioning of variance explained and tests of spatial autocorrelation are presented in Supplementary Tables 1–6.
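One common way to partition variance explained in a mixed model is to express the fixed-effect variance, and the fixed plus random-effect variance, as fractions of the total. The sketch below assumes that general approach and takes the variance components as already estimated. The text does not give the exact formula used, so this should be read as an illustration rather than the authors' calculation; the function name and example values are hypothetical.

```python
def variance_explained(var_fixed, var_random, var_residual):
    """Proportion of variance explained by fixed effects and by the whole model.

    var_fixed:    variance of the fixed-effect predictions
    var_random:   list of variance components, one per random-effect level
    var_residual: residual (or distribution-specific) variance
    """
    random_total = sum(var_random)
    total = var_fixed + random_total + var_residual
    fixed_only = var_fixed / total
    whole_model = (var_fixed + random_total) / total
    return fixed_only, whole_model


# Hypothetical components for a model with two nested random-effect levels.
fixed_only, whole_model = variance_explained(2.0, [1.0, 1.0], 4.0)
print(fixed_only, whole_model)
```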