Distribution data

Trends were estimated from occurrence records of hoverflies and bees extracted from the Hoverfly Recording Scheme (http://www.hoverfly.org.uk/) and the Bees, Wasps and Ants Recording Society (BWARS: http://www.bwars.com/). Combined, the dataset used in this study consisted of 715,392 (Hoverfly = 417,856, Bee = 297,536) records, defined as a unique combination of 1 km grid cell, date, and species. By excluding records pre-1980 and post-2013, we focussed on a core period of recording activity for both taxonomic groups. We excluded grid cells with < 2 years of data, removing the most poorly sampled regions. These observations constitute presence-only data, so we inferred non-detections from records of other species within the taxonomic group on the same grid cell and date (henceforth visit)17,18. The analysis was based on 12,849 and 12,076 unique 1 km grid cells for hoverflies and bees, respectively. The 1 km grid was chosen to reflect the scale at which hoverfly and bee populations use the landscape. Species with taxonomic issues during the time frame of the study and species not considered to be pollinators (following expert guidance from BWARS) were excluded from the analysis. In addition, we follow the species exclusion criteria of ref. 31, dropping species with fewer than 50 records. The final dataset was based on 139 bee and 214 hoverfly species (covering ~75% of the British bee and hoverfly fauna).

Statistical analysis

Much of these data were collected by volunteer recorders without specific sampling design. Therefore, the data contain a variety of forms of bias that inhibit the ability to extract robust trends from them. For example, the occurrence data suffered from temporal bias, with greater numbers of records in recent years. A host of techniques have been proposed to account for such bias while estimating trends, with recent studies suggesting hierarchical occupancy, models fitted within a Bayesian framework perform particularly well17,18. In this study, we used a Bayesian occupancy modelling approach based on the models of refs. 17 and 31, to estimate occupancy (the proportion of occupied 1 km grid cells) each year between 1980 and 2013 for each species. By using two hierarchically coupled sub-models (1 and 2, below), the occupancy model simultaneously estimates and accounts for variation in detectability, while estimating species presence for a given site, year combination.

$${\mathrm{State}} {\mathrm{model}}: z_{{\mathrm{it}}} {\mathrm{\sim}} 2 {\mathrm{Bernoulli}}\left( {{\mathrm{\psi }}_{it}} \right); {\mathrm{logit}}\left( {{\mathrm{\psi }}_{it}} \right){\mathrm{ = b}}_t + {\mathrm{u}}_i$$ (1)

$${{\mathrm{Observation}}\;{\mathrm{model}}: y_{itv}|z_{it}\sim {\mathrm{Bernoulli}}\left( {z_{it} \ast p_{itv}} \right); {\mathrm{logit}} \left( {p_{itv}} \right) = a_t + \delta _1.{\mathrm{DT2}}_{itv} + \delta _2.{\mathrm{DT3}}_{itv}}$$ (2)

where, z it and ψ it are the true (unknown) occupancy and probability of occupancy of site i in year t, respectively. b t and u i are categorical fixed and random effects for year and site (1 km grid cell), respectively. Y itv represents the observed data, this is a 1 or 0 based on whether the species was detected or not at site i, in year t, on visit v. p itv is the probability of detection at site i, in year t on visit v, and is conditional upon z it = 1. Probability of detection was modelled as a function of a t a random year level effect (accounting for variation in detectability over time), and δ 1 and δ 2 the effects of list categories 2 and 3, relative to category 1. For most species, we expect detectability to be lower on shorter lists, we therefore included list category (δ) as a covariate in the detection model to account for variation in recorder effort. Visits were grouped into one of three categories based on the number of species recorded as follows: (1) single species lists, (2) short-day lists, 2 or 3 species recorded (DT2), and (3) comprehensive day lists, visits with >3 species recorded (DT3)18. Visits were defined separately for each taxonomic group; e.g., for any given bee occupancy model, the list length data was based solely on bee records.

Predicted presences (z it ) were combined to calculate the annual proportion of occupied sites (occupancy). For clarity, an occupancy value of 1 indicates the species occupied every 1 km grid cell included in the study (12,849 and 12,076 cells for hoverflies and bees, respectively). We used the random walk half-cauchy prior formulation of ref. 31, which enabled the sharing of information between the current and previous year in the state model, essentially adding a smoother for the annual occupancy estimates. We used uninformative priors for the remaining parameters within the model. For further detail of the occupancy model, see ref. 31. Occupancy models were fitted using R2jags32, with 3 chains, 20,000 iterations, and a burn in of 10,000 and a thinning rate of 3. This was sufficient to achieve convergence (Rhat < 1.1) for the vast majority of occupancy estimates across species and years: we retained the small minority of combinations for which Rhat > 1.1, as they are unlikely to exert directional bias on our high-level summary statistics.

As with all modelling approaches, the approach we used has several key assumptions. First, the model assumes no false presences, which we feel was a reasonable assumption given the data were validated by recording scheme organizers along with several automated checks. A second key assumption is that the detection sub-model reflects a true representation of observation process. There may be examples where this assumption is not met. For example, intense targeted surveys for certain species may not be fully accounted for in the detection model, leading to unreliable occupancy estimates for the species in question. Furthermore, strong temporal bias in recording intensity can lead to increased uncertainty in the occupancy estimates in earlier years. Bearing these issues and assumptions in mind, we chose hierarchical occupancy models, as they have been shown to perform well at dealing with such forms of bias17, and although the detection model may not be perfect for all species, it is likely to be better than a model that ignores variation in detectability. It is worth noting that alongside these trends, the recent development of a standardized pollinator monitoring scheme28 will increase the understanding of future changes in pollinator abundance and potential consequences for pollination services. Finally, as with the majority of unstructured UK biological records datasets, there was a southern bias to the data in the study; thus, the trends predominantly reflect changes within this region. However, bees and hoverflies are two of the more well-recorded taxonomic groups in the UK, with an active recorder base and scheme organizers who aim to improve the spatial coverage of data. Given this, and the inclusion of a large number of records from northern Britain, we feel the trends in this study are representative of national-scale trends.

Our full set of model outputs consists of 10,000 samples from the posterior distribution of occupancy for 353 species in each of 34 years (>108 samples in total). To reduce the computational load of subsequent calculations, we restricted our analysis to a random set of 1000 samples from the posterior of each species:year combination. All trends and other summary statistics were calculated from this set, from which we report median and 95% CIs. Species occupancy time series were clipped, with annual occupancy estimates before the first record and after the final record, dropped from the study. Individual species trends were estimated as the annual growth rate (percent change per year) between the first and final year of the clipped series.

We calculated multispecies composite trends to provide an indicator of the overall trend trajectory for different ecologically significant groups of pollinators (as seen in Fig. 2 and Supplementary Figures 3, 4, and 5). Occupancy estimates were logged and fed into a linear model with year and species treated as categorical explanatory variables. Sum contrasts were used to ensure the composite trend reflects the average species response. The parameter estimates for the year effects were converted back to the occupancy scale and used as our composite trend metric, effectively a geometric mean occupancy estimate each year across species.

In addition to calculating geometric mean occupancy, we examined temporal patterns in the balance between rare and common species, defined in terms of low and high occupancy, and measured using Simpson’s evenness (the − log e D j formulation26). Decreases in evenness are indicative of diversity loss and can be considered a signal of biotic homogenization, i.e., communities becoming dominated by a small number of widespread species. Again, using 1000 sampled values from the posterior distribution allowed full propagation of uncertainty. We extracted the first derivatives (i.e., the difference between adjacent years) of geometric mean occupancy and evenness to highlight notable years of change.

Trait and assemblage classification

We examined change across five grouping variables aimed at improving our insight into the key drivers of change and potential implications for pollination services. First, we divided species into their broad taxonomic group (splitting bees and hoverflies). This reflects fundamental differences in breeding ecology, with bees being fixed-place foragers that must provision a nest. Next, with a particular focus on implications for pollination services, we examined composite trends for bee species known to be dominant crop pollinators33 compared with those of other wild bee species. We used CLUSTASPEC34 to split species into four categories based on their distribution patterns at the 10 km2 grid square scale, resulting in the following four categories, upland species, southern species, widespread southern species, and widespread species (predominantly hoverflies). To aid visualization of these species clusters, richness maps (using 10 km records between 1980 and 2013) of the clusters are shown in Supplementary Figure 6. Finally, evidence from previous studies suggests that sociality can affect species’ sensitivity to environmental change through links to reproductive and foraging capacity35. Eusocial species are functionally distinct from other bee species24 and many are economically important pollinators (7 species were included in the 22 species of dominant crop pollinator). As a group, they have been actively targeted by conservation measures, including the planting of legumes in flower-rich field margins as part of agri-environment schemes. However, the increased foraging capacity of social species may lead to increased pesticide exposure6 compared with solitary species. We therefore compare composite metrics separately for eusocial and solitary species. A detailed breakdown of which species were in each category can be found in Supplementary Data 1.

Code availability

The code used to produce the results and figures in this paper is available from the corresponding author upon request. The occupancy models in this study were run using R2jags32 via the occDetFunc function, which is freely available as part of the R package Sparta36.