Wild bee distributional data and foraging preferences

There are ∼250 species of native bees (Order Hymenoptera) known to occur in England, comprising solitary bees (for example, some Apidae, Andrenidae, Megachilidae and Halictidae), 24 species of bumblebee (Apidae: Bombus spp.) and the domesticated honeybee (A. mellifera). Approximately 20% of these species are known to act as pollinators of oilseed rape and so occur within arable farmland37,39,40. To assess distributional changes in these wild bee species we analysed long-term occurrence records from 1994–2011. These records were collected and verified by the Bees, Ants and Wasps Recording Society (BWARS: http://www.bwars.com/) and represent a largely volunteer-collected, national-scale distributional database that is globally unique in coverage and detail. We used only bee distributional records from England to match available data on insecticide use and oilseed rape cropping patterns. We focused on the period 1994–2011 to quantify trends both before and after the date of first use of neonicotinoids on oilseed rape in England in 200241. We excluded honeybees from the current study as their hives are both artificially managed and moved around landscapes and so are not comparable with wild species42.

As citizen science data can be collected via wide participation projects using non-experts it has a reputation for being variable in quality. However, the UK recording schemes for invertebrates are typically more refined. Individual records are normally collected by local experts/entomologists rather than the general public (that is, who have no taxonomic experience). Under the auspices of the Bees, Wasps and Ants Recording Society, identifications are verified through photographic evidence and/or physical specimens in questionable cases. Records are also compiled centrally and subject to computer checks to identify potential outliers, such as those outside the previously known range, from atypical habitats or outside typical flight periods. In terms of the taxonomic rigour of individual records this data set is of high quality. As the data on bee distributions were collected by volunteers, not all areas are sampled with the same effort. As such while the data set is taxonomically robust there is no structured framework for how records are collected in terms of sites sampled or methods used43. It is for this reason that these data sets contain information only on occupancy of grid squares and not abundance. However, variation in recorder activity is a potentially significant issue in the analysis of these data sets. We use methods recommended by Isaac et al.21 to account for the effect of variation in recorder activity on trend estimation which are described in detail in the statistics section below. The data sets used here have been used as a basis for the identification of declines in pollinator species richness in the UK4,44.

Wild bee distributional foraging preferences for oilseed rape

We classified wild bee species according to whether or not they have been observed foraging on oilseed rape in England. This was based upon published surveys from 30 English farms comprising 114 h of direct observations7,8 to produce a list of 50 bee species (including honeybees) recorded as foraging on oilseed rape (Supplementary Table 1). This list was used to classify species of bee into two categories: oilseed rape foragers and non-foragers. Due to differences in methods used to collect the data from which this list was compiled it is solely qualitative, and makes no assessment of the relative use of the crop by different species. However, the list is derived from surveys undertaken in areas of high diversity of wild bees in the UK, in particular those associated with Salisbury Plain (the largest area of pristine chalk grassland in Europe). To our knowledge this is the most comprehensive UK list of bees that forage on oilseed rape, and comprises c. 20% of UK bee species. All of the 20 wild bee species identified as pollinators of oilseed rape by the Kleijn et al.36 study of world crop pollinators were represented in this list, with an additional 29 wild bee species.

Criteria for species inclusions in analysis

We converted the occurrence records into a data frame suitable for analysis by first selecting all 1 km2 grid cells in England with surveys in at least two-years during the period 1994–2011. We then identified all unique combinations of date and 1 km2 grid cell, which we henceforth refer to as a survey. Surveys with coarser spatio-temporal resolution were excluded. Our final data set contained 31,818 surveys from 4,056 1 km2 grid cells in England. Half the surveys were of just a single bee species, but the maximum number of species per survey was 45 species out of the total bee fauna of c. 250 (Fig. 1). We selected the 62 species that were recorded on at least 500 surveys from our final data set, representing 28 species of non-foraging and 34 species of oilseed rape foraging bees (Supplementary Table 2). This 500 survey threshold served two purposes. First, it excluded data poor species that could affect model convergence. Second, by including only relatively well-represented species we increased the reliability of our classification of bees as either oilseed rape foragers or non-foragers. Specifically species that may potentially have fed on oilseed rape, but due to their rarity would have been unlikely to be observed doing so, were excluded from the analysis using this threshold. In the analysis we treat B. terrestris and B. lucorum as an aggregate: workers of these species are extremely difficult to distinguishing from one another. Treating them as an aggregate avoids the possibility that our model could be biased by misidentifications, while minimizing the amount of discarded data (these are two of the commonest bees in the database).

OSR and neonicotinoid exposure rates

Oilseed rape represents an important forage resource for many wild bees, so we hypothesized that the cover of this crop had a positive effect on population persistence27,31. This contrasted with the potentially negative impacts of exposure to neonicotinoids expressed in the pollen and nectar of pesticide-treated crop8,9,10,12. To account for this, our analysis defined two separate variables that describe both the area of oilseed rape grown and neonicotinoid exposure resulting from the treatment of that crop with neonicotinoid seed treatments. The area of sown oilseed rape was derived from the Department for Environment, Food and Rural Affairs June Survey of Agriculture and Horticulture19. This quantified OSR in each 5 × 5 km grid square of England from 1994 to 2010. These data were collected every two-years, so we interpolated the data values for alternate years (for example, 2005 values were set as the mean of 2004 and 2006 values).

To define the temporal and spatial changes in the exposure of wild bees to neonicotinoids we defined the extent of neonicotinoid seed treatment use as recorded by the UK Pesticide Usage Survey45. Neonicotinoids were widely used in oilseed rape from 2002 following the first full commercial UK licensing of this insecticide for this crop (first, Imidacloprid in 2002, followed by Clothianidin and Thiamethoxam41). Note, our data set includes a small number of grid cells where regulatory trials were conducted before this (1999–2001). The use of neonicotinoid seed treatments rose rapidly from 37.4% (s.e.±8.0) in 2002, to 83.0% (s.e.±5.2) of the crop treated by 2011. Data on the use of neonicotinoids was collected as part of UK commitments to the EU Statistics Regulation (1185/2009/EC) by the Food and Environment Research Agency. These data are collected every two-years (and so concurrent with the crop cover data) but are derived at a considerably coarser scale of eight Department for Environment, Food and Rural Affairs regions in England (East Midlands, Eastern, London and South East, North East, North West, South West, West Midlands and Yorkshire and the Humber). The Pesticide Usage Survey is based on information provided from 1,200 surveyed farms, stratified by region and size. Surveys incorporate inbuilt anomaly checks, including verification that application rates on individual sites lie within manufacturer recommendations. For each of the eight regions standard errors for the extrapolated rates of application are derived using non-parametric bootstrapping techniques. Following regulatory requirement these standard errors must fall below 5% (ref. 46). As such this data are considered highly reliable both within and between years. Neonicotinoid exposure in each 25 km2 grid square (5 × 5 km) for each year was estimated as the product of the area of oilseed rape and the proportion of that crop treated with neonicotinoids in the region within which the grid square was located. We provide a discussion of issues relating to potential collinearity problems between OSR and the proportion of the crop treated with neonicotinoids in Supplementary Note 1, and show that they do not affect the general reliability of our conclusions.

FII index

The extent of foliar insecticide use (that is, that applied as a spray as opposed to a systemically expressed seed treatment) was defined by an aggregate index describing both the application rates of foliar insecticides on oilseed rape, as well as information on their relative toxicity for bees. This FII index produces a composite estimate of the impact of foliar sprayed non-systemic insecticides (of which the most frequently used were pyrethroids) and was based on the bee component of the Environmental Impact Quotient (EIQ)47. The FII was defined as:

where: Z ai is a measure of the toxicity of the active ingredient (ai) to bees. The factor 3 in equation 1 represents a weighting for comparing the relative exposure of bees to other taxa and is an integral component of the full EIQ calculation (bees and birds are given a weighting of 3, beneficial arthropods are given a weighting of 5). While redundant in the current equation it has been retained to provide consistency with the original EIQ assessment47. To calculate this measure of toxicity for bees, each foliar insecticide was classified by its lethal dose score (LD 50 ) into high, medium or low toxicity compounds. Following established protocols47, high toxicity compounds (LD 50 <1 ug per bee) were given a coefficient (Z ai ) of 5, medium toxicity compounds (100 ug per bee>LD 50 >1 ug per bee) a coefficient of 3 and low toxicity compounds (LD 50 >100 ug per bee) a coefficient of 1. This 5:3:1 ratio is a developed as part of the EIQ and has been widely applied in a variety of assessments of the impacts of pesticides, including studies on wild bees48,49,50. The variable P ai is the plant surface half-life for active ingredient ai, which is estimated by dividing the soil deterioration half-life of the insecticide (DT 50 ) by four51. Lethal dose toxicity (LD 50 ) and soil degradation (DT 50 ) data for insecticide active ingredients were derived from the Pesticide Properties Data Base20. M ai, R is the mass of active ingredient applied in a region R, and A R is the area of crop sprayed in that same region. Treating the mass and area separately was avoided to limit the potential impact of correlations between the area of oilseed rape and the pesticide pressure score. Summary data on the mass of active ingredient applied and the area treated for each region of England was derived from the Pesticide Usage Survey undertaken on alternate years45. Regional FII scores were calculated for each year by interpolation (as above) and assigned to each 25 km2 grid cell based on the region that the majority of the cell area fell within.

Landscape structure

While landscape structure has an important role in the population persistence of many bee species3, its inclusion as a fixed effect in the current analysis was precluded by the absence over the 18 year period of spatially explicit data of an appropriate temporal resolution (for example, annual or biennial). However, evidence from the UK Countryside Survey undertaken in 1990, 2000 and 2007 (ref. 52) indicates there has been no significant change in the cover of Broad Habitats in England between these three time periods52. The main change in land use over the period has been in crop types, with the total area of cropped land and wheat cover remaining relatively constant53 and the cover of oilseed rape increasing largely at the cost of barley. Neither wheat nor barley are used by wild bees. It is worth noting that this survey shows that plant species richness on arable and horticulture land increased by 30% between 2000 and 2007, partly due to an increase in sown wildflower field margins which are used by wild bees52. Arable land therefore improved rather than declined in its quality for many wild bees over the period we study. This conclusion is supported by Carvalheiro et al.44, who show that the great loss of semi-natural habitat to agricultural intensification—which is linked to declines in wild bees—occurred before the 1990s in NW Europe.

Statistical analysis

We created a multi-species dynamic Bayesian occupancy-detection model18,54 (BOD) to characterize distributional changes in wild bee species, implemented in the BUGS language (see Supplementary Note 2 for the BUGS code). A key feature of BOD models is that the occupancy of each grid cell (presence or absence) is separated statistically from the data collection process (detection versus non-detection): specifically, observations are conditional on the species being present. This makes BOD models well-suited to modelling change using opportunistic surveys collected by volunteers22, and the resulting trends are robust to multiple sources of error and bias21. The model we employed is ‘dynamic’17, in that persistence and colonization of individual grid cells is modelled explicitly (equation 2), and ‘multispecies’16, in that we fitted a single model to the full data set with species-specific parameter estimates. We modelled occupancy at 25 km2 resolution (that is, 5 × 5 km grid square) to match the spatial scale at which our covariates were calculated. In the model the expected value of z i,j,t (occupancy of species i in gird square j in year t) was modelled as a function of occupancy in the previous year, z i,j,t-1 . Unoccupied grid squares could be colonized with species-specific probability γ i , while occupied grid squares could persist from one year to the next with a probability ϕ i,j,t.

Population persistence, ϕ i,j,t , was modelled as a linear function of OSR, the neonicotinoid exposure and the FII in the previous year (t−1). Specifically:

Parameter β 2i is an estimate of the annual change in the log odds ratio of the persistence from one year to the next for species i within the average occupied 25 km2 grid square. This is assessed for each unit increase in the neonicotinoid exposure; parameters β 1 and β 3 estimate the effect sizes of oilseed rape area and FII. Our central hypothesis is that high doses of neonicotinoids cause a reduction in population persistence (that is, β 2 <0).

Our detection sub-model states that the kth survey to a site occupied by species i will yield an observation with probability p i,k . We modelled this probability as a function of the total number of species recorded on that survey, since this provides a convenient measure of sampling effort55. Specifically, p i,k is a function of two binary variables indicating whether the survey produced a short (two or three species) or long (>3 species) list22:

Parameter β 4 is the probability that a single-species list is a survey of the focal species in the average year; parameters β 5 and β 6 estimate how the detection probability changes with survey effort and α t is a random effect for year. This formulation treats short lists, long lists and single species surveys as separate data sets with different statistical properties22 and does not assume that all surveys record ‘complete lists’ of what was present43. An alternative would be to use the list length as a continuous covariate on detectability55. However, such a monotonic function is not appropriate for bee records in the UK where a large (but unknown) proportion of records derive from casual (or ‘incidental’) observations rather than formal surveys. Such incidental records disproportionately represent charismatic and easy to identify species, so that the probability of recording such a species on an incidental observation (that is, list length 1) could be higher than the probability of being recorded on a complete list derived from a short survey (list length 2–3).

Our species-specific parameter estimates treat species identity as a random effect. For parameters γ, β 0 , β 4 , β 5 , and β 6 we assumed all species are drawn from a common distribution by estimating a single mean and variance for each parameter. For the covariates on population persistence (β 1 , β 2 and β 3 ) we assumed that species foraging on oil seed rape were drawn from a different distribution from non-foragers, following Ruiz-Gutiérrez et al.16. Comparing the posterior distributions for these groups allowed us to test the hypothesis that foragers and non-foragers respond differently, whilst fully accounting for all forms of uncertainty in the model. The covariates (OSR, neonicotinoid exposure and FII) were centered on their mean values for analysis. We ran the model described above using uninformative priors in three Monte Carlo chains of 10,000 iterations each, following a burn-in of 5,000 iterations and a thinning rate of three. We confirmed that the parameter estimates had reached convergence through a combination of quantitative (Rhat statistics56) and qualitative assessments (for example, visual inspection of the posterior density). We implemented the BOD model using BUGS57 and conducted all other analysis in the R statistical environment58.

There were many sites for which there were several years between surveys. In these cases, the state variable (presence-absence) was imputed following standard practice in Bayesian statistics. This imputation is likely to have smoothed our estimates of persistence (and hence occupancy) across years. Note that parameter estimates (β 1 , β 2 , and so on) were estimated over all the entire state space (that is, a large number of permutations of which sites were occupied in different years), so the posterior distributions that we derived from the model were unbiased with respect to the sparseness.

Data availability

The wild bee distributional data that support the findings of this study are available at the BWARS data holdings accessible via the National Biodiversity Network’s Gateway http://data.nbn.org.uk/ as are the Food and Environment Research Agency Pesticide Usage Survey Statistics https://secure.fera.defra.gov.uk/pusstats/myindex.cfm. The PPPB: Pesticide Properties Data Base that support the findings of this study available at http://sitem.herts.ac.uk/aeru/ppdb/en/. Finally, OSR data that support the findings of this study are available from the EDINA agcensus http://edina.ac.uk/agcensus/description.html.