Distribution data

We used distribution maps of 434 European butterfly species30. After matching these with the species illustrated in ref. 31, we were able to consider 366 species across 1,825 grid cells of 50 km × 50 km. We created a database of 107 dragonfly species across 1,845 grid cells32 (Supplementary Tables 1–3, and Supplementary Methods). The butterfly distribution maps were presence/absence maps; the dragonfly distribution maps were outline maps.

Computer-assisted digital image analysis

Computer programs for the analysis of digital images provide possibilities to measure the colour value from scanned published illustrations33,34. We scanned ventral and dorsal butterfly wings and dragonfly bodies and converted the scanned RGB (red, green, blue) colour illustrations into 8-bit grey values. We assigned each pixel a value between 0 (complete black) and 255 (complete white), and averaged these values across all pixels of the considered area to obtain a single colour value for each species. Usually, software such as PhotoShop weights the three RGB channels according to the perception of human eyes when converting to grey values (for example, 0.21 × R+0.72 × G+0.07 × B). This is because some colours are subjectively perceived darker than others. For example, blue looks darker than green for humans at the same value of colour lightness. We decided to calculate the unweighted mean across the three RGB channels. Note that dark-coloured species have a low value and light-coloured species have a high value.

In particular, we scanned the ventral and dorsal part of the wings of female butterflies illustrated in ref. 31 with an EPSON Perfection 4490 Photo Scanner with 1,200 dpi and 24 bit in the RGB colour spectrum. All further steps to estimate the colour value were done with Adobe Photoshop CS2, Adobe Photoshop Elements 2, Epson Scan Ver. 2.75G and ImageMagick 6.5.5-2. In our study, we predominantly considered females because their parental investment is higher and colouration of males may be biased by sexual selection35. Therefore, females can be expected to be closer to a ‘thermal optimum’. Only if an illustration of a female was not available, we used an illustration of the male (butterflies: 28 male and 338 female). Note that dorsal and ventral colour lightness was closely and significantly correlated between the sexes (t-test, P<0.001 dorsal, ventral ; adjusted r2=0.62 dorsal , 0.9 ventral ; n=387 dorsal , 391 ventral , all available images). For butterflies, the most relevant part of the wing for thermoregulation is the wing area near the body36. Therefore, we decided to measure the body and 1/3 of the wing area closest to the body.

We scanned the thorax and abdomen of dragonflies illustrated in ref. 37. For some species, illustrations of both sexes were not available (74 of 114). To make use of as many species as possible, we solely processed dorsal drawings of male dragonflies. In doing so, we were able to use 107 of 114 dragonfly species presented in ref. 37. Note that in ref. 37, female drawings are often depicted from lateral, making comparisons difficult. Note also that beside these limitations, colour lightness of male and female species was significantly correlated (t-test, P<0.001, adjusted r2=0.23, n=40 because of the limited availability of female drawings). For some species, the thorax was not drawn because they differ from each other only in shape and colour of the abdomen. In such cases, we composed the images of the abdomen with corresponding drawings (Lestes viridis with the female image; Cordulegaster heros, C. picta and C. principes with C. boltoni). All further calculations followed the procedure for butterflies.

The colour value used in our study was derived from a three-channel evaluation of wing patterns in the visible spectrum and is therefore only a proxy for the overall properties. However, a major advantage of our method is the acquisition of standardized and hence comparable colour values. But it only gives information on pigment colouration, not structural colouration or iridescence (angel-dependent colour; for example, wings of some male Lycaenids), and values represent mean phenotypes without intra-specific variation.

We tested the robustness of our procedure to estimate the colour value, and confirmed that the extracted values represent the physical ability of the species to absorb and reflect radiation energy (Supplementary Methods; and Supplementary Figs 3–5).

Phylogenetic analysis

We constructed phylogenies for butterflies and dragonflies (Supplementary Methods). Lynch’s comparative method38 was used to partition the colour value of each species into a phylogenetic and a specific component. The phylogenetic component describes the part inherited from the ancestor; the specific component measures the deviation from the relatives. We decided to apply Lynch’s comparative method because it not simply ‘corrects’ for phylogeny but also partitions the trait phenotype into a phylogenetic and a non-phylogenetic part that can be used for further analyses. Furthermore, it has the advantage to allow to incorporate the phylogenetic signal39 and different modes of evolution40 in its correlation structure (for details, see refs 38, 41, 42). Lynch’s model describes the observed trait phenotype (y i for the ith taxon) as the sum of a grand phylogenetic mean (μ), a heritable component (a i ) and a residual deviation (e i ): y i =μ+a i +e i . The grand phylogenetic mean μ is a scaling term that can be interpreted as the state of the ancestor at the root of the phylogeny. The quantity μ+a i is the predicted value by the phylogeny and can be interpreted as the phylogenetically heritable component of the observed phenotype of the ith taxon. It is termed ‘phylogenetic component’ in this article. Hence, the residual deviation e i from the phylogenetic component is termed ‘specific component’. Subsequently, we calculated for each grid cell in Europe the mean values of the phylogenetic and specific components of the subset of the occurring species. In addition, we used phylogenetic eigenvector regression43 to disentangle the phylogenetic and the specific components of colour lightness and found similar results (not shown).

Using Lynch’s comparative method, we addressed three fundamental evolutionary assumptions of phylogenetic approaches, namely that the phylogeny is constructed without error (phylogenetic uncertainty), that more closely related species tend to show more similar characteristics than expected by chance (phylogenetic signal), and that the evolutionary model used is appropriate (evolutionary model uncertainty)44.

Phylogenetic uncertainty

We resolved the multifurcations in our trees within a Bayesian framework using R and BEAST45 as described in ref. 46 (see also refs 47, 48, 49, 50 for recent developments in the field). We calculated two maximum clade credibility trees with mean node heights for butterflies and dragonflies and used these trees in subsequent phylogenetic analyses. To give more probabilistic support to our results, we randomly draw 1,000 trees from the BEAST output, repeated the analyses and checked for the robustness of the results (see inserted histograms in Figs 1 and 3). In particular, we calculated the phylogenetic signal lambda51,52 and used this value to scale the phylogenetic correlation matrix (see refs 39, 53, 54 for examples within a phylogenetic generalized least-squares framework) in Lynch’s comparative method for each tree (see ref. 38 and functions corPagel and compar.lynch in the R package ape55). Here, the correlation structure followed a Brownian motion model with the off-diagonal elements multiplied by lambda.

Phylogenetic signal

We calculated the phylogenetic signal (lambda) for every tree with fitContinuous in the R package geiger56 version 1.99–3 and found high values of lambda, with higher values for butterflies than for dragonflies (median and mean >0.80; Supplementary Fig. 6). This indicates that related species have more similar colour lightness than expected by chance, and hence the phylogenetic variance–covariance matrix needs to be scaled by lambda39. When compared with the phylogenetic signal in body size—which we assume to be a key niche trait—there was even no significant difference to the ventral colour lightness of butterflies. However, potential micro-evolutionary causes of this pattern, for example, gene flow, pleiotropy or lack of genetic variability cannot be distinguished within the framework of this study.

Evolutionary model uncertainty

The resulting correlations of the trait values of species in the phylogeny can be seen as a function of an evolutionary model-specific transformation of the branch lengths of the original phylogeny. Following ref. 44, a fundamental assumption of phylogenetic analyses is that the model of character evolution effectively recapitulates their history, and this implies to compare several evolutionary models. Hence, to support our main conclusions that (i) colour lightness is consistently correlated to the thermal environment, and that (ii) dragonfly assemblages became on average lighter-coloured during the last century, we simultaneously accounted for evolutionary model and phylogenetic uncertainty. However, we would like to highlight that it is not our intention to make statements about the most adequate model for the evolution of colour lightness in butterflies and dragonflies, so we arbitrarily chose four evolutionary models to underline the robustness of our conclusions: (i) the model ‘lambda’51,52 fits the extent to which the phylogeny predicts covariance among trait values for species by multiplying the off-diagonal elements in the correlation structure by the value of lambda. Interpreted as a tree transformation, values of lambda near 0 cause the phylogeny to become more star-like, and a lambda value of 1 recovers the Brownian motion model. Bounds for the estimation of lambda were set to 0 and 1. This model was used for the results presented in the main text. (ii) In the model ‘kappa’51, character divergence is related to the number of speciation events between two species. Interpreted as a tree transformation, the model raises all branch lengths to the power kappa. Bounds for the estimation of kappa were set to 0 and 1. (iii) The Early-burst57 or also called accelerating-decelerating (ACDC)58 model is a model where the rate of evolution increases or decreases exponentially through time. Bounds for the estimation of the scaling parameter a were set to −10 and 10. (iv) The Ornstein–Uhlenbeck model59 fits a random walk with a central tendency and an attraction strength proportional to the parameter alpha. Bounds for the estimation of alpha were set to 0 and 105.

Model parameters were estimated for 1,000 randomly selected phylogenetic trees in each model with the function fitContinuous in the R package geiger56 version 1.99–3. All trees were transformed with their estimated parameters (function transform.phylo) and used to partition colour lightness into a phylogenetic and a specific component with Lynch’s comparative method38. For each tree, we calculated for each grid cell in Europe the mean values of the phylogenetic and specific components of the subset of the occurring species and regressed these values against our thermal component 1. Regarding our main conclusion (i) that colour lightness is consistently correlated to the thermal environment, we found that all 24,000 regressions were highly significant (t-test, P<0.001) with positive slopes throughout the models, indicating that different evolutionary models do not alter the positive relationship between colour lightness and the thermal environment (Supplementary Fig. 7).

To give more probabilistic support to our main conclusion (ii) that dragonfly assemblages became on average lighter-coloured during the last century, we followed the procedure described above and calculated for each grid cell in Europe the mean value of the phylogenetic and specific components according to the dragonfly distributional information of 1988 and 2006. We calculated the change in colour lightness for each grid cell and averaged these changes across Europe for each tree and evolutionary model. We found the majority of values to be positive, indicating that different evolutionary models do not alter the overall shift towards lighter-coloured dragonfly assemblages (Supplementary Fig. 8).

Environmental variables

For each grid cell across Europe, 25 environmental variables were extracted from public resources using Grass GIS version 6.4.0RC6. Since data were not available for all grid cells, the data set was reduced to grid cells for which information was available for all environmental variables (1,825 for butterflies and 1,845 for dragonflies). (i) Yearly sum of solar irradiation on horizontal surface (INS): in the data source used (ref. 60, http://re.jrc.ec.europa.eu/pvgis/download/download.htm), atmospheric corrections were already applied to account for spatial differences in, for example, average cloud cover, atmospheric turbidity and shadowing by the local terrain. (ii) Temperature-related variables: we used AMT, mean diurnal temperature range, isothermality, temperature seasonality, maximum temperature of warmest month, minimum temperature of coldest month, temperature annual range, mean temperature of wettest quarter, mean temperature of driest quarter, mean temperature of warmest quarter and mean temperature of coldest quarter to characterize the thermal environment. Data were taken from BioClim61, available at http://www.worldclim.org/current. (iii) Precipitation-related variables: since actual evapotranspiration is reported to be a strong predictor of species richness of European butterflies62, we considered annual precipitation, precipitation of wettest month, precipitation of driest month, precipitation seasonality, precipitation of wettest quarter, precipitation of driest quarter, precipitation of warmest quarter and precipitation of coldest quarter. Data were taken from BioClim61, available at http://www.worldclim.org/current. (iv) Topography-related variables: since precipitation and ambient temperature are related to topography, we used five measures describing the topography within each grid during subsequent analyses: average elevation, lowest elevation, highest elevation, elevation range and s.d. of elevation. Data are available at http://www.biochange-lab.eu/files/europe50km.zip63.

Statistical analysis

Our primary interest was to identify the strongest predictors of the mean colour value of butterfly or dragonfly species co-occurring within grids across Europe. Therefore, we applied a principal component analysis (PCA) based on the correlation matrix to reduce the dimensionality in the environmental variables64. We applied separate PCAs for the categories thermal environment, precipitation and topography. Of course this leads to correlated components across the separate PCAs. The idea was, however, to characterize the different aspects of the climate and naturally the different aspects are correlated (for example, topography and thermal environment). The PCAs extracted two components with eigenvalues >1 from the categories temperature and insolation, which characterize the thermal environment of the respective grid cell (Thermal 1 and 2), as well as two components for precipitation (Prec 1 and Prec 2) (Supplementary Table 5). The PCA extracted one eigenvector (Topo) for the topography (Supplementary Table 5). For temperature and precipitation, the first axis (called Thermal 1 and Prec 1) characterized the overall mean, and the second axis (Thermal 2 and Prec 2) characterized the variability of temperature or precipitation across the year. Since the directions of ecological associations between response and predictors in models with PCs are difficult to interpret because the sign of axis scores is arbitrary, we plotted selected variables of each category for illustration (Supplementary Fig. 9).

We calculated multiple regression models using the phylogenetic and specific components of the mean colour value within grids and the principal components of the environmental variables. Assumptions of an ordinary least-squares regression are that the standard error of the error term is constant over all values of the response, and that explanatory variables and all estimates provide equally precise information. Because the standard error of the calculated mean colour value of assemblages within grids differed in relation to the number of species recorded within grids, using the number of species per grid cell as weights to the least-squares regression improves parameter estimation (see also ref. 65). To further improve the information quality of the data, we included only grid cells in the analysis of butterflies with more than five recorded species.

To account for spatial autocorrelation in the model residuals after fitting the environmental components to the data, we build spatial simultaneous autoregressive models, where the error term is predefined from a spatial neighbourhood matrix, and autocorrelation in the dependent variable is modelled within a generalized least-squares framework using maximum likelihood66. Calculations were performed with the function errorsarlm in the R package spdep version 0.5–65 for neighbours within 1,000 km distance.

We applied hierarchical partitioning to assess whether multicollinearity between the PCs of the different categories might affect the results. Hierarchical partitioning decomposes the variation contained in a response variable into independent parts, which reveal the absolute importance of each predictor by considering all possible variable combinations in a hierarchical multivariate regression setting67. Calculations were performed in R with the package hier.part version 1.0–3 using ‘r2’ as the goodness-of-fit measure. We also included phylogenetic and specific parts of body size (digitally measured fore-wing length of butterflies and body length of dragonflies) in the analysis. We found qualitatively similar results compared with the findings of Table 1, indicating that multicollinearity and body size does not change our main results (Supplementary Fig. 10).

All statistical analyses were performed with the software R ( www.r-project.org). Intensive calculations were conducted using the MaRC2 High Performance Computing cluster in Marburg.

Change in dragonfly colour lightness

To analyse changes in the colour value of dragonfly assemblages, we compared the data used in our statistical analysis (distribution of 2006, ref. 32) to maps of the distributions before 1988 (ref. 37), and plotted the distribution of the shift in colour lightness within grids and the average shift across all grids for multiple phylogenetic trees (Fig. 3). In addition, we extracted climatic data corresponding to the two distribution data sets of dragonflies in Europe from public resources (CRU TS3.2, ref. 68). We calculated the AMT for the period 1900–1988 and for 1988–2006; determined the change in AMT between these two periods; and correlated this change to the change in colour lightness of dragonfly assemblages (Fig. 4). We found positive correlations (t-test, P<0.001) for the raw data, the phylogenetic component and the specific component of colour lightness. Correlations using simultaneous autoregressive models to incorporate spatial autocorrelation were also all significant (t-test, P<0.05) and positive. This indicates that dragonfly assemblages became lighter coloured in regions where temperature increased during the last century, giving further support to our conclusion that climate warming favours light-coloured insects in Europe.