Building the supertree

Potential source papers were identified using the Web of Science. We used the following search terms: phylog*, taxonom*, systematic*, and clad*, coincident with any scientific and common names constituent within the Culicidae, from families to genera. Each source paper was inspected manually for one or more phylogenetic trees, and the references cited by each paper were similarly trawled for additional trees. In this manner, we collated 550 source trees from 284 source papers published up to the end of July 2014 (Supplementary References: Part 1). Each tree was digitised precisely in its published form using Mesquite 3.11 49. The Supertree Toolkit 2 (STK)50 was used to standardise these sources and to produce a single matrix representation of their structure using group inclusion characters51. We excluded synonyms and standardised species names using the mosquito taxonomic inventory website (http://mosquito-taxonomic-inventory.info/). The protocols described by Davis and Page18 were followed to ensure the independence of each source. Outgroup taxa were removed from the entire data set, and replaced by an all zero outgroup. The resulting matrix representation was analysed using flat-weighted maximum parsimony in TNT 1.5 52. The analysis followed Davis et al.10. We ran multiple replications at level 10, with 1000 random additions of taxa and TBR branch swapping. One thousand most parsimonious trees, each of 5321 steps, were saved and summarised as a Maximum Agreement Subtree in PAUP*4.0a151 53. The resulting supertree comprised 1000 taxa (a coincidentally round number) within 103 genera (according to the classification of Reinert et al.19).

Time calibration

Only nine extant genera are known as fossils, and we used the earliest known record of each of these as calibration points in our supertree. A patchy fossil record meant that many branches of our tree had no reliable fossil calibration dates, and we therefore used molecular estimates of divergence times for 18 additional genera39, as well as for the entire family and the subfamily Culicinae. This method of supplementing fossil divergence dates with molecular dates was successfully used in a recent supertree of caridean shrimps54. Genera classified within Aedes in the classification of system of Wilkerson55 were treated as Aedes during the time calibration. The node calibration dates are summarised in Supplementary Data 1. The R packages Paleotree56 and Strap57 were used to scale and plot the supertree.

Data on disease vectors

Data on the diseases vectored by each species of mosquito were obtained from Norbert et al.6. We also searched Web of Science for the names of all pathogens and the names of the diseases that they cause, coincident with any of the stems: vect*, transmit*, mosquit*, culicid*, pathog* or vir*. All of the sources consulted are listed in Supplementary References (Part 2), while the resulting list of vectors and diseases is available as Supplementary Data 2.

Speciation rate and correlations

Speciation rate parameters were modelled on the time-calibrated tree using BAMM12. We sampled four MCMC chains of 10 million generations every 10,000 iterations, with a burn-in of 10%, for each phylogeny. Full analytical parameters and settings are given in Supplementary Note 1, while the sampling file is provided as Supplementary Note 2. The R package BAMMtools58 was then used to identify significant speciation rate shifts across the entire tree (all Culicidae), and for the subclades Anophelinae and Culicinae. We also modelled vector and non-vector species separately. Speciation rates through time were modelled using both λ and μ parameters and then tested in both λ and μ calculations. Global temperature data and R code were derived from Davis et al.10. Atmospheric CO 2 concentration data were acquired from Bergman et al.59 (Supplementary Data 3). For the DCCA analysis, we used a window for temperature and CO 2 concentration data from 200 mya to the present. DCCA correlations of all realisations of the speciation curve in the MCMC analysis (9000 data pairs in total) were performed for all Culicidae and the subclades against the temperature and atmospheric CO 2 concentration curves. Specifically, we tested whether the distribution of all 9000 correlation coefficients differed from the null hypothesis of zero, which implies no correlation. The correlations between clade (subfamily and vector/non-vector) and tip speciation rates were tested using the pgls function implemented in R package caper60.

Information transfer

Information Theory has previously been used to explore causality in palaeontological and geological data sets with great success. Transfer entropy (TE) is a directional information flow method that quantifies the coherence between continuous variables in time61. It is an extension of the mutual information method, but can take into account the direction of information transfer by assuming that the processes can be described by a Markov model. Transfer Entropy reduces to a linear Granger causality process (whereby a signal in one time series gives a linear response to the second time series) when the two time series can be linked via autoregressive processes. However, TE makes fewer assumptions regarding the linearity of the processes involved, and is therefore more suitable for analysing causality when the processes involved are unknown62,63. Transfer Entropy is calculated using:

$$T_{X \to Y} = {\sum} {p\left( {Y_{n + 1},Y\frac{{(k)}}{n},X\frac{{(l)}}{n}} \right){\mathrm{log}}\left( {\frac{{p\left( {Y_{n + 1} \vee Y\frac{{(k)}}{n},X\frac{{(l)}}{n}} \right)}}{{p\left( {Y_{n + 1} \vee Y\frac{{(k)}}{n}} \right.}}} \right)},$$

where T X→Y is the TE from time series X to time series Y, both of which have data at time n, and k and l are the embedding dimensions of the two time series respectively. As in Davis et al.64, we used the R (R Core Team 2017) package TransferEntropy65 which implements the above equation using a nearest neighbour algorithm66. This function returns a numeric value where 0 indicates no information transfer, positive numbers indicate information transfer, and negative numbers indicate misinformation transfer. The latter implies that there are other, unspecified processes interacting29. The embedding dimensions of the time series were estimated using the R package nonlinearTseries67.

We calculated TE for four time series: speciation rate in Culicidae, speciation rate in Mammalia, global palaeo-temperature and atmospheric CO 2 . We derived TE values from each climatic variable to each taxon and also between mosquitoes and mammals, in both directions of putative transfer. The mammalian speciation rate time series was inferred from a BAMM analysis of a recent, almost complete mammal phylogeny68. The significance of TE values was determined by randomising the source time series 250 times to create surrogates, and leaving the target time series unchanged. If the TE value was outside the 95% interval of the surrogate TE values, it was deemed significant69.

Speciation rates verification

Molecular trees of Culicidae and Anophelinae were built using data from previous studies. The data set underpinning the phylogeny of Culicidae was obtained from Reidenbach et al.39, and comprised six nuclear gene protein-coding genes (arginine kinase, CAD, catalase, enolase, hunchback and white) coded for 25 ingroup genera and 2 outgroup taxa. The data set underpinning the tree of Anophelinae comprised three mitochondrial genes (COI, ND2, ND3) sequenced for 19 ingroup and 2 outgroup species. The subgenera Cellia, Nyssorhynchus, Anopheles, Lophopodomyia, Kerteszia and Stethomyia each had 2−4 representatives. Genbank accession numbers of the sequences used in the two data sets are presented in Supplementary Tables 4 and 5. The sequences were aligned in MEGA 7 70. Tree topologies were obtained by ML analyses. Heuristic searches were performed in PAUP*4.0a151 using a GTR + G + I model. We also produced neighbour-joining (NJ) trees and estimated clade support using 1000 bootstrap resampling replicates. Estimations of divergence times and diversification rates were calculated using a fossilised birth−death model implemented in BEAST 2 26. The following five fossil nodes were used for time-calibration: Culex pipiens (55.8 Mya), Culiseta gedanica (55.8 Mya), Ochlerotatus serafini (55.8 Mya), Toxorhynchites mexicanus (28.5 Mya), Anopheles dominicanus (40.4 Mya). The following parameters were used during the two analyses: GTR substitution site model, relaxed clock exponential model, running two MCMC chains of 200,000,000 generations with 100,000 sampled and 25% burned-in.