Phenotypic classification

To create a list of mimics in the New World and RBB snakes in the Old World (Supplementary Data 1), we classified snake colouration following the phenotypic codes defined in ref. 44. For New World species, we included any species that had (1) uniform red coloration, usually with black accessory colouration such as a black nuchal collar (codes that start with U), (2) any variation of bicolour red and black crosswise banding (codes that start with B), (3) any variation of tricolour red, black and yellow/white banding (codes that start with T) or (4) any variation of quadricolour grey, red, black and yellow/white banding (codes that start with Q). Following ref. 44, the criteria for including a New World species in this list of mimetic species was (a) an emphasis on the presence of red pigment in either full dorsal coloration or in crosswise bands, and (b) including all colour patterns found in New World coral snakes. Thus, species with red dorsal or lateral striping in the absence of black bands (for example, Thamnophis sirtalis) or a pure black dorsum and red lateral/ventral coloration (for example, Boiruna maculata, Pseudoboa martinsi) were not included as mimics. For Old World species, we conservatively included only species with bicolour, tricolour or quadricolour banding that included both red and black pigmentation. Although we used refs 44 and 45 as the basis for our phenotypic scoring, we updated all species to current taxonomy and added both newly described species and Old World species that were not included in either original publication. We classified species as polymorphic if individuals had (a) discrete variation in presence/absence of red or black pigmentation, or (b) multiple major phenotype classes (for example, both bicolour and tricolour morphs). We assigned 255 snakes worldwide to these colouration categories (231 New World species, 24 Old World species), confirming that RBB colouration is a rare phenotype with viewed across all snakes (out of ∼3,500 described species, 7.3%). Note that we considered any species that did not have RBB coloration as ‘cryptic’ for simplistic comparison, but we do not imply anything formal about conspicuity by using this designation or degree of mimicry by using ‘mimetic’.

Geographic range construction

We constructed geographic ranges for 1,107 out of 1,328 (83.4%) continental species of New World snakes. Out of these species, 78 (out of 80 identified) were coral snake species, 133 (of 151) were mimetic species and 46 (of 50) were polymorphic mimics. We compiled geographic range data in three ways: (1) when available, using known species ranges downloaded from IUCN Red List of Threatened Species (www.iucnredlist.org, accessed on 1 February 2015), (2) hand compiling ranges from published sources or (3) acquiring point occurrence data (n=299,376 unique records) from the VertNET (www.vertnet.org) and GBIF (www.gbif.org, both accessed on 26 May 2015) data portals to assemble ranges de novo. To standardize taxonomy among sources, we matched species names to synonym listings scraped from the Reptile Database (reptile-database.reptarium.cz, accessed on 5 October 2015) using a two-step process that searched by date range (first 1950 to present, then 1800–1950 if there were no hits), multiple combinations of former genus and species synonyms, and allowed fuzzy character matching (n=2) to accommodate gender agreement changes. We also filtered occurrence data by the species-specific country listings of Reptile Database (country names standardized) and cross-referenced these listings with the global invasive species database (www.issg.org, accessed on 15 January 2015) so that we retained only occurrences from the species’ native distributions. For all occurrences that were not georeferenced but did have text-based municipality information, we jittered occurrence points randomly within the political boundaries of that municipality (usually ‘state’).

To construct species range polygons from occurrence data, we used a mixture of alpha hull, minimum convex hull and point buffering methods depending on the number of records for each species. For all species with five or more occurrences, we built alpha-hull polygons with the ‘alphahull’ package46 in R v3.1.2, sequentially increasing the alpha parameter value until a contiguous species range was created that contained at least 99 per cent of the occurrence records. For species with 3–5 occurrences, we generated a minimum convex hull using the ‘rgeos’ R package47. For species with 1–2 records, we buffered each point with a 0.5° radius. All species ranges were then clipped to the coastline using ‘rgeos’ and the GSHHS coastlines data set48.

For all coral snakes and five mimics, we then replaced the ranges we had generated from occurrences, or species with no occurrences, with ranges that were hand compiled from published sources as part of the Global Assessment of Reptile Distributions (http://www.gardinitiative.org) international consortium49. Because coral snakes were the independent variable of interest in our statistical models, it was important to have the most accurate ranges possible and justified using time-intensive construction ‘by hand.’ However, our results and overall conclusions were qualitatively similar even when analyses were restricted to only ranges reconstructed from occurrence data (thus excluding both IUCN and coral snake range replacements; permutation test for mimicry, P<0.001).

To tally species counts for subsequent spatial analyses, we created a raster of species richness for each species set (all snakes, coral snakes, mimetic snakes and polymorphic mimics) at two different latitude/longitude grid resolutions from the overlay of all assembled species ranges. First, we chose a 2° by 2° grid resolution for its balance between accurately capturing species richness patterns without overly zero-weighting the distributions for the dependent variables subsequently analysed, which is a consequence of smaller grid cell size (Supplementary Data 2). Next, we used a 0.5° by 0.5° resolution to reduce the influence of habitat-driven spatial heterogeneity and increase the probability that two species co-occurring in a cell were truly coexisting on a spatial scale more relevant to avian predators (0.5°≈55 km at the equator; Supplementary Data 3). Species were considered present in a cell if their range covered 50% or more of that cell, and we used each cell’s midpoint as its geographic coordinate. After discarding cells not on the mainland, we parsed this data set into two subsets of grid cells containing (a) at least one mimetic species (n=520 cells at 2°, 8,656 cells at 0.5°), or (b) at least one coral snake species (n=324 cells at 2°, 5,609 cells at 0.5°), and we ran all spatial analyses on each subset. We conducted the latter test to reduce distributional skew towards low values, but the results were qualitatively the same in all analyses (Supplementary Fig. 3). We conducted analogous pruning for analyses of mimetic polymorphism (n=471 cells at 2°, 7,196 cells at 0.5°).

For analyses of relative abundances of coral snakes and their mimics, we used all the individual occurrence points and generated the same rasters above, but with individual specimen counts for all snakes, coral snakes and mimetic species (Supplementary Data 2 and 3). To compare abundances of models and mimics for each cell, we added 1 to each total before taking the log2 of the ratio of mimetic snake to coral snake individuals.

To test whether randomizing non-georeferenced points within states/provinces biased our conclusions, we also implemented a ‘conservative’ approach in which we first reduced the occurrence data set for each species to include only its georeferenced records (n=211,614 records total; mean=40% of total records per species), constructed ranges as above, and designated that the ‘known range.’ Then, we added each non-georeferenced record along the border of its collection state at the point with the shortest straight line distance to the known range before reconstructing a revised range polygon. As our overall conclusions do not change with this more conservative approach (Supplementary Figs 2c and 3d, this comparison suggests that our original municipality-guided randomization added minimal inappropriate bias to the data, but did work to smooth ranges, especially in Brazil, and thus functioned similarly to a standard interpolation algorithm. For a conservative approach to the abundance analyses, we repeated analyses using only georeferenced records and again found the same results (Supplementary Fig. 3l).

All of the above methods for scraping collections database aggregators, filtering records and using occurrence information to reconstruct species ranges have been generalized to accommodate a range of vertebrate taxa and are now available in the R package ‘rangeBuilder’ downloadable from CRAN (https://cran.r-project.org).

Spatial analysis

To test the spatial relationship between coral snakes and their mimics, we performed SAR error models using the ‘spdep’ R package50. We tested for spatial autocorrelation using sequentially increasing neighbourhood sizes (stepwise by 50 km) beginning at the smallest size for which neighbours could be computed for every cell and compared model AIC scores (unweighted, weighted and inversely weighted by distance) to choose the best performing neighbourhood size and weighting regime for final analysis (Supplementary Fig. 1a–d). All models were multiple regressions of mimetic (or polymorphic) snake species richness on coral snake richness+total species richness. Although we found a significant interaction effect between coral snake and total species richness, it explained negligible variation and was removed (mimicry: adjusted R2=0.8992 for interaction model, adjusted R2=0.8980 for additive model; polymorphism: adjusted R2=0.3202 for interaction model, adjusted R2=0.3106 for additive model; full model P<0.001 in all cases). We computed Moran’s I for each analysis to confirm model adequacy in removing spatial autocorrelation (Supplementary Fig. 1e–h). Note that due to the complexity of the neighbourhood matrices for 0.5° resolution analyses, we only ran SAR models at the 2° resolution.

To further account for complex spatial covariance effects (for example, continental shape) and distributional skew of diversity unable to be explicitly specified in SAR analysis, we also conducted a permutation test (n=1,000 iterations) to compare observed mimicry data with a null distribution that preserved spatial and species diversity structure. To generate null distributions, we randomized (without replacement) the species identity tags of all non-coral snake species range polygons before creating species richness rasters. Thus, each simulated raster had identical (a) range distributions, (b) species richness totals for each grid cell and (c) distribution of coral snakes as the observed data. However, the species name codes, which correspond to mimetic and polymorphic snakes, were randomized across the New World independently of coral snakes. We then compared linear regressions of both observed and simulated data to determine the increase of mimetic/polymorphic snake richness due to coral snake richness beyond the known positive effect of overall species richness and spatial autocorrelation. We calculated one-tailed P values as the number of simulated slopes equal to or greater than our observed slopes divided by the total number of simulations.

To calculate standardized residuals for each individual grid cell (for example, Fig. 1d), we used a basic z calculation in a χ2 framework:

Where f was our observed rasterized species count of mimics for cell i and was the mean mimetic species count for the same grid cell from the randomized simulations.

Phylogenetic tree estimation

For analyses conducted across all snakes, we used the time-calibrated squamate phylogeny from ref. 29 and pruned the tree to the monophyletic Serpentes clade. For analyses within the colubrid snake tribe Sonorini, we sequenced 6 nuclear (c-mos, RAG1, AKAP-9, CILP, FSHR, and NTF-3; primers from refs 51 and 52, sequences in Supplementary Data 6) and 2 mitochondrial loci (cytb and ND4; primers from ref. 52, sequences in Supplementary Data 6) for 60 tissue samples representing 21 species (Supplementary Data 5) for a total of 6,375 bp. Sequences for 11 additional species were taken from GenBank (for cytb and c-mos) or published sources (for ND4 (refs 53, 54)). Forty-two of the tissue samples were population-level sampling within the Sonora semiannulata (Western Ground Snake) species complex, targeting populations for which we already had phenotypic data on colour pattern from the sampling of 1,760 snakes (collection details and individual population sample sizes given in ref. 31). Sanger sequencing was conducted using standard conditions and protocols.

We edited and aligned all sequences by locus in Geneious v6.1.8 using the Muscle algorithm after translating each sequence to manually check for erroneous amino acid stop codons indicating editing mistakes. We then constructed phylogenetic trees with the concatenated data set partitioned by gene in RaxML (ref. 55) using the GTRGAMMA model. We calculated nodal support for the best scoring ML tree by bootstrapping (1,000 replicates).

Ancestral state reconstruction

To reconstruct colouration and infer the evolutionary history of mimicry, we first used a random local clock model56 (‘RLC’) implemented in BEAST v1.8 (ref. 57) and analysed RBB coloration as a single character alignment along a fixed tree29 following the approach of ref. 58 by modifying the model of molecular evolution to reconstruct discrete character states. We chose this method due to its ability to accommodate rate heterogeneity among clades within large phylogenetic trees, which can cause erroneous character reconstructions when using traditional methods that all assume homogenous rates across clades (see ref. 58 for detailed argument and extensive model testing). We used similar parameter priors, starting values and linked distributions following the xml file in ref. 58, conservatively using a relatively flat prior on mean rate of trait evolution (0.001709; parsimony-inferred changes divided by tree length) to adequately explore both fast and slow rates of evolution. We performed three independent runs of 300 million generations each and confirmed convergence by comparison in Tracer59. All parameters had ESS values >100. From our posterior distributions of reconstructed characters, we used a ‘maximum credibility’ approach implemented in the R package ‘BAMMtools’60 to accommodate uncertainty and choose the best supported configuration of gains and losses of mimicry.

Next, we analysed RBB coloration using discrete, time-reversible models of character evolution61 in a combination of maximum likelihood and Bayesian frameworks (‘ML’). Again to accommodate evolutionary rate heterogeneity, we developed a maximum likelihood approach for identifying the number and location of distinct evolutionary rate partitions across the snake phylogeny similar to the R package ‘MEDUSA’62 but that allowed for the modelling of discrete instead of continuous traits. After initially fitting a model with a single transition rate matrix to the full data set, we fit a series of nested character evolution models to the phylogeny using stepwise AIC. This approach assumes that the shifts to and from mimetic coloration anywhere on the phylogeny can be described by rate parameters characterizing origins (q01) and reversals (q10) of mimetic coloration. We then fit a derived model where one portion of the tree evolved under a second rate matrix, thus allowing a single subclade to have transition rates that are decoupled from those across the remainder of the phylogeny.

We split the phylogeny at all internal nodes with at least five descendant species and maximized the likelihood of the model with two rate matrices. We then compared model fit using AIC scores, where we treated the location of the rate shift as a free parameter. Thus, a model with two transition matrices has four rate parameters (q01-A, q10-A, q01-B and q10-B) plus a fifth parameter for the topological location of the rate shift. We rejected the simpler model in favour of the more complex model if ΔAIC was >2. We increased the complexity of the model incrementally, holding as fixed the location of the shift for the two-matrix model but attempting to add a third transition matrix (3-matrix model: 6 rate parameters+2 shift location parameters). We repeated the optimization and model selection procedure described above until further model complexity failed to improve the fit of the model to the data, identifying a total of nine distinct transition rate partitions across the tree (Supplementary Fig. 7b).

Given this set of transition rate partitions, we then estimated evolutionary rates for each partition in a Bayesian framework. We simulated posterior distributions of rate parameters for each partition using Markov chain Monte Carlo (MCMC). We used a hybrid maximum likelihood (location of partitions) and Bayesian (parameters on partitions) framework because transition rate estimates for several partitions with rare character state changes were associated with large confidence intervals. We chose a prior by using maximum likelihood to estimate the ‘whole tree’ transition rate under a symmetric Markov model. We used an exponential distribution with a mean 10 × greater than this whole-tree rate as a prior on all transition rate parameters. This is a relatively flat distribution that allows substantially faster rates relative to the background rate while also minimizing extreme rates across the tree.

We reconstructed ancestral character states across the tree by sampling transition rate parameters from their joint posterior distribution and performing stochastic character mapping across the tree63, following previous implementations (for example, Phytools64). First, we sample a set of transition rate parameters from their joint posterior probability distribution. We then calculate conditional likelihoods down the tree using Felsenstein’s pruning algorithm65. At the root of the tree, we obtain paired conditional likelihoods for our two states. We placed a prior of 0.05 on mimetic coloration at the root of the tree, because this state has an observed frequency of only 0.023 across the portion of the tips described by this rate partition, and because the equilibrium frequency of this state implied by the rates themselves is very low (for example, 1—q01/(q10+q01)=0.037). We then sampled a root state from the posterior at the root and then compute conditional probabilities at each successive descendant node and choose character states proportional to these probabilities. Finally, we simulate character histories on each branch conditional on the starting and ending node states, and the associated transition rate parameters. We performed 500 stochastic character maps to obtain ‘pseudo-histories’ of the evolution of mimetic colouration. The frequency of the trait at any point in time was computed at 1,000 time points by counting all the branches at t i that were assigned to each state. Because this is a stochastic character map, any given branch may include multiple character state changes; these were tracked explicitly.

To assess the robustness of our colour pattern reconstructions across methods, we also reconstructed ancestral states using a simple parsimony-based method of ancestral state estimation (which accommodates infinite rate heterogeneity across clades) implemented in the R package ‘phangorn’66 and compared the results to both the RLC and ML reconstructions. The results from all three methods were generally congruent (Supplementary Figs 4–6), although the RLC model did not perform as well as the other two methods in handling extreme rate heterogeneity among clades, as evidenced by (a) reconstruction of a few ancestral nodes as mimetic when none of the descendent tip taxa have RBB coloration (for example, the Echinathera/Taeniophallus/Sordellina clade in Dipsadines; Supplementary Fig. 5) and (b) a strong bias for inferring extra losses in large clades with the greatest variation in tip states, which was not seen in the ML or parsimony reconstructions.

Due to increased risk of erroneous inflation of transitions near the tree tips due to taxonomic uncertainty, the ‘number of transitions’ to mimicry in New World colubrids that we report in the main text (‘at least 19’) is a conservative estimate qualitatively inferred from the trait reconstructions and thus is much smaller than the average number of reconstructed origins across the entire tree (∼60; Supplementary Fig. 8), as a broad accommodation of uncertainty in the phylogenetic reconstruction of taxon relationships.

We repeated these analyses for the time-rooted (root calibrated to the Sonora-Gyalopion MRCA node date from ref. 19), species-level version of the Sonorini tree (Fig. 3c) but not for the population-level data (Fig. 3d), as intraspecific inference violates the key assumption of a strict branching process necessary for state reconstruction67.

Geographic distribution of RBB gains and losses