Phylogenetic analyses

Ancestral state reconstruction using model-based methods requires a phylogenetic tree with branch lengths proportional to time (that is, a chronogram) or to the number of inferred molecular substitutions (that is, a phylogram). We preferred the first option because we did not want to assume a strict correlation of molecular and morphological evolutionary rates. The recent relaxed clock molecular dating analysis of Magallón et al.1 was chosen as the starting point for this study because it was calibrated with the largest number (136) of well-justified fossil age constraints ever used at this scale, while at the same time including a very large number of terminal taxa (792), representing 63 orders (98%) and 372 families (86%) of angiosperms. We also reanalysed this data set in a number of alternative ways to evaluate the impact of various parameters of this dated tree on our analyses.

The A series of analyses refers to the original BEAST analyses of Magallón et al.1, which provided a maximum clade credibility (MCC) tree, used in our parsimony and ML analyses, and a collection of 1,042 trees sampled from the posterior stationary distribution, which we used for our Bayesian analyses of trait evolution. These trees, however, presented two drawbacks for our analyses. First, their topology had been heavily constrained according to the results of Soltis et al.16, and thus represented only one of the several alternatives for deep-level relationships in angiosperms. Second, the BEAST analyses had been conducted with a fixed topology, producing a collection of trees that differed in branch lengths (times) but not topology. Thus, integrating phylogenetic uncertainty in our Bayesian analyses of trait evolution was the primary motivation for reanalysing the data set in BEAST without fixing the topology.

The B series of analyses refers to the reanalysis of the data set of Magallón et al.1 in BEAST 1.8 (ref. 42) without using any topological constraints (that is, topology estimated, not fixed), and with all other parameters equal (see below). These analyses produced trees with Amborella sister to Nymphaeales rather than to all other angiosperms, and with monocots sister to Chloranthaceae+Magnoliidae rather than to Ceratophyllaceae+Eudicotyledoneae (see Supplementary Discussion and Supplementary Fig. 1). Other relationships and divergence times were very similar to those found in the A series, but with some variation among trees of the posterior sample regarding the more weakly supported nodes.

The C series of analyses refers to the same setup as the B series, but with two topological constraints for deep-level angiosperm relationships: (1) Amborella sister to the rest of angiosperms; (2) Monocotyledoneae+Ceratophyllaceae+ Eudicotyledoneae together forming a clade (excluding Chloranthaceae and Magnoliidae; Supplementary Fig. 1). These two constraints are supported by the majority of phylogenomic analyses based on complete plastid genomes17,43,44,45 and are consistent with the 17-gene analyses of Soltis et al.16. The results from the C series were very similar to those of the A and B series (see Supplementary Discussion).

The D and E series were set up with two alternative topological constraints for major clades of angiosperms suggested by recent nuclear phylotranscriptomic analyses (Supplementary Discussion and Supplementary Fig. 1). In the D series, we constrained Chloranthaceae, Magnoliidae, Ceratophyllaceae and Eudicotyledoneae to form a clade23. In the E series, we constrained Chloranthaceae and Ceratophyllaceae to be sister taxa46,47.

In addition, we tested the impact of the age of the angiosperms on our ancestral state reconstructions. The original analyses of Magallón et al.1 included a narrow age constraint of 136–139.35 Ma on the crown-group age of angiosperms based on a quantitative analysis of the fossil record. The A200, B200, C200, D200 and E200 series refer to the exact same setups as the A, B, C, D and E series, but with this constraint removed, resulting in chronograms with crown angiosperms typically over 200 Ma old.

All new phylogenetic and molecular dating analyses were conducted with BEAST 1.8 (ref. 42) using the same settings, fossil calibrations and protocols as in the A series1. For the B series, five independent Markov Chain Monte Carlo (MCMC) runs of different length (up to 20M generations) were conducted, for a total of ca. 71M generations (after discarding the first 2M generations from each run as burn-in). The posterior was resampled every 50K generations to produce a set of 1,412 trees used in the Bayesian trait analyses. For the C series, six runs were conducted for a total of ca. 85M generations, which were resampled every 50K generation to produce a set of 1,706 trees. For each of the D, E, B200, C200, D200 and E200 series of analyses, two runs were conducted for a total of 36M generations, which were resampled every 35K generation to produce sets of 1,028 trees. We note that the effective sample size for some parameters of these analyses did not all reach 200 as recommended, suggesting that longer runs might be needed for accurate estimation of phylogenetic relationships and divergence times, consistent with the previous finding that this large data set is difficult to analyse with a Bayesian relaxed clock without fixing the topology1. We argue that the posterior samples we obtained here are acceptable for the purpose of this study, because the goal of our reanalyses of the Magallón et al.1 data set was to take into account and evaluate the impact of phylogenetic uncertainty on the results from the A series (the original trees from Magallón et al.1, with fixed topology). As we report in detail in the Supplementary Discussion, the estimated general topology, divergence times and ancestral states were remarkably similar across tree series (Supplementary Data 1 and Supplementary Tables 1 and 2). However, we recommend caution with the use of these trees for purposes other than this study. The MCC tree from each BEAST analysis is provided as Supplementary Data 3–12.

Terminal taxa in the original molecular data set of Magallón et al.1 were either species or genera, with different species sampled for different genes. For this study, we transformed the trees of hybrid terminal taxa into trees of species by choosing the species with the most genes sampled for each hybrid (genus-level) terminal taxon. This allowed us to produce trees of 792 species and prepare a matching data set of floral traits for exactly the same species, following a strict exemplar approach (see below). All of our trees also included six outgroup gymnosperm species. Because floral traits are not applicable outside angiosperms (unless controversial homology statements are made), these species were not included in our data set of floral traits and were pruned out of the trees before ancestral state reconstruction. Clade names in this paper follow APG IV48 and the Angiosperm Phylogeny Website49 for orders and families, and Cantino et al.50 and Soltis et al.16 for all clades above order.

Data set of floral traits

We recorded 21 floral traits in 792 species of angiosperms using the collaborative database PROTEUS51. The floral traits were chosen and defined to be as broadly applicable as possible. For instance, we do not have a character for the number of petals in this data set, because not all angiosperms have petals and all petals are not necessarily homologous. Instead, we recorded the total number of perianth parts (sepals plus petals, or tepals). All characters are explained and justified in detail in the Supplementary Methods.

Floral traits were recorded from a diversity of published and online sources, including many focused morphological studies and a few personal observations. Each data record in PROTEUS is linked to an explicit source, which allowed us to cross-check, validate or correct many records following initial entry. In total, the data set presented here contains 13,444 floral trait data records obtained from 947 distinct sources. The complete list of records and linked sources (references) is available in Supplementary Data 13.

All primary characters used in data entry were transformed for analysis (discrete characters were simplified and continuous characters were discretized; see Supplementary Methods for justification and details of these transformations). Some characters were transformed in more than one way, leading to a final data matrix of 27 characters and 792 species (Supplementary Data 13). Data files were then exported from PROTEUS in appropriate formats for analysis.

We used a strict exemplar approach for scoring traits, which means that data were only scored for a species if we could confirm that they were observed in this species (that is, we did not use any general family descriptions or make any assumptions that all species of a genus share the same character states). The species were selected because of their inclusion in a recent molecular dating study1. Thus, our sample is independent from the floral traits. While this approach is both desirable and suitable for the methods we used, we acknowledge that it implies that our data set does not represent the complete variation of floral traits across all angiosperms. Thus, our study was not designed to reconstruct the finer-scale evolution of flowers near the tips of the tree (for example, within orders), and our results remain conditional on future denser sampling of the angiosperm phylogeny. Our strict exemplar approach also means that data are missing for some traits in some species (total missing data: 27%, including cases of inapplicability). Because missing or inapplicable data are more or less evenly and haphazardly distributed across our tree, and species with such data are in effect pruned out in the ancestral reconstruction analyses, it is unlikely that missing data had a strong impact on our results.

Ancestral state reconstruction

Each floral trait was analysed for each series of trees (A, B, C, D, E, A200, B200, C200, D200, E200) using three complementary approaches52: MP using the ancestral.pars function of the phangorn 2.0.2 package53 in R54, ML using the rayDISC function of the corHMM 1.18 package55 in R54, and a Bayesian rjMCMC approach56,57 using BayesTraits 2 (ref. 58). MP and ML reconstructions were conducted on the MCC tree from each BEAST analysis, whereas Bayesian rjMCMC analyses were conducted on collections of at least 1,000 trees sampled from the posterior stationary distribution from the BEAST analyses. Here, we focus on and report results for 15 key nodes in the phylogeny of angiosperms, corresponding to well-recognized major clades (including Angiospermae, Mesangiospermae, Magnoliidae, Monocotyledoneae, Eudicotyledoneae, Pentapetalae, Rosidae and Asteridae). However, graphical MP and ML reconstructions for the entire tree are available (Supplementary Data 14–23).

For each floral trait, we tested and compared at least two distinct Markov models of discrete character evolution in our ML analyses: the equal rates (ER) or Mk model59, which assumes a single rate of transition among all possible states, and the all rates different (ARD) or AsymmMk model60,61, which allows a distinct rate for each possible transition between two states. In addition, we tested two unidirectional models for all binary characters (UNI01 and UNI10: rates from 1 to 0 or 0 to 1, respectively, set to zero)52,62, a symmetrical model for all multistate characters (SYM: rates equal for transitions between two given states), and three ordered models for all multistate characters derived from quantitative variables (ORD: rates between non-adjacent states set to zero; ORDSYM: symmetrical version; ORDER: single-rate version). Initial tests showed that for some characters, the prior on the root state could affect results in terms of both transition rates and ancestral states62. Therefore, we systematically tested both inferences using flat priors32,63 (equal probability for all states, the default option in most R packages) and a prior with root state frequencies same as equilibrium64 (we denote such variants with the ‘eq’ suffix, for example, ARDeq is the implementation of the ARD model with equilibrium root prior), for all models except ER (equilibrium=equal frequencies) and the unidirectional models (root state implied by the model). Although the ARD model might seem more realistic than the more restrictive variants listed above, it may be very difficult to estimate all transition rates accurately, especially for multistate characters. Thus, we tested the fit of these models using the Akaike Information Criterion corrected for sample size, which allowed us to select the model that best fits the data while minimizing the number of parameters65. We here report the ML results from the best-fit model.

Bayesian ancestral state reconstruction analyses allowed us to explore three sources of uncertainty not accounted for in ML analyses: transition rate uncertainty, phylogenetic uncertainty and dating uncertainty57. In addition, the rjMCMC approach allowed us to explore model uncertainty56. This approach is particularly useful where model space is very large, such as for multistate discrete characters (see Supplementary Methods). Each rjMCMC analysis was run in BayesTraits for 10M generations, sampling parameters and ancestral states for 15 key nodes every 100 generations, and starting with an exponential hyperprior with a mean on a uniform interval from 0 to 1. Apparent stationarity was checked in Tracer 1.6 (ref. 66). Discarding the first 1M generations as burn-in was sufficient for all analyses and effective sample size values were nearly always very high (above 200), except for a few particular traits characterized by frequent jumps of the chain between very different models.

Here, we report the results from these three analyses at each focal node in the form of the most parsimonious state(s), the most likely state (that is, with highest marginal likelihood), and the state with highest mean probability, respectively (Supplementary Data 1). For the latter (Bayesian rjMCMC), we also report the 95% CI for the probability of the state. In several cases, these CIs are very wide, with probabilities ranging from ca. 0 to 1. Such intervals indicate strong uncertainty in ancestral state reconstructions, where MP and ML can be misleading in showing artificial precision and confidence in the reconstructed ancestral state. For this reason, we refer mostly to the rjMCMC results in this paper and call for caution in interpretation of our results where CIs are very wide. However, for most traits, nodes and trees, the three approaches reconstructed the same ancestral state and rjMCMC CIs were narrow (Supplementary Data 1 and Supplementary Discussion).

Correlation analyses

As flowers are highly complex and integrated structures, floral traits are unlikely to evolve independently from one another25,26,27,28,29,30. Although our main goal was not to evaluate the level of morphological integration in flowers, it is possible that such correlations might impact ancestral state reconstructions. However, in contrast to recently developed multivariate approaches for continuous characters67,68,69, no comparative method exists yet to account for the potential correlation of more than two discrete characters, unless a drastic simplification of model space is made25. In addition, correlated models and analyses have typically been developed for binary characters56,60. The stochastic mapping approach to correlation tests allows inclusion of multistate characters, but does not model character correlation and starts at the outset by reconstructing ancestral states independently at all nodes70; it was thus not relevant to our specific objective here. Therefore, we tested correlations among all possible pairs of binary floral traits in our data set. To do so, we first removed redundancies for multiple versions of the same character (Supplementary Methods), and then transformed all multistate characters into binary characters by maintaining the hypothesized ancestral state for the angiosperms as one state and pooling the remaining states as another (for example, for the number of perianth whorls, we analysed one-two whorls versus more than two whorls). We thus obtained a new set of 22 presumably independent characters and analysed all 231 pairwise correlations among these characters (Table 1). Given our observation that reconstructed ancestral states in the single-trait analyses were remarkably consistent across the 10 series of phylogenetic trees (see Supplementary Discussion), we conducted all of our correlation analyses using the C series of trees, which best reflects the current consensus on higher-level angiosperm phylogeny and allows us to take into account phylogenetic uncertainty.

As for our single-trait analyses, we used both an ML and a Bayesian rjMCMC approach to test for correlations and their impact on reconstructed ancestral states, using again the rayDISC function of corHMM 1.18 (ref. 55) in R54 for ML analyses and BayesTraits 2 (ref. 58) for rjMCMC analyses. The ML approach allowed us to test the fit of a small set of combined Markov models (that is, with 4 × 4 Q matrices to model all possible transitions among the four possible combined states, excluding dual transitions), including correlated (dependent) and uncorrelated (independent) models60. Specifically, for each character pair, we fitted four correlated models (ARDnodual, ARDnodualeq, differing only in the root state prior: see above; SYMnodual, SYMnodualeq) and three uncorrelated models (ERnodual, UNCORRnodual, UNCORRnodualeq; UNCORRnodual corresponds to the most general, 4-parameter ‘independent’ model from ref. 56). Using Akaike Information Criterion corrected for sample size, we selected the best-fit model and compared the ancestral combined states reconstructed with those obtained in our single-trait analyses (Supplementary Data 2). As a measure of support for correlation, we report the cumulative Akaike weight of correlated models (Table 1). The rjMCMC approach allowed us to explore the vast space of the 21,146 possible Markov combined models for the evolution of two binary characters, sampling models according to their posterior probability56, with settings as above (10M generations, sampling every 100 generations). As for single-trait analyses, the ancestral states reconstructed using this approach integrate over model, parameter, tree and dating uncertainty, as measured by the CIs associated with the probability (proportional likelihood) of each state (Supplementary Data 2). Support for correlation is here measured by the Bayes Factor comparing the dependent models to the independent models, rewritten as the ratio of the posterior to the prior odds of the two models56: BF DI =[P(M D |D)/P(M I |D)]/[(21146−51)/51], where P(M D |D) and P(M I |D) are the sampling frequencies of dependent and independent models, respectively.

Data availability

Summary (MCC) BEAST trees are provided as Supplementary Data 3–12 and a complete list of morphological data records and references (extracted from PROTEUS) is provided as Supplementary Data 13. Additional trees and data files are available from the authors on request.