This section addresses the choice of whole-genome information, the overall features of a proteome tree of fungi and protists, the protistan origin of Microsporidia, and other notable differences and similarities between the proteome tree and current gene trees.

Since there are no a priori criteria for the best descriptor with which to build the organism phylogeny, we took an empirical approach to find the best one among three types of descriptors available in the public databases: whole-genome DNA sequence, transcriptome sequence, and proteome sequence. In addition, the “optimal feature length” of each of the three descriptors, the critical parameter of the FFP method that gives the most stable tree topology, was also determined empirically using the Robinson-Foulds metric (13) in the PHYLIP package (14). The results of these empirical searches showed that the proteome tree is the most topologically stable of the three genome trees (described in more detail in Materials and Methods). Various features of the proteome tree of the Fungi kingdom are described below and compared with those of the gene trees based on various selected gene sets.
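The descriptor comparison described above can be illustrated with a minimal Python sketch. It is not the authors' FFP implementation: it assumes that a feature frequency profile is the set of relative frequencies of all overlapping substrings of length l, that two profiles are compared by the Jensen-Shannon divergence (the divergence named in the figure legends), and that topological stability is judged by counting the clades found in only one of two trees. The function names (`ffp`, `jsd`, `rf_distance`) and the toy sequences are hypothetical, and `rf_distance` works on rooted clade sets, a simplification of the bipartition-based symmetric difference computed for unrooted trees.

```python
from collections import Counter
from math import log2


def ffp(sequences, l):
    """Relative frequencies of all overlapping length-l features in the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - l + 1):
            counts[seq[i:i + l]] += 1
    total = sum(counts.values())
    if total == 0:  # every sequence was shorter than l
        return {}
    return {feature: n / total for feature, n in counts.items()}


def jsd(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse profiles."""
    def kl_to_midpoint(a, b):
        # Kullback-Leibler divergence of a from the midpoint (a + b) / 2
        s = 0.0
        for feature, pa in a.items():
            m = 0.5 * (pa + b.get(feature, 0.0))
            s += pa * log2(pa / m)
        return s
    return 0.5 * kl_to_midpoint(p, q) + 0.5 * kl_to_midpoint(q, p)


def rf_distance(clades_a, clades_b):
    """Number of clades present in exactly one of two rooted trees."""
    return len(set(clades_a) ^ set(clades_b))


# Toy "proteomes" (made-up sequences, for illustration only)
proteome_x = ["MKVLAAGIVK", "MKTAYIAKQR"]
proteome_y = ["MKVLSAGIVR", "MSTAYLAKQR"]

# Divergence between the two toy proteomes at several feature lengths
for l in (1, 2, 3):
    d = jsd(ffp(proteome_x, l), ffp(proteome_y, l))
    print(f"l={l}: JSD={d:.3f}")

# Two toy rooted topologies on leaves A-D, each given as a set of clades
tree_1 = {frozenset("AB"), frozenset("ABC")}
tree_2 = {frozenset("BC"), frozenset("ABC")}
print(rf_distance(tree_1, tree_2))
```

In a search of this kind, one tree would be built per feature length from the pairwise divergence matrix, and the feature length beyond which the distance between successive trees stays small would be taken as the most stable one. The tree-building step itself is omitted here.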

Although the member compositions of the groups at the level next to the deepest level of divergence in the proteome tree are similar to those in the gene trees, the branching orders of some of the groups are different, and more so at higher branching levels (compare Fig. S1 A and B; also, see below). Fig. 1 shows the clade membership and branching order of the proteome tree, and Fig. S1A shows the taxon identifications of the fungi and protists in the tree. The statistical support calculated by Jackknife Monophyly Indices (15) and the relative branch lengths for various clades are shown in a simplified tree (Fig. 2).

Microsporidia has been assigned as the basal group of all fungi in most gene trees (e.g., Fig. S1B). Surprisingly, in the proteome tree, the group is placed among the nonfungal unicellular eukaryotic organisms of paraphyletic protists “Protozoa” (marked “(a)” in Figs. 1 and 2 and Fig. S1A; see below for details).

Comparison of the proteome tree and a gene tree of the Fungi kingdom. Lists of taxonomic identifications (taxonIDs) are included in the linear trees. The corresponding names of organisms can be found in Table S1. (A) The proteome tree represented in a linear form, corresponding to the tree of the circular form in Fig. 1. The branch lengths calculated by JSD (33) are scaled to 1,000 (which corresponds to 500 from the common ancestor of fungi and protists to the terminal leaves) and shown above the branch lines. All of the Jackknife Monophyly Index values are 1.00, except those shown below the branch lines. (B) A gene tree of Fungi downloaded from MycoCosm of the JGI fungal portal (36, 37). The colored bars in the gene tree indicate the fungal organisms used in our proteome tree of Fig. 1 and A, and uncolored regions are the fungi whose genome sequences had not been released to the public at the time of our study and thus were not used in the proteome tree. The clades of Ascomycota, Basidiomycota, and Monokaryotic fungi (any fungi other than Ascomycota and Basidiomycota) were each constructed independently using the “FastTree” program and then joined manually (see genome.jgi.doe.gov/ext-api/mycocosm/clustering/clm/r/fungi.42/all/2014_Mycocosm_All-Fungi_tree.png).

Simplified proteome tree of Fungi and Protozoa. The figure shows the proteome tree collapsed at the phylum or equivalent levels, with the relative branch lengths from the common ancestor of a clade to its previous common ancestor. (The branch lengths for the two outgroups and uncollapsed species are not shown.) For the statistical support of the collapsed groups, the Jackknife Monophyly Index ( 5 ) for each collapsed clade (except the two outgroups) is shown under the branch lines. The branch lengths calculated by JSD are normalized to 1,000 (the scale on top), which corresponds to 500 from the common ancestor of fungi and protists to the terminal leaves. The number of members in a clade is indicated at the end of the clade name, and the four groups marked by lowercase letters in parentheses are those whose placements in the proteome tree are significantly different from those in the gene trees, as discussed in Results. The clade colors correspond to those in Fig. 1. For the identities of the outgroups, see Materials and Methods. The tree was constructed using ITOL (43).

A Circos (topological) representation of the proteome tree of the Fungi kingdom. The branches of the three major groups are colored light green for group I (Monokaryotic fungi), red for group II (Basidiomycota), and purple for group III (Ascomycota). All protists are in blue. The branches of the two sets of outgroups are in black. The names of the nine phylum-level groups belonging to the three major groups are shown around the circle. The four groups marked by lowercase letters in parentheses and drawn with dotted branches are those whose placements in the proteome tree are significantly different from those in the gene trees, as discussed in Results. The taxon identification numbers can be found in Fig. S1A, and their taxon names can be found in Table S1. For the identities of the outgroups, see Materials and Methods. The branch lengths are relative and not to scale. The figure was prepared using the Interactive Tree of Life (ITOL) (43).

In contrast to the four (Ascomycota, Basidiomycota, Monokarya, and Microsporidia) to eight (Glomeromycota, Zygomycota, Basidiomycota, Ascomycota, Chytridiomycota, Neocallimastigomycota, Blastocladiomycota, and Microsporidia) major groups of the Fungi kingdom in the gene trees (3) (Fig. S1B), there are only three (Monokarya, Basidiomycota, and Ascomycota) earliest diverging and deepest branching major fungal groups in the proteome tree (Figs. 1 and 2 and Fig. S1A). The first major group (group I in Figs. 1 and 2) corresponds to Monokaryotic fungi and consists of three subgroups that do not appear to produce dikaryons during their life cycle: Cryptomycota, Chytridiomycota, and Zygomycota. The second major group (group II) corresponds to Basidiomycota, which are dikaryon-producing fungi whose sexual spores are formed externally on small-pedestal fruiting bodies called basidia, and consists of Pucciniomycotina, Ustilaginomycotina, and Agaricomycotina. The third major group (group III) corresponds to Ascomycota, which are dikaryon-producing fungi whose sexual spores are formed internally inside sacs called “asci” on top of fruiting bodies, and consists of Taphrinomycotina, Saccharomycotina, and Pezizomycotina. The three major groups appear to have branched out almost simultaneously from the common ancestor of all fungi (Fig. 2).

Protistan Origin of Microsporidia.

The Microsporidia are a eukaryotic group of spore-forming, unicellular, obligate parasites of a very wide range of animal hosts, including humans. Several thousand species have been named, suggesting that there may be more than an order of magnitude more unnamed Microsporidia species in nature. Individual Microsporidia species usually infect one host species or a group of closely related taxa. They have very small genomes, and the gene trees place the group at or near the basal position of all fungi (e.g., Fig. S1B).

Although supporters of the fungal origin of Microsporidia have been gaining ground rapidly among mycologists, alternative origins cannot be ruled out completely. It has been difficult to infer the evolutionary history of Microsporidia because its position in the gene trees shifts depending on the genes selected to build them, and because the evolutionary narratives offered to explain these shifts are based on comparative genome sequences and biochemical data (for a review, see ref. 16). To interrogate the boundary between the fungal kingdom and protists (large, diverse, and paraphyletic/polyphyletic, unicellular, nonfungal microbial eukaryotes) and also to revisit the fungal origin of Microsporidia, a group of 71 protists for which genome sequences are available was included in this study.

In the proteome tree constructed for a population containing both fungi and protists, as in the gene trees, all members of Microsporidia in the study form a single clade, suggesting that they most likely evolved from a common ancestor. However, the clade is not located with other fungi, as it is in the gene trees, but among the protists, such as Giardia, Trichomonas, Entamoeba, and Trypanosomatidae, some of which, like the Microsporidia, also lack or have lost mitochondria but have much larger genomes than Microsporidia (Figs. 1 and 2 and Fig. S1A). This observation indicates that the proteome sequences of Microsporidia are more similar to those of the protists than to those of fungi. The current narrative is that the very small genomes of Microsporidia have resulted from one or more steps of extreme reduction of much larger genomes of fungal origin (16). Most of the “evidence” for this narrative is based on the sequence similarity of the proteins coded by one or a limited number of genes (4, 5, 17, 18). However, the proteome tree suggests another narrative: that the genomes of the Microsporidia may have a protistan rather than a fungal origin [marked “(a)” in Figs. 1 and 2 and Fig. S1] and have gone through similar extreme genomic reduction.

The protistan origin of Microsporidia was first shown by Vossbrinck et al. (19) in their gene tree built using the DNA sequence of a small subunit ribosomal RNA gene, where they placed Microsporidia at the basal position of all eukaryotes they tested (including a few animals, plants, fungi, and protists). However, this proposal was “overturned” by the now-popular fungal origin of Microsporidia based on subsequent gene trees of certain protein-coding genes and narratives derived from biochemical and cellular observations, including the absence or loss of mitochondria, which is not critical to the fungal or protistan origin of Microsporidia (ref. 16 and references within). There was another gene-tree-based indication of the grouping of Microsporidia at the basal position of all other eukaryotes. In that study, Thomarat et al. (20) built 99 gene trees from very carefully selected protein sequences of Encephalitozoon cuniculi, the first microsporidian with a known genome sequence. As shown in table 1 of ref. 20, a majority of their trees built by the BIONJ method (Materials and Methods), 80 of 99, placed E. cuniculi at the basal position of all eukaryotes (animals, fungi, and plants), thus presumably among protists, whereas the rest of the gene trees placed it at the basal position of fungi (13 of 99), of fungi and animals (4 of 99), or of animals (1 of 99), or at a “nonbasal” position (1 of 99). However, Thomarat et al. discounted their majority result supporting the protistan origin and adopted the minority result supporting the fungal origin, arguing that the minority (13 of 99) genes have a slower relative rate of evolutionary change. These observations, combined with the proteome tree, support the protistan origin of Microsporidia rather than the currently popular view of the fungal origin of Microsporidia.
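The placement counts quoted above from table 1 of ref. 20 can be tallied with a few lines of Python to confirm that they account for all 99 gene trees and to express each as a share of the total. This is only an arithmetic check on the numbers given in the text; the dictionary labels are shorthand introduced here, not terms from ref. 20.

```python
# Placement of E. cuniculi in the 99 BIONJ gene trees, as quoted in the text
placements = {
    "basal to all eukaryotes": 80,
    "basal to fungi": 13,
    "basal to fungi and animals": 4,
    "basal to animals": 1,
    "nonbasal": 1,
}

total = sum(placements.values())  # should account for every gene tree
shares = {label: n / total for label, n in placements.items()}

for label, n in placements.items():
    print(f"{label}: {n} of {total} ({shares[label]:.1%})")
```

On these counts, roughly four of five gene trees place E. cuniculi at the base of all eukaryotes, and about one in eight place it at the base of fungi.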

A more detailed picture of the kinship of Microsporidia among the various clades of protists awaits whole-genome sequences of many more protists from diverse taxa, since all 71 protists in this study population are from Protozoa, a subgroup of nonphotosynthetic protists.