Abstract The diversity of life is one of the most striking aspects of our planet; hence knowing how many species inhabit Earth is among the most fundamental questions in science. Yet the answer to this question remains enigmatic, as efforts to sample the world's biodiversity to date have been limited and thus have precluded direct quantification of global species richness, and because indirect estimates rely on assumptions that have proven highly controversial. Here we show that the higher taxonomic classification of species (i.e., the assignment of species to phylum, class, order, family, and genus) follows a consistent and predictable pattern from which the total number of species in a taxonomic group can be estimated. This approach was validated against well-known taxa, and when applied to all domains of life, it predicts ∼8.7 million (±1.3 million SE) eukaryotic species globally, of which ∼2.2 million (±0.18 million SE) are marine. In spite of 250 years of taxonomic classification and over 1.2 million species already catalogued in a central database, our results suggest that some 86% of existing species on Earth and 91% of species in the ocean still await description. Renewed interest in further exploration and taxonomy is required if this significant gap in our knowledge of life on Earth is to be closed.

Author Summary How many species there are on Earth is one of the most basic yet elusive questions in science. Unfortunately, obtaining an accurate number is constrained because most species remain to be described and because indirect attempts to answer this question have been highly controversial. Here, we document that the taxonomic classification of species into higher taxonomic groups (from genera to phyla) follows a consistent pattern from which the total number of species in any taxonomic group can be predicted. Assessment of this pattern for all kingdoms of life on Earth predicts ∼8.7 million (±1.3 million SE) species globally, of which ∼2.2 million (±0.18 million SE) are marine. Our results suggest that some 86% of the species on Earth, and 91% in the ocean, still await description. Closing this knowledge gap will require a renewed interest in exploration and taxonomy, and a continuing effort to catalogue existing biodiversity data in publicly available databases.

Citation: Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How Many Species Are There on Earth and in the Ocean? PLoS Biol 9(8): e1001127. https://doi.org/10.1371/journal.pbio.1001127
Academic Editor: Georgina M. Mace, Imperial College London, United Kingdom
Received: November 12, 2010; Accepted: July 13, 2011; Published: August 23, 2011
Copyright: © 2011 Mora et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Funding was provided by the Sloan Foundation through the Census of Marine Life Program, Future of Marine Animal Populations project. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.

Discussion The total number of species has long been a question of great interest, motivated in part by our collective curiosity about the diversity of life on Earth and in part by the need to provide a reference point for current and future losses of biodiversity. Unfortunately, incomplete sampling of the world's biodiversity combined with a lack of robust extrapolation approaches has yielded highly uncertain and controversial estimates of how many species there are on Earth. In this paper, we describe a new approach whose validation against existing inventories and explicit statistical nature add greater robustness to the estimation of the number of species of given taxa. In general, the approach was reasonably robust to various caveats, and we hope that future improvements in data quality will further diminish problems with synonyms and incompleteness of data, and lead to even better (and likely higher) estimates of global species richness. Our current estimate of ∼8.7 million species narrows the range of 3 to 100 million species suggested by taxonomic experts [1], and it suggests that after 250 years of taxonomic classification only a small fraction of species on Earth (∼14%) and in the ocean (∼9%) have been indexed in a central database (Table 2). Closing this knowledge gap may take far longer still. Assuming that the rate of description of eukaryote species over the last 20 years (i.e., 6,200 species per year; ±811 SD; Figure 3F–3J), the average number of new species described per taxonomist's career (i.e., 24.8 species [30]), and the estimated average cost to describe animal species (i.e., US$48,500 per species [30]) remain constant and are general among taxonomic groups, describing Earth's remaining species may take as long as 1,200 years and would require 303,000 taxonomists at an approximate cost of US$364 billion.
With extinction rates now exceeding natural background rates by a factor of 100 to 1,000 [31], our results also suggest that this slow advance in the description of species will lead to species becoming extinct before we know they even existed. High rates of biodiversity loss provide an urgent incentive to increase our knowledge of Earth's remaining species. Previous studies have indicated that current catalogues of species are biased towards conspicuous species with large geographical ranges, body sizes, and abundances [4],[32]. This suggests that the bulk of species that remain to be discovered are likely to be small-ranged and perhaps concentrated in hotspots and less explored areas such as the deep sea and soil; although their small body-size and cryptic nature suggest that many could be found literally in our own “backyards” (after Hawksworth and Rossman [33]). Though remarkable efforts and progress have been made, a further closing of this knowledge gap will require a renewed interest in exploration and taxonomy by both researchers and funding agencies, and a continuing effort to catalogue existing biodiversity data in publicly available databases.
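The time, workforce, and cost figures quoted in this Discussion can be approximately retraced with simple arithmetic. The following Python sketch is an illustration only: it uses the rounded values given in the text (8.7 million predicted and 1.2 million catalogued species), and the function and variable names are hypothetical. Because the inputs are rounded, the results land close to, but not exactly on, the quoted values.

```python
# Rounded figures quoted in the text (eukaryotes only).
PREDICTED_SPECIES = 8_700_000    # predicted global species richness
CATALOGUED_SPECIES = 1_200_000   # "over 1.2 million" catalogued (rounded down)
DESCRIPTIONS_PER_YEAR = 6_200    # mean description rate over the last 20 years
SPECIES_PER_CAREER = 24.8        # mean species described per taxonomist career [30]
COST_PER_SPECIES_USD = 48_500    # mean cost to describe an animal species [30]


def description_effort(predicted, catalogued, per_year, per_career, cost_each):
    """Return the remaining species and the effort needed to describe them,
    assuming all rates stay constant and apply to every taxonomic group."""
    remaining = predicted - catalogued
    return {
        "remaining": remaining,
        "undescribed_fraction": remaining / predicted,
        "years": remaining / per_year,
        "taxonomists": remaining / per_career,
        "cost_usd": remaining * cost_each,
    }


effort = description_effort(
    PREDICTED_SPECIES,
    CATALOGUED_SPECIES,
    DESCRIPTIONS_PER_YEAR,
    SPECIES_PER_CAREER,
    COST_PER_SPECIES_USD,
)
for key, value in effort.items():
    print(f"{key}: {value:,.2f}")
```

With these rounded inputs the sketch gives about 1,210 years, about 302,000 taxonomist careers, and about US$364 billion, in line with the estimates above.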

Materials and Methods
Databases Calculations of the number of species on Earth were based on the classification of currently valid species from the Catalogue of Life (www.sp2000.org, [34]), and the estimates for species in the ocean were based on the World Register of Marine Species (www.marinespecies.org, [35]). The latter database is largely contained within the former. These databases were screened for inconsistencies in the higher taxonomy, including homonyms and the classification of taxa into multiple clades (e.g., ensuring that all diatom taxa were assigned to “Chromista” and not to “plants”). The Earth's prokaryotes were analyzed independently using the most recent classification available in the List of Prokaryotic Names with Standing in Nomenclature database (http://www.bacterio.cict.fr). Additional information on the year of description of taxa was obtained from the Global Names Index database (http://www.globalnames.org). We used data only up to 2006 to prevent artificial flattening of accumulation curves due to recent discoveries and descriptions not yet being entered into databases.
Statistical Analysis To account for higher taxa yet to be discovered, we used the following approach. First, for each taxonomic rank from phylum to genus, we fitted six asymptotic parametric regression models (i.e., negative exponential, asymptotic, Michaelis-Menten, rational, Chapman-Richards, and modified Weibull [23]) to the temporal accumulation curve of higher taxa (Figure 1A–1E) and used multimodel averaging based on the small-sample-size corrected version of Akaike's Information Criterion (AICc) to predict the asymptotic number of taxa (dotted horizontal line in Figure 1A–1E) [23]. Ideally, data should be modeled using only the decelerating part of the accumulation curve [22]–[23]; however, there was frequently no obvious breakpoint at which accumulation curves switched from an increasing to a decelerating rate of discovery (Figure 1A–1E).
Therefore, we fitted models to data starting at all possible years from 1758 onwards (data before 1758 were added as an intercept to prevent a spike due to Linnaeus) and selected the model predictions if at least 10 years of data were available and if five of the six asymptotic models converged on the subset of data. Then, the estimated multimodel asymptotes and standard errors for each selected year were used to estimate a consensus asymptote and its standard error. In this approach, the multimodel asymptotes for all cut-off years selected and their standard errors are weighted proportionally to their standard error, thus ensuring that the uncertainty both within and among predictions was incorporated [36]. To estimate the number of species in a taxonomic group from its higher taxonomy, we used Least Squares Regression models to relate the consensus asymptotic number of higher taxa against their numerical rank, and then used the resulting regression model to extrapolate to the species level (Figure 1G). Since data are not strictly independent across hierarchically organized taxa, we also used models based on Generalized Least Squares assuming autocorrelated regression errors. Both types of models were run with and without the inverse of the consensus estimate variances as weights to account for differences in certainty in the asymptotic number of higher taxa. We evaluated the fit of exponential, power, and hyperexponential functions to the data and obtained a prediction of the number of species by multimodel averaging based on AICc of the best type of function. The hyperexponential function was chosen for kingdoms, whereas the exponential function was used for the smaller groups in the validation analysis (see comparison of fits in Figure S4).
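As a rough illustration of the final extrapolation step, the following Python sketch fits exponential and power functions to the number of higher taxa against numerical rank by ordinary least squares on log-transformed counts, weights the two fits by AICc, and extrapolates to the species level. It is a sketch under stated assumptions, not the code used in this study: the taxon counts are hypothetical placeholders, and the rank coding (phylum = 1 through species = 6), the fitting on the log scale, the parameter count k = 3, the averaging of predictions on the log scale, and all function names are assumptions. The hyperexponential function, the Generalized Least Squares fits, and the variance weights are omitted for brevity.

```python
import math

# Hypothetical asymptotic numbers of higher taxa (NOT values from the study).
# Assumed rank coding: phylum=1, class=2, order=3, family=4, genus=5, species=6.
RANKS = [1, 2, 3, 4, 5]
TAXA = [30, 100, 400, 1500, 5000]
SPECIES_RANK = 6


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def aicc(rss, n, k=3):
    """Small-sample corrected AIC for a least-squares fit.
    k counts intercept, slope and error variance (an assumption)."""
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


log_taxa = [math.log(t) for t in TAXA]
n = len(RANKS)

# Each candidate is a straight line in log(taxa) after transforming rank.
candidates = {
    "exponential": lambda r: float(r),   # log N = a + b * rank
    "power": lambda r: math.log(r),      # log N = a + b * log(rank)
}

scores, log_predictions = {}, {}
for name, transform in candidates.items():
    a, b, rss = fit_line([transform(r) for r in RANKS], log_taxa)
    scores[name] = aicc(rss, n)
    log_predictions[name] = a + b * transform(SPECIES_RANK)

# Akaike weights: relative support for each candidate function.
best = min(scores.values())
raw = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
total = sum(raw.values())
weights = {name: w / total for name, w in raw.items()}

# Weighted average of the log-scale predictions, back-transformed to counts.
averaged_log = sum(weights[m] * log_predictions[m] for m in weights)
predicted_species = math.exp(averaged_log)
print(weights, round(predicted_species))
```

Both candidate functions share the same response variable (the logarithm of the number of taxa), which is what makes their AICc values directly comparable in this sketch.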
Survey of Taxonomists We contacted 4,771 taxonomy experts with electronic mail addresses as listed in the World Taxonomist Database (www.eti.uva.nl/tools/wtd.php); 1,833 of these addresses were faulty, so about 2,938 experts received our request, of whom 548 responded to our survey (response rate of 18.7%). Respondents were asked to identify their taxon of expertise, and to comment on what percentage of currently valid names could be synonyms at taxonomic levels from species to kingdom. We also polled taxonomists about whether the taxonomic effort (measured as numbers of professional taxonomists) in their area of expertise in recent times was increasing, decreasing, or stable.

Acknowledgments We thank David Stang, Ward Appeltans, the Catalogue of Life, the World Register of Marine Species, the List of Prokaryotic Names with Standing in Nomenclature, the Global Names Index databases, the World Taxonomist Database, and all their constituent databases and countless contributors for making their data freely available. We also thank the numerous respondents to our taxonomic survey for sharing their insights. Finally, we are indebted to Stuart Pimm, Andrew Solow, and Catherine Muir for helpful and constructive comments on the manuscript and to Philippe Bouchet, Frederick Grassle, and Terry Erwin for valuable discussion.

Author Contributions The author(s) have made the following declarations about their contributions: Conceived and designed the experiments: CM DPT BW. Analyzed the data: CM DPT. Wrote the paper: CM DPT SA AGBS BW. Reviewed higher taxonomy: CM SA AGBS.