The phylogenetic relationships of several hominin species remain controversial. Two methodological issues contribute to the uncertainty: use of partial, inconsistent datasets and reliance on phylogenetic methods that are ill-suited to testing competing hypotheses. Here, we report a study designed to overcome these issues. We first compiled a supermatrix of craniodental characters for all widely accepted hominin species. We then took advantage of recently developed Bayesian methods for building trees of serially sampled tips to test among hypotheses that have been put forward in three of the most important current debates in hominin phylogenetics: the relationship between Australopithecus sediba and Homo, the taxonomic status of the Dmanisi hominins, and the place of the so-called hobbit fossils from Flores, Indonesia, in the hominin tree. Based on our results, several published hypotheses can be statistically rejected. For example, the data do not support the claim that Dmanisi hominins and all other early Homo specimens represent a single species, nor that the hobbit fossils are the remains of small-bodied modern humans, one of whom had Down syndrome. More broadly, our study provides a new baseline dataset for future work on hominin phylogeny and illustrates the promise of Bayesian approaches for understanding hominin phylogenetic relationships.

1. Introduction

Determining humanity's place in nature has long been an important scientific challenge [1]. As a result of the genetic revolution and the development of formal methods of phylogenetic analysis, the relationship of our tribe (Hominini) to the living apes has been clarified: it is now accepted that hominins are most closely related to chimpanzees and bonobos (the panins), and that hominins and panins are more closely related to gorillas than to orangutans [2,3]. By contrast, there remains considerable debate about the relationships among the 20 or so species of hominin. While some relationships seem settled, others continue to be debated—vigorously in some cases [4].

Two methodological issues contribute to the uncertainty. One is inconsistency among datasets. Most studies have focused on either early hominins or later hominins (e.g. Kimbel et al. [5] versus Martinón-Torres et al. [6]). Few analyses have included taxa that span the whole period of human evolution. In addition, different studies use different datasets. All studies have relied heavily on craniodental characters, but there is little agreement beyond that (e.g. Strait & Grine [7] versus Zeitoun [8]). The other issue concerns how the datasets are analysed. To date, researchers have relied on parsimony methods to analyse hominin relationships. These methods are useful for generating trees but are not well suited to comparing alternative trees. As a consequence, there have been few attempts to formally evaluate the relative support for competing hypotheses.

Here, we report a study of hominin relationships that was designed with both of these issues in mind. We first compiled a ‘supermatrix’ [9] that includes data for all widely accepted hominin species by collating data from 13 studies [5–8,10–18]. Using a set of principles to reconcile among-study coding differences (see Material and methods), we amassed scores for 380 craniodental characters for 20 hominin species that span the entire 7 Myr history of our lineage. We also included data for two outgroups: Pan troglodytes and Gorilla gorilla. To the best of our knowledge, the supermatrix is the largest qualitative character dataset ever assembled for the hominins.

Subsequently, we tested phylogenetic hypotheses with the supermatrix and a Bayesian method for joint estimation of the relationships of living and dated fossil taxa [19–23]. Bayesian phylogenetic inference estimates the posterior probability distributions of a phylogeny and set of model parameters, given the data and a model of evolution (see electronic supplementary material S1). Competing phylogenetic hypotheses were converted into partially constrained trees with fossil species as dated tips, and the relative support for these tree models was assessed with Bayes factors, which compare the marginal likelihoods of two sets of partially constrained trees [22,24]. Including dated tips is advantageous because it constrains the search space and allows for more robust estimates of the rate of evolution [23]. Importantly for the analysis of fossil hominin taxa, most of which cannot be coded for all characters, ambiguity due to missing data leads to low Bayes factors, which indicate that the data cannot differentiate between tree models.

We used Bayes factors to assess the support for the competing hypotheses that have been put forward in three important controversies concerning hominin relationships. The first focuses on whether the recently discovered species Australopithecus sediba is the ancestor of the genus Homo [11,16]. The second concerns the systematics of the fossil hominins from the site of Dmanisi, Georgia [17,25–30]. The third is whether the so-called ‘hobbit’ fossils from Liang Bua, Indonesia, represent a distinct hominin species, and if they do, from which lineage they are descended [31]. For each controversy, we converted the hypotheses into tree models (see electronic supplementary material, figure S1) and then compared the tree models’ marginal likelihoods. Only the relationships of the focal taxa were constrained to conform to their respective hypotheses; other species were allowed to move freely.

Our analyses show that several of the hypotheses that have been put forward regarding Au. sediba, the Dmanisi hominins and the hobbits can be decisively rejected based on the available fossil evidence and the model of evolution employed. More broadly, our study provides a new baseline dataset for future work on hominin phylogeny and illustrates the promise of Bayesian approaches for understanding hominin phylogenetic relationships.

2. Material and methods

(a) Morphological data

We created a supermatrix of craniodental characters that have been used to study hominin phylogeny [5–8,10–18]. From the original studies, we recorded the fossil specimens used, the character definitions, any measurements and the character states assigned to species. In some studies, character states were reported for individual specimens. In these cases, we adopted a 66% majority rule to code the characters (following [15]). If at least 66% of the specimens of a given species exhibited a certain character state, that state was assigned to the species. Otherwise, the species was coded as polymorphic for the character.
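The majority-rule coding step can be illustrated with a minimal Python sketch. It is not the procedure's original implementation: the function name `code_species_state`, the treatment of unscored specimens (ignored), the reading of the threshold as "at least 66%", and the choice to list every observed state for a polymorphic species are all assumptions made for illustration.

```python
from collections import Counter


def code_species_state(specimen_states, threshold=0.66):
    """Return the set of states assigned to a species for one character.

    specimen_states: states scored on individual specimens; None marks a
    specimen that could not be scored (assumed here to be ignored).
    A single-element set is a fixed state, a larger set is polymorphic,
    and an empty set means no specimen could be scored.
    """
    observed = [s for s in specimen_states if s is not None]
    if not observed:
        return set()
    counts = Counter(observed)
    state, n = counts.most_common(1)[0]
    # Assumption: "66% of the specimens" is read as "at least 66%".
    if n / len(observed) >= threshold:
        return {state}
    # Otherwise the species is coded as polymorphic for the observed states.
    return set(counts)


print(code_species_state([0, 0, 0, 1]))  # three of four specimens share state 0
print(code_species_state([0, 1]))        # no state reaches the threshold
```

Returning a set keeps fixed and polymorphic codings in one representation, which makes later merging across studies a matter of set operations.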

We then concatenated the matrices. When the same character was used in multiple studies, the character state assessments from those studies were merged. In many cases, character scores were consistent across studies. However, when studies conflicted, we used the following criteria. First, we favoured assessments from studies that used larger samples of fossil specimens. Second, where studies differed, we preferred the more polymorphic designation for the taxon. Third, when the morphological feature was described using different numbers of character states in various studies, we preferred the simpler character scoring system. Lastly, when a conflict among studies could not be resolved based on the above criteria, the state assessments from the conflicted studies were combined and the taxon was coded as polymorphic for the character states in question. This approach to merging character matrices is conservative because it favours ambiguity wherever there is uncertainty in coding.

In total, we collected scores for 380 characters for 20 species of hominins and two outgroup species (P. troglodytes and G. gorilla).

(b) Geological dates

For each fossil species, the oldest date associated with the specimens providing the morphological data was used. Thus, the dates may not necessarily correspond to the first appearance dates of the species. We dated tips in this manner to better link the scored character states with elapsed time. Dates used in this study are given in electronic supplementary material, table S1.

(c) Model selection

Because Bayesian methods of phylogenetic inference are model-based, we had to choose a model of evolutionary change prior to testing the competing hypotheses. Starting with a base Markov k-state model [32], which posits that characters switch among discrete states such that the probability of observing different states in a character is a truncated exponential function of time between observations, we evaluated several model parameters. Best-fit model parameters were identified with Bayes factors associated with the resulting trees. Here, a Bayes factor can be considered a measure of the strength of evidence in favour of one model over another, and is computed as twice the difference of the natural logs of the models' marginal likelihoods (see electronic supplementary material S1). Bayes factors are interpreted on the same scale as the log-likelihood ratio test [24]. Thus, a Bayes factor of 6 is regarded as ‘strong evidence’ [24]. It suggests that the better model fits the data more than 20 times better than the other model and is comparable to a p-value rejecting the alternative model of less than 0.02. The model selection procedure was carried out in the program MrBayes v. 3.2.3 [33] via the CIPRES Science Gateway v. 3.3 [34].
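The scale described above can be checked with a short Python sketch. The function names are hypothetical, and the p-value comparison assumes a chi-square reference distribution with one degree of freedom, which the text does not state explicitly.

```python
import math


def bayes_factor(ln_ml_a, ln_ml_b):
    """Twice the difference of the natural logs of two marginal likelihoods."""
    return 2.0 * (ln_ml_a - ln_ml_b)


def likelihood_ratio(bf):
    """Ratio of marginal likelihoods implied by a Bayes factor on the 2*ln scale."""
    return math.exp(bf / 2.0)


def chi2_1df_tail(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom
    (assumed reference distribution for the likelihood-ratio comparison)."""
    return math.erfc(math.sqrt(x / 2.0))


ratio = likelihood_ratio(6.0)  # exp(3), roughly 20
p = chi2_1df_tail(6.0)         # roughly 0.014, i.e. below 0.02
print(f"ratio = {ratio:.1f}, p = {p:.4f}")
```

Under these assumptions, a Bayes factor of 6 corresponds to a marginal likelihood ratio of about 20 and a tail probability of about 0.014.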

(i) Character sampling

With morphological data, characters chosen for analysis are normally those that are phylogenetically informative [33]. Characters with no change or change in only one species are often excluded. We corrected for this bias by calculating conditional likelihoods based only on phylogenetically informative characters [33]. The model with this correction was strongly preferred over the model in which no bias correction was implemented (BF = 761.02).

(ii) Rate variation

The craniodental characters used in this study potentially evolved at different rates. Among-character rate heterogeneity can be modelled by allowing different characters to have different evolutionary rates. Use of a gamma model for rate heterogeneity was strongly favoured over use of a model with no rate variation (BF = 11.58).

(iii) Clock rates

Because fossil species are treated as non-contemporaneous tips, the ages of the fossil specimens can be used to calibrate the rate of evolutionary change, resulting in branch lengths that are proportional to time [21]. There are several options for specifying how the time and rate of evolutionary changes are modelled. A strict clock assumes a constant rate of change throughout the tree [35]. Relaxed-clock models allow the rate of change to vary across the branches. With the autocorrelated relaxed clock [36], the rate of change evolves through time such that the descendant nodes evolve at a rate that is sampled from a distribution centred on the inferred rate of the ancestral branch. With the uncorrelated relaxed clock [37], the rate for each branch is sampled from a distribution specified by the user. In this study, rates were drawn from an exponential distribution. The uncorrelated relaxed-clock model was strongly preferred over the strict-clock model (BF = 62.88), and the uncorrelated relaxed-clock model was strongly preferred over the autocorrelated relaxed-clock model (BF = 33.26).
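The difference between the three clock models can be sketched in a few lines of Python. This is a toy illustration rather than the MrBayes implementation: the function names, the branch labels, and the use of a lognormal multiplier (median 1) for the autocorrelated clock are assumptions chosen for clarity.

```python
import random


def strict_clock(branches, rate):
    """Every branch shares one constant rate."""
    return {b: rate for b in branches}


def uncorrelated_clock(branches, mean_rate, rng):
    """Each branch rate is drawn independently from an exponential distribution."""
    return {b: rng.expovariate(1.0 / mean_rate) for b in branches}


def autocorrelated_clock(parent_of, root_rate, sigma, rng):
    """Each branch rate is drawn from a distribution centred on its parent's rate.

    parent_of maps branch -> parent branch (None for a branch attached to the
    root) and must list parents before their children. The lognormal
    multiplier is an assumption made for this sketch.
    """
    rates = {}
    for branch, parent in parent_of.items():
        base = root_rate if parent is None else rates[parent]
        rates[branch] = base * rng.lognormvariate(0.0, sigma)
    return rates


rng = random.Random(0)
tree = {"stem": None, "clade_A": "stem", "tip_A1": "clade_A", "tip_A2": "clade_A"}
print(strict_clock(tree, 0.5))
print(uncorrelated_clock(tree, 0.5, rng))
print(autocorrelated_clock(tree, 0.5, 0.2, rng))
```

In the strict case all branches receive the same rate, in the uncorrelated case neighbouring branches are unrelated, and in the autocorrelated case rates drift gradually from ancestor to descendant.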

(iv) Priors on node times

The use of a relaxed-clock model requires a prior distribution on node times. The uniform prior assumes that the time at a particular node has equal probability across the interval between the time of its parent node and its oldest daughter node. The birth–death prior assumes that lineages speciate and go extinct according to a stochastic process with parameters for speciation and extinction. The latter was strongly preferred over the former (BF = 11.12).

(d) Analyses

Having evaluated potential model parameters, we proceeded to the analysis of the craniodental data. Based on the results of the model parameter evaluation exercise, characters were modelled to evolve under a Markov k-state model with a gamma-distributed among-character rate variation, correcting for the sampling bias for parsimony-informative characters. Of the 380 characters, 281 were treated as unordered and 99 as ordered. The uncorrelated clock model was implemented to calibrate the tree with fossil hominins as non-contemporaneous tips, and the birth–death model was used as the prior on node times. The oldest dates associated with the specimens in the taxa were assigned as fixed ages for the terminal tips.

To generate a best estimate of hominin phylogeny, we performed four independent runs, each with 10 million Markov chain Monte Carlo (MCMC) generations. Each run consisted of one cold and three heated chains; the cold chain was sampled every 1000 generations. We assessed convergence of the four runs using MrBayes's convergence diagnostics. The first 25% of the sampled trees were discarded as burn-in.

Subsequently, sets of competing trees were constructed (electronic supplementary material, figures S2–S4). Stepping-stone sampling [38,39] was used to estimate the marginal likelihoods of the tree models. In each ‘step’, MCMC was conducted for 196 000 generations and samples were taken every 1000 generations. We used a total of 50 steps. The first step was discarded as burn-in, as were the first 49 000 generations from all subsequent steps. For each tree tested, we performed four independent runs, as per the preceding analysis. The marginal likelihoods were used to calculate Bayes factors in a series of tests in which the tree with the best marginal likelihood estimate was compared with the hypothetical trees in a pairwise manner.
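The idea behind stepping-stone sampling can be illustrated on a toy problem whose marginal likelihood is known exactly. The Python sketch below is not the MrBayes procedure used here: it assumes a normal model with a normal prior so that each power posterior can be sampled directly instead of by MCMC, and the data values, sample sizes and the exponent `alpha = 0.3` used to space the steps are illustrative assumptions.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def log_likelihood(theta, data):
    """Log-likelihood of data under N(theta, 1)."""
    n = len(data)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * sum((y - theta) ** 2 for y in data)


def stepping_stone(data, n_steps=50, n_samples=2000, alpha=0.3, seed=1):
    """Estimate the log marginal likelihood as a sum of log ratios between
    successive power posteriors (prior x likelihood**beta)."""
    rng = random.Random(seed)
    n, total = len(data), sum(data)
    betas = [(k / n_steps) ** (1.0 / alpha) for k in range(n_steps + 1)]
    log_ml = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        # With a N(0, 1) prior the power posterior at b_prev is itself normal.
        var = 1.0 / (1.0 + b_prev * n)
        mean = b_prev * total * var
        draws = [rng.gauss(mean, math.sqrt(var)) for _ in range(n_samples)]
        log_ml += log_mean_exp(
            [(b_next - b_prev) * log_likelihood(t, data) for t in draws]
        )
    return log_ml


def exact_log_marginal(data):
    """Closed-form log marginal likelihood for the same toy model."""
    n, total = len(data), sum(data)
    sq = sum(y * y for y in data)
    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(1 + n)
            - 0.5 * (sq - total ** 2 / (1 + n)))


data = [0.8, 1.3, 0.4, 1.9, 1.1]
print(stepping_stone(data), exact_log_marginal(data))
```

The two printed values should agree closely, which is the property that makes the stepping-stone estimate usable as an input to Bayes factor comparisons.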

The analyses were carried out in the program MrBayes [33] via the CIPRES Science Gateway v. 3.3 [34]. Additional background on the analyses is provided in electronic supplementary material S1.

3. Results and discussion

A summary of the best trees we obtained is presented in figure 1 (see also electronic supplementary material S2 and table S2). The tree captures most widely accepted relationships, and the posterior probabilities are comparable with those obtained in other Bayesian phylogenetic studies with a high percentage of fossil taxa [23].

Figure 1. Summary of best trees obtained in the dated Bayesian analysis. The posterior probability values for the clades are indicated. See electronic supplementary material S2 and table S2 for more details.

(a) Australopithecus sediba and the origin of genus Homo

In 2010, Berger and co-workers [11,40] reported the discovery of 1.97 Ma fossil hominins from Malapa, South Africa. These fossils have a unique combination of morphological features, some of which are shared with the australopiths and others with early Homo. In light of this, Berger et al. [11] assigned the fossils to a new species, Au. sediba.

Berger et al. [11] outlined four competing hypotheses regarding the relationships of Au. sediba, and then tested them with a parsimony analysis of 69 craniodental characters. The hypotheses differ in how Au. sediba is related to the members of genus Homo, especially the three earliest members, H. habilis, H. rudolfensis and H. erectus. Berger et al.'s parsimony analysis yielded a single most parsimonious tree in which Au. sediba was the sister taxon of a clade consisting of all the species of Homo, which Berger et al. took to be evidence that Au. sediba is affiliated with Homo and may actually be its ancestor.

Subsequently, several researchers [41,42] challenged the putative link between Au. sediba and Homo. They argued that Au. sediba probably arose from the much better-known South African australopith Au. africanus, and then went extinct without issue. This hypothesis was supported by a parsimony analysis of dental characters conducted by Irish et al. [16]. These authors obtained a single shortest tree in which Au. sediba was the sister taxon of Au. africanus, and the (Au. sediba and Au. africanus) clade was the sister taxon of Homo.

The partially constrained trees we used to represent these hypotheses are shown in electronic supplementary material, figure S2. Three hypotheses have low support and can be rejected (table 1). There is strong evidence to reject the tree in which Au. sediba and Au. africanus are sister taxa (BF = 16.42), the one in which H. habilis is the sister taxon of a clade comprising Au. sediba, H. rudolfensis, H. erectus and later Homo (BF = 9.22), and the one in which H. rudolfensis is the sister taxon of a clade comprising Au. sediba, H. habilis, H. erectus and later Homo (BF = 7.68). Of the remaining hypotheses, the best supported is the one in which Au. sediba is the sister taxon of a clade comprising all Homo species. Thus, our analysis does not support the hypothesis that Au. sediba arose from Au. africanus and died out without issue [16,41,42]. Rather, it is consistent with Berger et al.'s [11] conclusion that Au. sediba groups with Homo and may be its ancestor.

Table 1. Results of Bayes factor analyses carried out to compare phylogenetic hypotheses regarding Au. sediba and the species of genus Homo. Numbers refer to electronic supplementary material, figure S2.

hypothesis                             marginal log-likelihood   Bayes factor   interpretation
1a. ancestor to H. habilis             −2122.00                  7.68           strong evidence to reject model
1b. ancestor to H. rudolfensis         −2122.77                  9.22           strong evidence to reject model
1c. ancestor to H. erectus/ergaster    −2119.74                  3.16           evidence not strong enough to reject model
1d. ancestor to genus Homo             −2118.16                  —              best model
1e. descendant of A. africanus         −2126.37                  16.42          strong evidence to reject model
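The pairwise comparisons in table 1 can be reproduced from the reported marginal log-likelihoods with a short Python sketch. The function name `compare_to_best` is hypothetical, and treating a Bayes factor of exactly 6 as sufficient for rejection is an assumption; the threshold itself follows the ‘strong evidence’ level described in the model selection section.

```python
# Marginal log-likelihoods as reported in table 1.
ln_ml = {
    "1a": -2122.00,
    "1b": -2122.77,
    "1c": -2119.74,
    "1d": -2118.16,
    "1e": -2126.37,
}


def compare_to_best(ln_marginals, threshold=6.0):
    """Compare every tree model with the one that has the highest marginal
    likelihood. Returns the best model and, for each model, the Bayes factor
    (2 * difference in log marginal likelihood) and a rejection flag."""
    best = max(ln_marginals, key=ln_marginals.get)
    results = {}
    for name, value in ln_marginals.items():
        bf = round(2.0 * (ln_marginals[best] - value), 2)
        results[name] = (bf, bf >= threshold)
    return best, results


best, results = compare_to_best(ln_ml)
for name, (bf, rejected) in results.items():
    print(name, bf, "reject" if rejected else "cannot reject")
```

The same function applies unchanged to any other set of marginal log-likelihoods estimated under a common model and dataset.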

Our best-supported tree has some important implications. To begin with, all genus concepts used in palaeoanthropology agree that genera should be monophyletic [43], and so Au. sediba does not belong in the genus Australopithecus. The species could be assigned to Homo, or given its own genus name, depending on the importance accorded to maximizing information content in taxonomic names [43]. A second implication is that the current first appearance date for Au. sediba is substantially too young. Because sister lineages have to be the same age, the lineage leading to Au. sediba must be either the same age as or older than the oldest Homo specimen. Currently, the earliest specimen that is widely accepted to belong to Homo dates to 2.5–2.3 Ma [44]. A recent discovery may push this date back to 2.8 Ma [45]. Thus, the Au. sediba lineage must be at least 300 000–500 000 years older than the current hypodigm suggests, and may be as much as 800 000 years older. Lastly, the best-supported tree has implications for the place of origin of Homo. It is widely believed that East Africa was the locus of early hominin evolution, and that species dispersed from there to other regions [46]. However, some have argued that genus Homo originated in South Africa [47]. Australopithecus sediba is only known from South Africa at the moment, so our best-supported tree is consistent with this alternative hypothesis.

(b) Systematics of the Dmanisi hominins

Since the early 1990s, the site of Dmanisi in Georgia has yielded a number of important early Homo specimens [17,27,28,48–51]. Dating to 1.85 Ma [52], these are the oldest hominin remains outside of Africa.

Several hypotheses regarding the systematics of the Dmanisi hominins have been put forward. These hypotheses differ in relation to the number of species represented among the Dmanisi hominins, and the Dmanisi hominins' relationships with H. habilis, H. rudolfensis, Asian H. erectus and early African H. erectus (sometimes called H. ergaster). One suggestion is that the Dmanisi specimens represent an early lineage of H. erectus that descended from H. habilis or an H. habilis-like species, and is ancestral to both Asian H. erectus and early African H. erectus [25]. Another proposal is that the Dmanisi specimens are more closely related to early African H. erectus than to H. habilis, H. rudolfensis or Asian H. erectus [26]. A third hypothesis is that the Dmanisi hominins represent a new species, Homo georgicus, that is descended from H. habilis and H. rudolfensis, and which gave rise to early African H. erectus [27].

More radical proposals have also been made. Lordkipanidze et al. [17] have argued that the taxonomy of early Homo needs to be simplified in light of the Dmanisi sample, and have suggested that the Dmanisi hominins, Asian H. erectus, early African H. erectus, H. rudolfensis and H. habilis should all be assigned to a single species. In diametric opposition to Lordkipanidze et al. [17], Martinón-Torres et al. [29] have argued on the basis of the mandibles from the site that the Dmanisi sample includes the remains of two Homo species. They contend that the small mandibles represent a species that is close to the node from which early African H. erectus, Asian H. erectus and later Homo species originated, while a large mandible, D2600, belongs to a different species. Bermúdez de Castro et al. [30] have also suggested that the small mandibles represent one species and D2600 another, but they contend that the species represented by the small mandibles is closely related to just H. habilis and early African H. erectus.

We tested all of these hypotheses (electronic supplementary material, figure S3; table 2). The support for three of them is so low that they can be rejected. There is strong evidence to reject Lordkipanidze et al.'s [17] hypothesis that there is just one species of early Homo (BF = 18.98). We can also reject the H. georgicus hypothesis (BF = 6.28) and Bermúdez de Castro et al.'s [30] version of the two species hypothesis (BF = 9.04). However, the remaining three cannot be rejected. Of these, the one with the highest marginal likelihood is based on Martinón-Torres et al.'s [29] ‘two species’ hypothesis.

Table 2. Results of Bayes factor analyses carried out to compare phylogenetic hypotheses regarding the Dmanisi fossils. Numbers refer to electronic supplementary material, figure S3.

hypothesis                             marginal log-likelihood   Bayes factor   interpretation
2a. Rightmire et al. [25]              −2258.95                  4.00           evidence not strong enough to reject model
2b. Gabunia et al. [26]                −2259.21                  4.52           evidence not strong enough to reject model
2c. Gabounia et al. [27]               −2260.09                  6.28           strong evidence to reject model
2d. Lordkipanidze et al. [17]          −2266.44                  18.98          strong evidence to reject model
2e. Martinón-Torres et al. [29]        −2256.95                  —              best model
2f. Bermúdez de Castro et al. [30]     −2261.47                  9.04           strong evidence to reject model

Lordkipanidze et al.'s [17] ‘one species of early Homo’ hypothesis is based on the results of a geometric morphometrics analysis of overall cranial shape. Their analysis indicated that the variation in the Dmanisi hominin cranial sample exceeds the variation in a combined sample of H. habilis, H. rudolfensis, early African H. erectus and Asian H. erectus crania. Lordkipanidze et al. [17] argued that this must mean that H. habilis, H. rudolfensis, early African H. erectus and Asian H. erectus belong to the same species as the Dmanisi specimens. This hypothesis was immediately criticized by Spoor [53], and has since been challenged by other researchers [28,43]. One concern of the critics is that many of the features that have been used to distinguish H. habilis, H. rudolfensis, early African H. erectus and Asian H. erectus were not captured in Lordkipanidze et al.'s [17] analysis of overall cranial shape [28,43,54]. Critics have also highlighted the inability of Lordkipanidze et al.'s landmarks to distinguish between a Neanderthal cranium and Dmanisi Skull 4 in their analysis [17]. Because these specimens are separated in time by more than 1.5 Myr and are widely accepted to belong to separate species, it has been argued that the landmarks are inadequate for assessing the limits of fossil hominin species [43]. The results of our analyses also go against Lordkipanidze et al.'s hypothesis.

Another implication of our results is that more attention should be paid to the idea that there are two species represented among the Dmanisi hominins. The possibility that the Dmanisi hominin sample includes the remains of more than one species has been raised a number of times [28–30,54,55], but has not yet been taken seriously [25,27,56]. The Bayes factor support for Martinón-Torres et al.'s [29] hypothesis suggests the ‘two species’ hypothesis deserves closer scrutiny. Skinner et al. [55] examined height and breadth variation in the Dmanisi mandibles, and found that they exhibit more variation in corpus shape, corpus height and overall mandible size than any extant ape species. Martinón-Torres et al. [29] noted that the D2600 mandible has the primitive pattern of molar size gradient, whereas the rest of the Dmanisi mandibles have the derived pattern. This morphological evidence has generally been viewed as less compelling than the geological evidence, which is usually interpreted as indicating that the fossils recovered at the site were deposited within a few centuries and have not moved very far [56]. Our results suggest that alternative scenarios should be considered. For example, Bermúdez de Castro et al. [30] argue that the stratigraphic context of the hominin fossils is more complex than is usually presented, and that the hominin fossils could in fact have been re-deposited from sediments of different age. Even if the fossils have not moved, Schwartz et al. [28] have argued that a window of several hundred years would provide ‘ample time’ for faunal migration and/or replacement.

(c) What is Homo floresiensis?

In 2004, a team led by the late Mike Morwood reported the discovery of fossil hominins on Flores, Indonesia [57,58]. These fossils, dated to 17–74 kya, were discovered at the cave site of Liang Bua, along with fossilized animal remains and stone tools [58,59]. The hominins included a relatively complete skeleton, LB1, and the remains of at least nine other individuals [60]. These fossils possess a unique combination of primitive and derived features. Like the australopiths, they were small-bodied (estimated stature of 106 cm with body mass of 16–29 kg) and small-brained (380–426 cc) [57,61]. However, other cranial features resemble Homo [57,62]. Based on this mosaic morphology, the team assigned the fossils to a new species called Homo floresiensis, and argued that it is a dwarfed descendant of H. erectus [57].

Debate about the nature of the Liang Bua hominin fossils has raged over the past decade. Immediately following the announcement of the Flores discovery, it was suggested that the Liang Bua hominin fossils do not represent a new species, but rather are a group of small-bodied H. sapiens, one of whom, LB1, was afflicted with microcephaly [63]. Several other pathological diagnoses have been put forward since [32]. Most recently, some of the proponents of the original pathological hypothesis have argued that LB1 had Down syndrome [64]. Other researchers have accepted Morwood et al.'s assessment that the fossils represent a new hominin species but have questioned the idea that H. floresiensis is descended from H. erectus [10,65]. Argue et al. [10] argue that H. floresiensis is a descendant of an early Homo species that preceded H. erectus, such as H. habilis or H. rudolfensis. Brown & Maeda [65] contend that H. floresiensis could be descended from an australopith species rather than a species of early Homo.

We created six partially constrained trees to test these hypotheses (electronic supplementary material, figure S4; table 3). Based on the Bayes factor tests, we can reject the tree in which H. floresiensis is the sister taxon of H. sapiens (BF = 7.96), and the one in which H. floresiensis is the sister taxon of Au. africanus and Paranthropus (BF = 8.74). The remaining trees could not be rejected. Of these trees, the best supported is the one in which H. floresiensis is constrained to fall on the branches leading to H. habilis and H. rudolfensis, but not the branches leading to the other Homo species.

Table 3. Results of Bayes factor analyses carried out to compare phylogenetic hypotheses regarding the status of H. floresiensis. Numbers refer to electronic supplementary material, figure S4.

hypothesis                             marginal log-likelihood   Bayes factor   interpretation
3a. descendant of H. erectus           −2118.12                  3.70           evidence not strong enough to reject model
3b. pathological H. sapiens            −2120.25                  7.96           strong evidence to reject model
3c. descendant of early Homo           −2116.27                  —              best model
3d. descendant of Australopithecus     −2120.64                  8.74           strong evidence to reject model
3e. descendant of early hominin 1      −2118.15                  3.76           evidence not strong enough to reject model
3f. descendant of early hominin 2      −2119.17                  5.80           evidence not strong enough to reject model

The rejection of the tree in which H. floresiensis is the sister taxon of H. sapiens means our data do not support the latest pathology hypothesis. An obvious potential concern about this is that our H. sapiens sample does not include any Down syndrome individuals. However, this is not in fact a problem. Henneberg et al. [64] used 17 skeletal characters to diagnose LB1 with Down syndrome. None of these characters is among the 43 characters in the supermatrix for which we have data for the Liang Bua hominins. Thus, the results of the Bayes factor tests are independent of Henneberg et al.'s assessment of the health status of LB1. Even if their diagnosis of LB1 were correct, it would not alter our results. This is because the Down syndrome hypothesis contends that the Liang Bua hominin fossils are the remains of modern humans, one of whom, LB1, had Down syndrome. For this hypothesis to be correct, LB1 must have characters that are diagnostic of Down syndrome, and LB1 and the other Liang Bua hominins must also exhibit characters that align them with H. sapiens. Henneberg et al. concentrated on trying to demonstrate that LB1 has characters that are diagnostic of Down syndrome [64], but they failed to identify any characters aligning the fossils with H. sapiens. Henneberg et al. are not alone in this. None of the proponents of the pathology hypotheses has identified characters that align the Liang Bua hominins with H. sapiens [63,66,67]. Thus, our results are not particularly surprising. No data support the hypothesis that the Liang Bua hominins are H. sapiens, regardless of the health status of LB1.

While the data we currently have for H. floresiensis are unable to distinguish among the various ‘hobbits are early hominins’ hypotheses, it is interesting that the best supported of the trees that we tested is the one in which H. floresiensis was constrained to fall on the branches leading to H. habilis and H. rudolfensis. This suggests that H. floresiensis is a descendant of pre-H. erectus small-bodied hominins that migrated out of Africa and made it to Southeast Asia [60,65]. A corollary of this is that our understanding of hominin colonization of Eurasia may require revision. The current consensus is that H. erectus was the first hominin species to migrate out of Africa, and did so shortly after 2 Ma. A pre-H. erectus origin for H. floresiensis implies that an earlier Homo was the first species of hominin to leave Africa. A pre-H. erectus origin for H. floresiensis also raises the possibility that H. erectus evolved in Asia rather than in Africa [68].

4. Conclusion

The study reported here is the first to use Bayesian phylogenetic analysis to evaluate competing hypotheses concerning the relationships of the fossil hominins. Based on our results, the utility of the approach for the investigation of hominin phylogeny seems clear. Our analyses show that a number of hypotheses that have been put forward in three important on-going debates can be rejected unequivocally, thereby reducing the scope of the disagreement in each case and moving the field forwards. Given this, we suggest that the Bayesian framework should be adopted to systematically evaluate other phylogenetic debates in palaeoanthropology. The approach improves objectivity because the statistical results are replicable given a dataset and model.

Improved models might well alter confidence in the inferences made here, and this is one place where further research should be focused, as the Bayesian framework also allows alternative (e.g. simpler versus more complex) models to be formally compared. We need fine-grained information on skeletal development such that covariation among characters due to common developmental pathways and allometric constraints can be tested for and accommodated. With such data, more complex scenarios of character evolution—such as rates that (co-)vary not only among characters, but also in different parts of the tree [69]—can also be modelled.

Although improved models are necessary, our primary recommendation concerns the data used to evaluate fossil hominin phylogenetic relationships. While we believe our supermatrix is the largest qualitative dataset ever compiled for the fossil hominins, it only contains characters of the skull. The omission of postcranial data needs to be rectified. Given that morphological analyses of the postcranial remains of H. floresiensis have found similarities to extant apes, australopiths and early Homo [70,71], there is reason to believe that the inclusion of postcranial data may allow us to discriminate between the various ‘hobbits are early hominins’ hypotheses for H. floresiensis. Including postcranial data should also improve our ability to test between the remaining hypotheses concerning Au. sediba and the Dmanisi hominins. More generally, there is a need for palaeoanthropologists to develop, and commit to using, a common character state dataset for the investigation of fossil hominin phylogeny, as has been achieved in various ‘Tree of Life’ projects. This dataset must include data for all generally accepted species and all widely used characters. In addition, characters and their states must be rigorously defined, and the relationship between the states assigned to species and the hypodigms of the species must be clear. Continued use of partial, poorly defined character state data matrices should be avoided.

Data accessibility

A list of fossil hypodigms, the character definitions and the character matrix have been deposited in Dryad (http://dx.doi.org/10.5061/dryad.5025v).

Authors' contributions

M.D., A.O.M. and M.C. designed the study; M.D. compiled the data in consultation with M.C.; M.D. analysed the data with assistance from N.J.M., M.C. and A.O.M.; M.D., A.O.M. and M.C. interpreted the results; M.D., N.J.M., A.O.M. and M.C. wrote the manuscript. All authors gave final approval for publication.

Competing interests

The authors declare no competing interests.

Funding

Our work was supported by the Canada Research Chairs Program, the Canada Foundation for Innovation, the British Columbia Knowledge Development Fund, NSERC Canada, the National Institute for Mathematical and Biological Synthesis, NSF, the University of Tennessee, Knoxville, and Simon Fraser University.

Acknowledgements

We thank the members of HESP and FAB* at SFU, as well as Yoel Rak, Charles Roseman and Bernard Wood for helpful comments and suggestions. We are also grateful to Norman Macleod, Marta Mirazon Lahr, Kieran McNulty and an anonymous reviewer for constructive comments on an earlier version of this manuscript.
