A major question in both anthropology and cognitive science is why the world’s languages show recurrent similarities in color naming. Here we examine this inherently evolutionary question (the evolution of color systems in language) using phylogenetic methods. We track the evolution of color terms across a large language tree in order to trace the history of the systems. We provide further validation of phylogenetic approaches to culture and an explicit history of color terms across a large language sample, the Pama-Nyungan languages of Australia. Our work is of relevance to anthropologists, psychologists, and linguists.

The naming of colors has long been a topic of interest in the study of human culture and cognition. Color term research has asked diverse questions about thought and communication, but no previous research has used an evolutionary framework. We show that there is broad support for the most influential theory of color term development (that most strongly represented by Berlin and Kay [Berlin B, Kay P (1969) Basic Color Terms: Their Universality and Evolution (Univ of California Press, Berkeley, CA)]); however, we find extensive evidence for the loss (as well as gain) of color terms. We find alternative trajectories of color term evolution beyond those considered in the standard theories. These results not only refine our knowledge of how humans lexicalize the color space and how the systems change over time; they illustrate the promise of phylogenetic methods within the domain of cognitive science, and they show how language change interacts with human perception.

The naming of colors has long been a topic of interest in the study of human culture and cognition. It is a key case study for the link between perception, language, and the categorization of the natural world (1–4). The assumptions central to these lines of research on color naming are often linked, whether implicitly or explicitly, with the ways in which color term systems are believed to evolve. One of the most noteworthy scholarly works on color terms, both in terms of its impact on subsequent research and its clear and explicit evolutionary hypotheses, is the classification system proposed by Berlin and Kay (5) and refined in subsequent works (6–8). However, despite the very clear hypothesis in this literature that the attested range of color-naming systems in language results from evolution along highly constrained pathways, very little has been done to test these claims. Here, we directly examine the evolutionary hypotheses associated with this research tradition: principally, that as color term systems evolve, languages gain but never lose basic color terms; and that the order in which color terms are added to a language’s lexicon is fixed. This approach capitalizes on the different patterns we should find in the presence of strong, universal cognitive constraints on color evolution, compared with those that might result from a more relativistic view, in which every language’s color term system development follows a unique path. We use Bayesian phylogenetic methods, which allow us to probabilistically reconstruct ancestral inventories and evaluate claims regarding the order in which color terms enter (and leave) the lexicon. We apply these techniques to Australia’s Pama-Nyungan language family.

The study of color system evolution in Australian languages represents a unique opportunity to evaluate claims central to the debate regarding color systems. Pama-Nyungan is a large language family that extends across approximately 90% of the Australian mainland. The internal composition of this family has been studied using both traditional (15) and phylogenetic comparative methods (16). The diversity of color-naming systems used by speakers of Pama-Nyungan languages makes it an ideal case for examining the evolution of color terms. The languages range through all five basic evolutionary stages of the WCS model. This is in contrast to other large families such as Indo-European and Austronesian, where languages tend to cluster in WCS stage V, making them unsuitable for recovering evolutionary trajectories using phylogenetics.

Once a language has terms for colors, we would expect them to change over time. Major types of change in vocabulary include semantic shift, where a word extends or contracts its meaning, or is used metaphorically (13). There is no a priori expectation from language change that colors would change as a system; although we do sometimes find words changing in parallel (14), words usually change independently. We assume that cognitive constraints play a role in language change in this domain, while still allowing for normal processes of sound change, semantic shift, and lexical replacement to occur in individual color terms.

This core evolutionary model (unidirectional progression through color system development in a fixed order) is maintained in the revised evolutionary model presented in subsequent work (7, 8), which additionally proposes an alternative pathway through stages III and IV. Fig. 1 gives a streamlined version of this model, which, like our data, is not explicit about which foci may be combined in early-stage composite categories. The row labeled A represents Kay and Maffi’s “main line” of color term evolution, which accounts for 83% of the languages in the WCS.

Theories of color system evolution have themselves changed over the last several decades, as the empirical data and diversity of perspectives involved in this area of research have grown. The evolutionary process outlined in Berlin and Kay (5) comprises seven distinct stages. The most basic system has two categories, with terms centered on the black and white foci. The second stage adds a color associated with the focal category red, followed by either yellow or green in stage III. In stage IV, both yellow and green are present, as well as black, white, and red. Stage V adds blue, followed by brown in stage VI. The final stage involves the addition of pink, purple, orange, and/or gray.
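As a rough illustration, the seven-stage sequence just described can be written as a lookup from a term inventory to a stage. The sketch below is ours and is not part of Berlin and Kay’s method: the function name `berlin_kay_stage` and the English category labels are hypothetical, and any inventory that falls outside the listed sequence is simply treated as unclassifiable.

```python
# Hypothetical sketch: map a set of color categories onto the seven
# Berlin and Kay stages as summarized in the paragraph above.
BASE = frozenset({"black", "white"})
LATE = frozenset({"pink", "purple", "orange", "gray"})
STAGE_VI = BASE | {"red", "yellow", "green", "blue", "brown"}

# Inventories for stages I-VI; stage III has two admissible variants.
STAGES = {
    BASE: "I",
    BASE | {"red"}: "II",
    BASE | {"red", "yellow"}: "III",
    BASE | {"red", "green"}: "III",
    BASE | {"red", "yellow", "green"}: "IV",
    BASE | {"red", "yellow", "green", "blue"}: "V",
    STAGE_VI: "VI",
}


def berlin_kay_stage(terms):
    """Return the stage label, or None if the inventory fits no stage."""
    terms = frozenset(terms)
    if terms & LATE:
        # Stage VII adds pink, purple, orange and/or gray to a full stage VI system.
        return "VII" if terms - LATE == STAGE_VI else None
    return STAGES.get(terms)


print(berlin_kay_stage({"black", "white", "red", "green"}))  # III
print(berlin_kay_stage({"black", "white", "green"}))         # None
```

An inventory with black, white, and green but no red returns `None`, which is the kind of system that a strictly ordered, gain-only model cannot place.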

Berlin and Kay’s basic findings have been largely affirmed by the much larger cross-linguistic sample in the World Color Survey (WCS) (8). Importantly, the 11 basic color foci of Berlin and Kay were revised to a set of six basic color foci [the lightness categories black and white, plus the Hering primary colors red, blue, yellow, and green (11)]. These foci are consistent with highly clustered “best examples” of basic colors from the WCS (12).

The Berlin and Kay survey of color terms explicitly tested cross-linguistic variation in color naming, focusing scrutiny on earlier scholars’ treatment of color as a canonical example of linguistic relativity (3, 9). In direct contrast to relativistic views, Berlin and Kay found that languages have no more than 11 basic color terms, and that the systems used to organize these colors occupy only a small portion of the potential design space. They took the cross-linguistic agreement in the focal points of these color categories as further evidence for universals in color semantics. Furthermore, the seven color systems they identified were hypothesized to represent natural evolutionary stages.

Berlin and Kay’s influential 1969 study (5) first established the notion of a universal, cross-linguistic typology of color term systems and ascribed the limited range of systems attested in their surveys to a strict developmental pathway. The model outlined in Berlin and Kay (5) and subsequent work (6–10) makes two evolutionary claims. First, the progression through the stages of color system development is hypothesized to be unidirectional. That is, languages gain basic color terms, but they do not lose them. Second, the order in which colors are added to a system is largely fixed.

Finally, we use MCMC analysis to examine the posterior probabilities for reconstructing each color to selected ancestral nodes. For ancestral state reconstructions, we implement an MCMC analysis that infers a single rate for gain transitions and a single rate for loss transitions across all seven color system characters. This approach treats the color lexicon as a unified system as it estimates the likely ancestral color terms.
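To make the two-rate setup concrete, a single gain rate and a single loss rate define a two-state continuous-time Markov chain (absent = 0, present = 1), whose transition probabilities along a branch have a standard closed form. The sketch below is only an illustration of that textbook model, not the BayesTraits implementation; the function names are hypothetical, and the rate values are the point estimates reported in the results (0.95 for gain, 0.36 for loss), used here purely as example inputs with an arbitrary branch length.

```python
import math


def transition_probs(gain, loss, t):
    """Return the 2x2 matrix P(t) for a gain/loss model.

    P[i][j] is the probability of ending in state j after time t,
    given a start in state i (0 = term absent, 1 = term present).
    """
    total = gain + loss
    pi_present = gain / total   # long-run probability of presence
    pi_absent = loss / total    # long-run probability of absence
    decay = math.exp(-total * t)
    return [
        [pi_absent + pi_present * decay, pi_present * (1.0 - decay)],
        [pi_absent * (1.0 - decay), pi_present + pi_absent * decay],
    ]


def stationary_presence(gain, loss):
    """Long-run probability that a color term is present."""
    return gain / (gain + loss)


# Illustrative inputs: rates from the results, arbitrary branch length.
P = transition_probs(gain=0.95, loss=0.36, t=0.5)
print("P(gain along branch) =", round(P[0][1], 3))
print("P(loss along branch) =", round(P[1][0], 3))
print("stationary presence  =", round(stationary_presence(0.95, 0.36), 3))
```

Setting `loss` to zero in this sketch gives a gain-only model in which a term, once present, can never disappear.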

We examine hypothesized orderings of color term gain by applying reversible jump MCMC (RJMCMC) (22, 27) analyses to data for pairs of colors, comparing dependent and independent models of trait evolution. RJMCMC moves between models with different numbers of parameters as it searches the space of trees and transition rates, sampling models in proportion to their posterior probabilities. By representing pairs of binary traits as a single character with four possible states (00, 01, 10, 11), these analyses characterize dependencies in trait evolution in terms of eight parameters, which represent transitions between these states. Dependent RJMCMC analyses sample across models that allow separate rates for the gain or loss of each trait in the presence or absence of the other; independent analyses have separate gain and loss rates for each trait that do not depend on the other trait’s state. The posterior sample of models generated by these analyses can be used to examine the support for individual parameters that represent ordered gains or losses of colors. We are thus able to assess the evidence for dependent evolution between individual color terms and to test hypotheses about the relative order in which colors are added.
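The four-state, eight-parameter structure described above can be sketched as a rate matrix. The code below is a minimal illustration under our own assumptions: states are written as two-character strings for a trait pair (A, B), the helper names `build_q` and `independent_rates` are hypothetical, and the rate values are invented. It shows how the independent model is the special case in which each trait’s gain and loss rates are the same whatever the state of the other trait.

```python
# States of a pair of binary traits (A, B), written as "AB".
STATES = ("00", "01", "10", "11")


def build_q(rates):
    """Build a 4x4 rate matrix from a dict {(source, target): rate}.

    Only the eight single-trait changes are allowed; simultaneous
    changes in both traits (e.g. 00 -> 11) keep a rate of zero.
    """
    q = [[0.0] * 4 for _ in range(4)]
    for (src, dst), rate in rates.items():
        if sum(a != b for a, b in zip(src, dst)) != 1:
            raise ValueError(f"{src}->{dst} is not a single-trait change")
        q[STATES.index(src)][STATES.index(dst)] = rate
    for i in range(4):
        q[i][i] = -sum(q[i])  # each row of a rate matrix sums to zero
    return q


def independent_rates(gain_a, loss_a, gain_b, loss_b):
    """Four-parameter special case: each trait ignores the other's state."""
    return {
        ("00", "10"): gain_a, ("01", "11"): gain_a,
        ("10", "00"): loss_a, ("11", "01"): loss_a,
        ("00", "01"): gain_b, ("10", "11"): gain_b,
        ("01", "00"): loss_b, ("11", "10"): loss_b,
    }


# Invented rates. A dependent model may give all eight entries their own
# value; here the gain of B in the absence of A is set to zero, which
# encodes the ordering hypothesis "B is never added before A".
rates = independent_rates(gain_a=0.9, loss_a=0.3, gain_b=0.5, loss_b=0.2)
rates[("00", "01")] = 0.0
for state, row in zip(STATES, build_q(rates)):
    print(state, row)
```

A parameter that the sampler sets to zero corresponds to removing one arrow from this matrix, which is why the zeroing frequency of individual parameters can speak to ordering hypotheses.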

To test the first of these hypotheses, we use Markov chain Monte Carlo (MCMC) comparative methods to estimate the likelihood of alternative models, given our trees and data. By computing the Bayes factor (BF) support for models that disprefer or disallow the loss of color terms compared with models that allow both gains and losses of colors (by means of their marginal likelihoods), we can evaluate whether Pama-Nyungan color system evolution is consistent with the principle that languages gain color terms but do not lose them (Figs. S3 and S4).
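The BF comparison reduces to simple arithmetic on log marginal likelihoods, using the statistic defined in the results, 2 log BF = 2 (log L(H1) − log L(H2)). The sketch below is illustrative only: the function names are hypothetical, the log marginal likelihoods are invented numbers rather than values from our analyses, and the single cut-off of 2 is the threshold for substantial support used in the results.

```python
def two_log_bf(log_ml_h1, log_ml_h2):
    """Return 2 log BF for H1 over H2 from log marginal likelihoods."""
    return 2.0 * (log_ml_h1 - log_ml_h2)


def favors_h1(log_ml_h1, log_ml_h2, threshold=2.0):
    """True if the support for H1 over H2 exceeds the threshold."""
    return two_log_bf(log_ml_h1, log_ml_h2) > threshold


# Invented log marginal likelihoods, for illustration only.
unrestricted = -520.0   # separate gain and loss rates
single_rate = -531.5    # gain and loss rates forced to be equal

print("2 log BF =", two_log_bf(unrestricted, single_rate))
print("two-rate model favored:", favors_h1(unrestricted, single_rate))
```

Because the statistic is a difference of logs, swapping the two arguments flips its sign; a negative value means the data favor the second model.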

To evaluate the basic evolutionary claims of the WCS theory, we use a Bayesian phylogenetic method for the study of trait evolution, implemented with the BayesTraits software package (19). These methods have previously been used in numerous studies of linguistic and cultural evolution (20–23). Modeling the evolution of cultural traits using phylogenetic comparative methods developed for biological processes is not entirely uncontroversial (24); however, arguments that these approaches are invalidated by differences in the transmission of cultural and biological material have been discussed thoroughly and largely refuted (25, 26).

We represent the Pama-Nyungan language phylogeny using a sample of 700 trees (see Figs. S1 and S2). The trees were subsampled from a Markov chain used to derive a consensus tree summarizing relationships among Pama-Nyungan languages. The tree was compiled using basic vocabulary data (16, 18). The main clades identified in historical work on Pama-Nyungan are all recovered with high posterior probability, as are many other clades; however, some of the primary branches high in the tree receive equivocal support. We track reconstructions on nodes that have a high posterior probability (that is, that appear frequently in the tree sample), thus avoiding the problem that node reconstruction probabilities can only ever be as high as the probability of the node itself (19).
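The point about node probability can be illustrated with a toy calculation: the support for a node is the share of sampled trees that contain it, and a state reconstructed at that node, averaged over the whole sample, cannot exceed that share. The sketch below rests on our own simplifying assumptions: each tree is represented as a set of clades, each clade as a set of language names, and the four languages A to D are hypothetical placeholders.

```python
def clade_support(tree_sample, clade):
    """Fraction of sampled trees that contain the given clade."""
    clade = frozenset(clade)
    hits = sum(1 for tree in tree_sample if clade in tree)
    return hits / len(tree_sample)


def combined_probability(node_support, state_prob_given_node):
    """Probability that the node exists and carries the state.

    Because the second factor is at most 1, the result can never
    exceed the support for the node itself.
    """
    return node_support * state_prob_given_node


# Toy posterior sample of four trees over hypothetical languages A-D.
sample = [
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AC"), frozenset("BD")},
]

support = clade_support(sample, "AB")
print("support for clade AB:", support)  # 0.75
print("upper bound check:", combined_probability(support, 1.0) <= support)  # True
```

Restricting attention to clades with high support keeps this ceiling close to 1, so that a low reconstruction probability reflects uncertainty about the color term rather than about the tree.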

Data from this sample were coded as a set of seven binary characters, each representing a color category. For each language, the character state (1 or 0) represents the presence or absence of a term representing a particular color category in that language’s lexicon (see especially Figs. S1 and S2 and Table S1).
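A minimal sketch of this coding scheme follows, with 1 for presence and 0 for absence. The category order, the function name `encode_inventory`, and the two example inventories are assumptions made for illustration; they are not taken from the coded dataset.

```python
# Assumed column order for the seven binary characters (illustrative).
CATEGORIES = ("black", "white", "red", "yellow", "green", "blue", "brown")


def encode_inventory(terms):
    """Code a language's color categories as seven 0/1 characters."""
    unknown = set(terms) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"not one of the seven categories: {sorted(unknown)}")
    return tuple(1 if category in terms else 0 for category in CATEGORIES)


# Hypothetical inventories, not real language data.
inventories = {
    "language_1": {"black", "white", "red"},
    "language_2": {"black", "white", "red", "green", "blue"},
}

for name, terms in inventories.items():
    print(name, encode_inventory(terms))
# language_1 (1, 1, 1, 0, 0, 0, 0)
# language_2 (1, 1, 1, 0, 1, 1, 0)
```

Each tuple is one row of the character matrix that a comparative analysis would pair with the tips of the tree.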

The data for this study consist of basic color terms from 189 Pama-Nyungan languages in the Chirila lexical database (17). The basic color terms were identified based on the association of a form with at least one English translation included in the set of six basic WCS color terms, plus brown, the most frequent secondary color term in our sample.

Results and Discussion

Gain and Loss of Colors. The marginal likelihoods of nested models can be used to evaluate the support for an alternate hypothesis (in this case, the hypothesis that color terms are gained but never lost), compared with a null hypothesis (here, that no such constraints act on color term systems). This is done by BF evaluation, which compares the probability of the observed data under the two hypotheses represented by these nested models. Here, we follow the guidelines for log BFs (29), where $2 \log \mathrm{BF}_{12} = 2\,[\log L(H_1) - \log L(H_2)]$, with $H_1$ and $H_2$ representing the alternate and null hypotheses, respectively.

We test an unrestricted model that allows both the rates associated with color term gain and loss to vary freely. The hypothesis that color terms can be gained, but never (or almost never) lost, is represented by two models. The first sets the rate parameter for color term loss to zero. The second sets different prior distributions for each of the two rate parameters. This initializes the analysis with a bias toward gains of color terms, compared with losses, but allows for color term loss. The opposite patterns are also tested for models that are biased toward color term loss. A final alternative model restricts the rate parameters for color term gain and loss to be equal, creating a single-rate model under which neither gain nor loss is prohibited and both of these processes are assumed to be equally likely.

Models for which the rate parameter for color term gain or the rate parameter for color term loss is set to zero fail to converge. The incompatibility with our data of models that implement exceptionless trends of gain or loss of color categories provides evidence against a strong interpretation of the Kay and Maffi (7) model. An evolutionary explanation for color term systems must allow for at least some color term loss. Table 1 reports the results for analyses that do converge.
Two-parameter analyses all result in similar likelihoods because the gain and loss rates converge to near-identical values across all models, regardless of biases toward gain or loss introduced by priors. The unrestricted model estimates the transition rate for color term gain to be markedly higher than that for color term loss (0.95 versus 0.36, respectively). It is thus unsurprising that an analysis that forces these rates to be equal has a far lower likelihood than the unrestricted model, with BFs showing extremely strong support for the two-parameter model over this single-parameter model. In sum, the results suggest that, although a strict prohibition on the loss of color terms is not compatible with Pama-Nyungan color term system evolution, the processes by which these systems have developed have involved substantially more color term additions than losses.

Table 1. BF support for color term system models

Ordering of Color Term Addition: Dependent Evolution Analyses. Comparisons of dependent and independent models of evolution for each pair of colors are used to identify correlations in the evolution of color terms. Because the dependent model receives substantial BF support (BF > 2) for the majority of color pairs, we further investigate dependencies between color pairs by examining the frequency with which individual transition rates are deleted in RJMCMC (see Table 2).

Table 2. BF support for correlated evolution between color pairs

We find support for evolutionary dependencies between all pairs of colors, as would be predicted by WCS, with the exception of red/yellow and red/blue. For most remaining color pairs, the BF support for the dependent model was strong. However, a correlated model of evolution between red and green receives only moderate BF support. For most pairs of colors that are added in adjacent stages along the “main line” WCS trajectory, we find strong evidence of dependent evolution, consistent with a theory in which color terms are added in a fixed order. The exception to this is red/green, which receives only moderate support. Although all pathways in the WCS model involve the addition of red before green, we find no term for red in 11% of the Pama-Nyungan languages that have a term for green. These languages can be explained either by a gain of green before red or, more likely according to our ancestral state reconstructions, a loss of red. Neither of these explanations is consistent with WCS theory. The lack of support for dependencies between red and yellow or blue is likely the result of the fact that red is reconstructed to the root of the tree, and lost independently across several branches of Pama-Nyungan, which vary in the likelihood of a yellow or blue category.
Thus, the evolution of red is captured as well by one gain parameter and one loss parameter as it is by separate rates for gain and loss in the presence and absence of yellow/blue. Stronger support for a dependency between red and brown likely reflects the fact that red is found in all sampled languages that have brown.

Because the RJMCMC procedure allows the number of model parameters to vary across iterations, it provides information about the posterior probability that any parameter should be deleted, which is useful for investigating the ordering of gains and losses of colors for which dependent models are supported. To do this, we examine the percentage of iterations in which particular parameters were set to zero in the RJMCMC analysis. We expect two categories of parameters to be frequently set to zero: parameters associated with the gain of a “later” color term in the absence of an “earlier” color term (e.g., the rate for gaining yellow in the absence of red), and those associated with the loss of an “early” color term in the presence of a “late” color term (e.g., the loss of red where a term for yellow is present, arrow h in Fig. 2). These two types of parameters are associated with changes that contradict the WCS theory, namely out-of-order additions of terms and losses of terms in later stages of the evolutionary trajectory. Parameters associated with color gains in the order prescribed by the WCS model (arrows a and f in Fig. 2) are expected to be deleted seldom, if ever.

Fig. 2. Parameters in dependent models.

Indeed, we find that parameters associated with gaining color terms in the order prescribed by the WCS “main line” (edges a and f in Fig. 2 and Table 3) are almost never deleted. The parameter for a gain of a brown term when a blue term is present is the most often deleted, being set to zero in 22% of models.

Table 3. Deletion frequencies for parameters in RJMCMC model strings, expressed as percentages

Parameters describing the gain of “late” color terms in the absence of “earlier” color terms (column d in Fig. 2 and Table 3) are expected to be universally deleted under the WCS theory. However, the deletion rates for this parameter are less consistently supportive of WCS hypotheses than the parameters that are associated with “main line” color gains. Percent deletion of this parameter ranges from 21% (for gain of green in the absence of red) to 100% (for gains of blue or brown in the absence of yellow). That is, we never find blue or brown gained without yellow. The deletion rate is also extremely high for the green/blue and red/brown color pairs. For other color pairs, relatively low deletion rates suggest that the ordering of color term gain may not be as secure as suggested by the WCS. Color pairs green/yellow, green/brown, and blue/brown retain the parameter associated with out-of-order color term gain in 69–74% of sampled iterations. For green/yellow, this may suggest that some branches of Pama-Nyungan evolve along an alternative WCS pathway (i.e., pathway B in Fig. 1). The parameter associated with losses of “early” color terms in systems where “later” color terms are present (parameter h) is even more variable. Not only do the changes described by this parameter involve losses of color terms, they also result in systems that generally do not fit into WCS classification stages. Despite the strong expectation that this parameter should be deleted, it is retained 100% of the time for color pairs red/green, yellow/blue, and yellow/brown (see column h in Table 3). The yellow category would thus appear to be less resistant to loss than the WCS theory would suggest. Color pairs red/brown and green/blue are more consistent with WCS predictions, with the h parameter deleted 99% and 94% of the time, respectively.
As a whole, the patterns of deletion for this parameter across all color pairs show clear evidence for color loss and variable resistance to loss across colors.

The posterior distribution of models produced by RJMCMC is also useful for examining alternative pathways for color term addition. Although the WCS “main line” involves the gain of green before yellow, with the addition of blue only after these two colors have been added, a minority of attested systems surveyed by the WCS show evidence for the addition of yellow before green or the emergence of blue before splitting yellow and green (8). Although the parameter for gaining yellow without green is set to zero in 29% of iterations, the parameter for gaining these colors in the reverse order is never set to zero. These results support the dominance of the green-first pathway in Pama-Nyungan and provide further evidence that both universal and language- or family-specific factors are involved in the evolution of color systems. We find further support for the WCS “main line” in the dependent model for yellow and blue. Although the parameter associated with gains of blue in the absence of yellow is always deleted, the parameter for gain of yellow in the absence of blue is never deleted. Parameters associated with losing either of these colors in the presence of the other are also almost never deleted. Thus, we find very strong support for the addition of yellow before blue in Pama-Nyungan, but poor support for the notion that these particular terms are resistant to loss.
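The deletion percentages discussed in this section amount to counting, for each of the eight parameters, how often it is set to zero across the sampled models. The sketch below shows that tally under assumptions of our own: each sampled model is written as a whitespace-separated string with one token per parameter, a zeroed parameter is marked with the token `Z`, and the parameter labels and their order are placeholders that are not guaranteed to match Fig. 2 or the software’s actual output format.

```python
def deletion_percentages(model_strings, names):
    """Percentage of sampled models in which each parameter is zeroed.

    Assumes one token per parameter and "Z" as the marker for a
    parameter set to zero (an assumed convention for this sketch).
    """
    counts = {name: 0 for name in names}
    for model in model_strings:
        tokens = model.split()
        if len(tokens) != len(names):
            raise ValueError(f"expected {len(names)} tokens, got {len(tokens)}")
        for name, token in zip(names, tokens):
            if token == "Z":
                counts[name] += 1
    total = len(model_strings)
    return {name: 100.0 * count / total for name, count in counts.items()}


# Placeholder labels and an invented four-model posterior sample.
PARAMS = ("a", "b", "c", "d", "e", "f", "g", "h")
sample = [
    "0 0 1 Z 0 0 1 Z",
    "0 1 1 Z 0 0 0 Z",
    "0 0 0 0 1 0 0 Z",
    "0 0 1 Z 0 0 1 0",
]

for name, pct in deletion_percentages(sample, PARAMS).items():
    print(name, pct)
```

In this invented sample the parameters labeled d and h are each zeroed in 75% of models, while the others are never zeroed.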

Ancestral Node Reconstruction. Ancestral node reconstruction estimates produced by the unconstrained two-parameter analysis across all seven color categories provide further evidence regarding the evolutionary trajectories of color term systems. Fig. 3 displays histograms showing the likelihood of each color’s presence at the root, at ancestral nodes corresponding to well-established subgroups, and at a sample of other internal nodes.

Fig. 3. Ancestral state reconstructions on consensus tree.

For the majority of Pama-Nyungan subgroups, the reconstructed color term categories co-occur in patterns that are consistent with the WCS typology. The Paman, Yuin-Kuri, and Durubulic subgroups, for example, all have high probabilities for black, white, and red reconstructing to state 1 (present) at their ancestral nodes, consistent with WCS stage II. Several other subgroups, including Karnic, Thura-Yura, and Ngayarta, have ancestral state probabilities consistent with WCS stage III, with the color categories black, white, red, and green. The alternative WCS stage III configuration, with black, white, red, and yellow, is not as well supported among ancestral state reconstructions. Only Bandjalangic shows this pattern. Only one Pama-Nyungan subgroup, the Central New South Wales languages, could be plausibly reconstructed with the six-color system of WCS stage V (black, white, red, yellow, green, blue). However, although the probability of reconstructing blue for this subgroup is fairly secure (0.89), the reconstruction of yellow is less certain (0.45). Regardless of whether this subgroup is reconstructed as a canonical stage V system, it represents a challenge to one of the Kay and Maffi (7) hypotheses. It also represents a rapid elaboration of the color term system, given that its parent node shows strong support for only black, white, and red. The pattern of blue and brown occurring without yellow is even more robust in the Kulin subgroup.
Three of its seven languages have blue and brown terms but lack yellow, with probabilities of 82% and 42%, respectively, for reconstructing blue and brown but only 4% for yellow.

Deeper in the tree, we find evidence that basic color term systems involved small numbers of color categories for the majority of the history of Pama-Nyungan. The root shows a high probability of having black, white, and red color categories, with a very small probability of green, presumably due to the prevalence of that color category outside of the Pama–Maric languages. The ancestral node reconstructions between this root and the primary subgroups generally show a progression from three-color systems to four-color systems including green (WCS stage III). A transition from this four-color system to a five-color system (black, white, red, green, yellow; WCS stage IV) is also apparent within the western branch of the family, although elsewhere in the tree the likelihood of a five-color WCS stage IV system is lower due to low reconstruction probabilities for yellow.

Although the general trend suggested by ancestral node reconstruction probabilities is consistent with WCS evolutionary pathways, a more detailed examination of the results reveals evidence for patterns that contradict Kay and Maffi (7). Reasonably strong evidence for color term loss can be found in languages like Wayilwan (with only black, white, and green) within the Central New South Wales subgroup (with a probable reconstruction of black, white, red, green, and blue). We also find a reasonably high probability for a green category in Western Pama-Nyungan nodes ancestral to the Kanyara–Mantharta subgroup, although the probability of green in Kanyara–Mantharta itself is very low (0.01). This decrease in the probability of a green category along the branches leading to the Kanyara–Mantharta subgroup can reasonably be interpreted as a likely loss of that color.