Language relies on a suite of cognitive capacities that make it far more complex than any other animal communication system. One notable capacity is our ability to label external objects and events with acoustically distinct, referential words. Crucially, the acoustic structure of referential words is socially learned and agreed upon within a community of language users [].

Given the central importance of learning referential words in language, a key question is whether this is a uniquely human trait []. Studying referential vocalizations in extant non-human primates (“primates”) with whom we share common ancestors is one approach that can shed light on evolutionary continuity in this capacity []. Numerous studies have shown that primates can refer to external objects in the environment, such as food or predators, with acoustically distinct vocalizations, and receivers respond in ways that suggest the calls are putatively meaningful []. However, in all cases, the relationship between the acoustic structure of the call and the referent eliciting it appears to be tightly fixed []. This is unlikely to be because primates are incapable of flexible modification of vocalizations. Indeed, several primate and non-primate species have been shown to adapt the acoustic structure of social-contact vocalizations in a learned way [], potentially to improve social bonding between individuals []. Instead, it has been posited that the contexts that elicit referential calls induce arousal states in producers that are associated with specific call structures []. One major consequence of this apparent lack of flexibility in the structure of functionally referential calls is the conclusion that this type of call is of little use for understanding the evolutionary origins of language [].

The aim of our study was to investigate the degree of control and flexibility adult captive chimpanzees have over the structure of their functionally referential food calls []. Chimpanzee food grunts are produced almost exclusively in feeding contexts and are directed at valuable social partners []. The acoustic structure of these calls varies systematically with the caller’s preference for the referent, which, in captivity, produces food-specific calls [], and listeners can extract meaningful information about the value and type of food available from group members’ grunts (K.E.S., T. Kaller, J. Call, and K. Zuberbühler, unpublished data; []). However, the extent to which vocal learning processes can influence the acoustic structure of chimpanzee referential food calls or exactly how group members converge on a distinct call for a specific food type remains unknown.

To address these questions, we investigated the structure of food grunts before and after a rare integration of two groups of captive adult chimpanzees (Table S1) at Edinburgh Zoo in the UK []. Acoustic recordings of grunts produced in response to apples were taken before integration in 2010, then after integration in 2011, and again in 2013. We obtained these data from six Edinburgh Zoo residents (ED) and seven immigrant individuals from Beekse Bergen Safari Park (BB) in the Netherlands. We also collected data on the social behavior of all individuals across years to examine whether any acoustic convergence may be related to social cohesion of the integrated group.

We predicted that if acoustic convergence in the food grunts of both subgroups occurred, it would coincide with the development of close social affiliations between subgroups []. Systematic preference data for apples were collected in all 3 years in order to explore the possibility that any changes in the acoustic structure of food grunts may simply reflect changes in preference value of the referents and accompanying arousal states [].

Post hoc GLMMs on the four acoustic variables used to generate the PCA were subsequently employed to investigate the nature of this acoustic change in the BB calls between 2010 and 2013. In 2010, the BB chimpanzees showed a significantly stronger preference for apples than the ED chimpanzees (Figure 2), and, in line with previous research [], the BB calls were higher in frequency than ED calls (Table 1). In 2013, the BB calls had a significantly lower first formant frequency and peak frequency than in 2010 (mean frequency of the first formant [F1]: LR test, χ² = 18.1, df = 1, p < 0.001; peak frequency: LR test, χ² = 8.6, df = 1, p = 0.003; Table S4) and tended to have a shorter intercall interval (LR test, χ² = 3.3, df = 1, p = 0.07; Table S4). No effect of year was found on the duration of BB calls (LR test, χ² = 0.02, df = 1, p = 0.88; Table S4). This pattern of acoustic change was independent of the BB individuals’ preferences for apples, which remained unchanged across years (Figure 2).
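As a rough cross-check of the likelihood ratio statistics quoted above, the upper-tail p-value of a chi-square statistic with one degree of freedom can be computed with the standard library alone, because for df = 1 it reduces to a complementary error function. The function name and dictionary below are illustrative and not part of the original analysis, which was run on fitted GLMMs; this sketch only converts the reported statistics to p-values.

```python
import math


def lr_test_p_value_df1(chi_square: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > x) = erfc(sqrt(x / 2)), so no statistics package is needed.
    """
    if chi_square < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


# Test statistics as reported in the text (BB calls, 2010 vs. 2013, df = 1).
reported = {
    "F1 mean frequency": 18.1,
    "peak frequency": 8.6,
    "intercall interval": 3.3,
    "call duration": 0.02,
}

p_values = {name: lr_test_p_value_df1(x) for name, x in reported.items()}

for name, p in p_values.items():
    print(f"{name}: chi-square = {reported[name]}, p = {p:.4f}")
```

The recomputed values should agree with the reported p-values up to rounding of the published statistics.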

Table 1 Mean Values from the Acoustic Parameters Extracted from the Grunts of Individuals in the ED and BB Groups in 2010 and 2013

Figure 2. Boxplot representing the median percentage of times apples were chosen in pairwise comparisons with five other foods by individuals of ED and BB groups in 2010 and 2013. Data for individuals who also contributed acoustic data from each group (ED = 6; BB = 7) are plotted.

Four acoustic measures extracted from food grunts recorded over the 3 years (2010, 2011, and 2013; see Supplemental Information and Table S2) were entered into a principal-component analysis (PCA), in which the first principal component (PC1) explained the largest share of the total variation: 47%, compared to 26%, 20%, and 7% for components 2–4, respectively. We therefore focused on PC1 for the remaining modeling analyses (see Supplemental Information and Table S3 for loading values of the individual acoustic parameters). Generalized linear mixed-effects models (GLMMs) demonstrated that there was a significant interaction between group (BB or ED) and year on PC1 (likelihood ratio [LR] test: χ² = 4.98, df = 1, p = 0.025), suggesting that the effect of year on PC1 differed significantly between the groups. Figure 1 indicates the direction of the interaction, with BB individuals (dotted line) converging on PC1 of ED (solid line), which remained stable across years. Assessment of 95% confidence intervals (CIs) indicated that the acoustic structures of food grunts from the two groups were significantly different in 2010 and 2011 (CIs exclude zero; Table S4) but were no longer significantly different in 2013 and thus had converged (CIs include zero; Table S4).

Figure 1. The interaction between year (2010, 2011, 2013) and group (ED, BB) on PC1. Points represent raw data, and lines represent model predictions derived from the GLMMs (dashed lines represent BB; solid lines represent ED).

Using an association index as a measure of social relationships (see Supplemental Information), social network analyses (SNAs) showed that in 2010 and 2011, BB and ED individuals had formed two distinct social subgroups, with maximum modularities of 40% and 30%, respectively []. In contrast, in 2013, the SNA showed a maximum modularity of 17% (Figure S1), which is below the 30% threshold for a significant division of data into subgroups []. This indicates that, by 2013, the distinct social subgroups identified in 2010 and 2011 by Schel et al. [] no longer existed, and the group was fully integrated (Figure 3).
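To illustrate how a modularity value of this kind arises, the sketch below computes Newman's modularity Q for a weighted, undirected association network and a given assignment of individuals to subgroups. This is an assumption-laden toy: the individuals, association weights, and function name are hypothetical, and the code scores one fixed partition (original group membership) rather than searching for the maximum modularity over all partitions, as the reported analysis does.

```python
from collections import defaultdict
from itertools import combinations


def modularity(edges, community_of):
    """Newman's modularity Q for a fixed partition of a weighted network.

    edges: iterable of (individual_a, individual_b, association_weight)
    community_of: dict mapping each individual to its subgroup label
    Q = sum over subgroups c of [ W_c / m - (S_c / 2m)^2 ], where m is the
    total edge weight, W_c the weight inside c, and S_c the summed strength of c.
    """
    edges = list(edges)
    total_weight = sum(w for _, _, w in edges)
    if total_weight <= 0:
        raise ValueError("the network needs positive total association weight")

    within = defaultdict(float)    # W_c: weight of ties inside each subgroup
    strength = defaultdict(float)  # S_c: summed strength of each subgroup
    for a, b, w in edges:
        strength[community_of[a]] += w
        strength[community_of[b]] += w
        if community_of[a] == community_of[b]:
            within[community_of[a]] += w

    return sum(
        within[c] / total_weight - (strength[c] / (2.0 * total_weight)) ** 2
        for c in strength
    )


# Hypothetical individuals: three residents and three immigrants.
members = ["ed1", "ed2", "ed3", "bb1", "bb2", "bb3"]
group = {name: name[:2] for name in members}

# Toy network 1: individuals associate only within their original group.
separated = [(a, b, 1.0) for a, b in combinations(members, 2) if group[a] == group[b]]
# Toy network 2: everyone associates equally with everyone else.
mixed = [(a, b, 1.0) for a, b in combinations(members, 2)]

print(f"separated: Q = {modularity(separated, group):.2f}")
print(f"mixed:     Q = {modularity(mixed, group):.2f}")
```

In the toy data, the fully separated network scores well above the 0.3 (30%) threshold mentioned in the text, while the evenly mixed network scores below it, which mirrors the qualitative shift from distinct subgroups to an integrated group.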

Figure 3. (A–C) Sociograms illustrating association patterns between July 2010–December 2010 (A), April 2011–October 2011 (B) (data from Schel et al. []), and April 2013–July 2013 (C). ED individuals are in dark gray; BB individuals are in light gray. Males are shown as squares, and females are shown as circles. The size of each node scales with the eigenvector centrality index for that individual. The thickness of each line scales with the strength of association. All relationships weaker than the community mean (A + B = 0.05, C = 0.15) were removed for clarity.

Changes in Acoustic Structure Are Not a Result of Changes in Food Preferences

Preference tests demonstrated that the perceived value of apples to ED and BB chimpanzees remained stable across years. BB chimpanzees chose apples over other foods significantly more often than ED chimpanzees in both 2010 (Mann-Whitney U test = 3.50, z = −2.58, p = 0.01; Figure 2) and 2013 (Mann-Whitney U test = 2.00, z = −2.76, p = 0.004; Figure 2). Moreover, we found no significant difference in preference value of apples between 2010 and 2013 for either ED (Wilcoxon signed-ranks test z = −1.633, p = 0.25, r = −0.66; Figure 2) or BB (Wilcoxon signed-ranks test z = −2.10, p = 0.875, r = −0.79; Figure 2) individuals.
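The effect sizes reported with the Wilcoxon signed-ranks tests above appear consistent with the common convention r = z / √N. The sketch below applies that formula under the assumption, not stated in the text, that N is the number of individuals contributing data in each group (ED = 6, BB = 7); the function name is illustrative.

```python
import math


def effect_size_r(z: float, n: int) -> float:
    """Effect size r = z / sqrt(N) for a rank-based test statistic."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z / math.sqrt(n)


# z values as reported in the text; group sizes are an assumption (see lead-in).
r_ed = effect_size_r(-1.633, 6)
r_bb = effect_size_r(-2.10, 7)

print(f"ED: r = {r_ed:.3f}")
print(f"BB: r = {r_bb:.3f}")
```

Both results fall within rounding distance of the reported values (−0.66 and −0.79), which supports reading N as the number of individuals rather than the number of observations.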

Conclusions

By comparing the acoustic structure of food grunts given to a specific food type (apples) from two different chimpanzee groups before and after social integration, we demonstrate that these vocalizations are not acoustically fixed and can be modulated independently of the preference value of the referent. Specifically, by 2013, the immigrant BB subgroup had converged on the acoustic structure of food grunts for apples produced by ED individuals (Figures 1 and S2). Furthermore, mere exposure to the new group’s different calls for apples was insufficient to trigger convergence: after 1 year of living together, there was no evidence of convergence. It was only in 2013, when social integration was complete and strong social relationships had formed between members of the original subgroups (Figure 3), that BB individuals produced calls similar in structure to those of ED individuals. These results indicate that adult chimpanzees are both able and motivated to modify the acoustic structure of their referential food grunts to match those of unrelated close social partners. Previous observational studies have shown similar learning-based acoustic modulation in the social calls of primates [] and non-primates []. However, to our knowledge, this is the first example of vocal learning from conspecifics having an influence on the structure of functionally referential vocalizations of any non-human species []. This challenges long-held assumptions that, unlike human referential words, functionally referential primate calls cannot be decoupled from the arousal state experienced by the signaler and are completely fixed in their acoustic structure [].

The acoustic convergence was asymmetrical in its nature, with BB individuals converging on the acoustic structure of calls produced by the resident ED chimpanzees. Several potential factors could account for this effect. First, it is possible that rather than a generalized network of convergence, BB individuals were simply converging on the vocalizations of the dominant male, an ED individual. However, this would be inconsistent with previous acoustic convergence work suggesting that subordinate chimpanzees were equally likely to converge on the pant-hoots of dominant and subordinate males []. An alternative is that conformity mechanisms may have motivated the immigrants to adopt the vocal norms of the host group. Although such mechanisms have not yet been confirmed for the learning of primate vocalizations, studies of wild vervet monkeys have shown immigrant males conforming to the normative foraging habits of their host group [].

Although the chimpanzee subgroups did eventually converge their food grunts, this took approximately 3 years. The slow rate of adoption of the “local” vocal label for apples by the BB individuals can be explained by the long time period it took for close affiliative relationships to emerge between the subgroups (the integration of two adult, mixed-sex groups would not occur in the wild). In the future, where neighboring wild communities are studied, tracking changes in the acoustic structure of food grunts produced by immigrant females before and after emigration from their natal group would add valuable insights into the time course of vocal learning in a natural setting.

Determining which factors motivate vocal learning of functionally referential calls in non-human primates may also help in understanding what adaptive benefit this ability confers. Given that subtle variation in call structure can transfer different information to listeners [], it may be that vocal learning facilitates effective communication through ensuring that signalers converge on the same acoustic structure for a particular external event or context. One experimental means of examining this would be to carry out playback experiments before and after an immigrant converges on their host group’s calls. If receivers are only capable of decoding the referent of a call after convergence has taken place, this would suggest that the purpose of vocal learning is to facilitate communication. However, if individuals are able to decode the referent of a call prior to acoustic convergence, this might suggest that convergence primarily serves a social function, as has been suggested by previous studies for convergence in non-referential contact calls []. Until further experimental evidence is available, it seems most parsimonious to assume social facilitation motivated the observed acoustic changes.

Our study demonstrates that chimpanzees have some control over the acoustic structure of their functionally referential vocalizations and that they are motivated to deploy this ability to make their calls more similar to those of close social partners. Although the adaptive benefits of such behavior need further investigation, the salient finding is that functionally referential food calls are not rigidly coupled to the arousal state of signalers and are open to vocal learning processes. This suggests that such referential calls share more characteristics with referential words in humans than previously thought and indicates the cognitive building blocks underlying the flexible, socially mediated learning of referential words in language may be evolutionarily older than previously thought.