Subjects and housing

We used 16 sub-adult captive ravens housed in two separate social groups of eight birds each at the Haidlhof Research Station, Bad Vöslau, Austria. Both groups contained male and female peers (group 1: 3 males, 5 females; group 2: 4 males, 4 females). For a description of each individual (for example, age, rank and raising history) see Supplementary Table 1. Both groups were kept in adjacent parts of a large aviary complex (compartments A and B; Fig. 5a) for 9 months, with full visual and auditory access to the other group. Before the experiments, during a 1-month period, each group was trained to temporarily use another part of the complex: birds of group 1, which were normally housed in part B, could move to part C; birds of group 2, which normally used part A, could move to part B when group 1 was in C (Fig. 5b). This procedure allowed us to familiarize birds of both groups with the middle compartment B, which was subsequently used for testing. All aviary parts are enriched with trees, perches, playing devices and shallow pools for bathing. The middle compartment B is subdivided into two same-sized parts (B-I, B-II) by wire mesh panels with sliding doors and an opaque observation hut (2.5 × 2.5 m²). On experimental days, the birds received their normal diet consisting of meat, milk products, bread, vegetables and fruits twice a day. Water was available ad libitum.

Figure 5: Schematic representation of the set-up of the aviaries. Aviaries A (18 × 10 × 5 m³), B (15 × 15 × 5 m³) and C (8 × 10 × 5 m³), housing group 1 (orange) and group 2 (yellow) during the different phases (a–c) of the experiment. The black dot represents an example of an animal in a test, the sound logo the place of the speaker from which the playback was played and the camera logo the respective place of the cameras that filmed this bird.

Ethical note

The ravens of group 1 originated from captive breeding pairs in zoos (Alpenzoo Innsbruck, Austria; Zoo Wels, Austria; and Nationalpark Bayrischer Wald, Germany) and a private owner (K Trella, Austria); those of group 2 originated from captive breeding pairs at the Konrad Lorenz Forschungsstelle in Grünau, Austria and from Lund University, Sweden. The study complied with Austrian law and local government guidelines (§ 2. Federal Law Gazette number 501/1989). It received oversight from the internal behavioural research group at the Faculty of Life Sciences, University of Vienna, and was authorized owing to its non-invasive character. The study subjects remained in captivity at Haidlhof Research Station after the completion of this study for further research.

Experimental design and set-up

Experiments started after all birds were comfortable with a short individual separation in the middle compartment B, while their conspecifics remained in parts A and C. For testing, the focal subject was called into either subdivision B-I or B-II, that is, into the half closer to A or C, respectively; the loudspeaker used for playing back the stimuli was hidden in the opposite subdivision, always behind the wooden hut. Specifically, the loudspeaker’s position was such that the direction of the played-back stimuli was congruent with the current position of the group from which those stimuli could come: if the focal subject was positioned in B-I, it was tested with stimuli of group 1 from the direction of C; if it was positioned in B-II, it was tested with stimuli of group 2 from the direction of A (Fig. 5c).

Each playback contained three vocal interactions of the same individuals, each separated by 1 min. Stimuli were played from a loudspeaker (LD systems Roadboy 65, flat frequency response 80 Hz–15 kHz) connected to a MacBook Pro through a wireless system (Sennheiser EK 2000, flat frequency response 25 Hz–20 kHz). Loudness was adjusted to match the natural sound pressure level of submissive vocalizations. The actual playback loudness at the receiver varied depending on the focal bird’s position in the aviary and the weather conditions. To hinder social learning and/or disruption of established hierarchies, the test playbacks were masked for all other animals using synchronized white-noise playbacks from two loudspeakers (LD systems Roadboy 65, flat frequency response 80 Hz–15 kHz), one directed at each group. All loudspeakers were visually occluded for all animals.

Conditions

Focal individuals were subjected to playbacks of vocal interactions (see acoustic information below) of two other birds in an order consistent with the group’s dominance hierarchy (expected condition) and in an order inconsistent with the group’s dominance hierarchy, that is, mimicking a rank reversal (unexpected condition). Per testing day, the birds received two sessions: one with playbacks of individuals of their own sex and one with playbacks of individuals of the other sex. In addition, animals were tested twice: once with playbacks of group members (both males and females) and once with playbacks of members of the other group (again both males and females). Consequently, all birds were tested in four conditions per in/out-group; that is, two control (expected) and two corresponding test (unexpected) conditions. The order of expected versus unexpected was counterbalanced over the tested birds within each session, the order of the played-back sexes was counterbalanced over the tested birds over the two sessions per day and the order of in- or out-group playbacks was counterbalanced over the tested birds over the two testing days. For a schematic representation of all conditions please see Supplementary Table 2.

Testing lasted roughly an hour per day: after a 15-min habituation period, we played back the first stimulus to the subject (session 1, for example, own sex/congruent), and after another 15 min, we played back the second stimulus (session 2, for example, own sex/incongruent), followed by 15 min of post-playback observations. For the entire period, the behaviour of the focal subject was videotaped (using two Canon LEGRIA HD-camcorders). Models (that is, those individuals whose calls were played back) remained the same per focal subject; that is, the same models were used for the expected and the unexpected playback within each combination of familiarity (in-group or out-group) and sex (same-sexed or different-sexed birds). For an overview of which models were used for which subject, please see Supplementary Table 1.

Acoustic recordings and stimuli preparation

The playback consisted of two types of vocalizations: self-aggrandizing display (hereafter SAD) and submissive calls41. Ravens of both sexes show SADs accompanied by a dominant posture, as a directed dominance display, which is often followed by submissive calls (Supplementary Fig. 3), and submissive posture and retreat by the subordinate individual. Note that the combination of SADs and submissive calls determined the meaning of the interaction, that is, a mild conflict with a clear outcome. Ravens can also show SADs in a non-directional way, typically when they have temporarily left or are about to join the group. Acoustically, SADs can be highly variable between regions and individuals41 and a single individual may produce several distinct SAD types (personal observation). In our case, most birds within each group shared their vocal display repertoire regardless of their sex but varied in how frequently they used certain SAD types (Supplementary Fig. 4). To create the stimuli, we used the two predominant SAD types from each group (Supplementary Table 3).

We constructed the dyadic interaction stimuli using vocalizations of six birds (three males and three females of consecutive ranks) from each group. Each stimulus approximated a dyadic interaction of a dominant (SAD vocalization) and subordinate (submissive vocalization) individual. We used the most frequent SAD type for each bird (Supplementary Table 3). Only within-sex interactions were considered. For each sex and group, this resulted in four stimuli of one rank step (two congruent and two incongruent with the actual group hierarchy) and two stimuli of two rank steps. In total, we obtained 24 playback stimuli (two groups × two sexes × three individuals × two congruency conditions).
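The stimulus count above can be checked by enumeration. The following is a minimal sketch, assuming that the three birds per sex and group are labelled by hypothetical rank positions 1 to 3 (1 = highest) and that every within-sex pair is played once in the congruent and once in the incongruent direction; the field names are illustrative and not taken from the study.

```python
from itertools import combinations, product

GROUPS = ("group 1", "group 2")
SEXES = ("male", "female")
RANKS = (1, 2, 3)  # hypothetical labels for three consecutive ranks, 1 = highest


def build_stimulus_list():
    """Enumerate every within-sex dyadic stimulus for both groups."""
    stimuli = []
    for group, sex in product(GROUPS, SEXES):
        for high, low in combinations(RANKS, 2):
            steps = low - high  # rank distance between the two birds
            # Congruent: the higher-ranking bird gives the SAD.
            stimuli.append({"group": group, "sex": sex,
                            "sad_rank": high, "submissive_rank": low,
                            "congruent": True, "rank_steps": steps})
            # Incongruent: roles are swapped, mimicking a rank reversal.
            stimuli.append({"group": group, "sex": sex,
                            "sad_rank": low, "submissive_rank": high,
                            "congruent": False, "rank_steps": steps})
    return stimuli


if __name__ == "__main__":
    all_stimuli = build_stimulus_list()
    print(len(all_stimuli))
```

Under these assumptions each sex and group yields three dyads (two of one rank step, one of two rank steps), each in two directions, which gives the four one-step and two two-step stimuli described in the text.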

Acoustic recordings of SADs and submissive calls were obtained between February 2011 and June 2012 from various non-experimental situations. All calls were recorded with a Sennheiser K6/ME66 shotgun microphone connected to a Marantz PMD660/Zoom H4n digital recorder or a Canon LEGRIA HD-camcorder. Best quality recordings were individually extracted, high-pass filtered at 200 Hz and peak amplitude normalized. SADs were normalized at −10 dB levels of the submissive calls to approximate the natural loudness difference between the two call types. Submissive calls are usually produced in bouts, which include adjacent calls without pause. For better approximation of the natural call occurrence, submissive calls were extracted singly or as two immediately adjacent calls.
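As an illustration of the peak normalization step, the sketch below scales a list of samples so that its peak sits at a chosen level in dB relative to full scale (1.0). The function name, the full-scale reference and the way the 10 dB offset is applied are assumptions for illustration; the text does not specify how the offset was implemented in the audio software.

```python
def peak_normalize(samples, target_db=0.0):
    """Scale samples so the largest absolute value equals target_db (dB re 1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("cannot normalize a silent signal")
    # Convert the target level from dB to a linear amplitude, then to a gain.
    gain = 10 ** (target_db / 20) / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    call = [0.1, -0.5, 0.25]            # made-up sample values
    full = peak_normalize(call)          # peak at 0 dB
    lower = peak_normalize(call, -10.0)  # peak 10 dB below full scale
    print(max(abs(s) for s in full), max(abs(s) for s in lower))
```

A 10 dB difference in peak level corresponds to an amplitude ratio of 10^(-10/20), roughly 0.32.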

Each individual stimulus consisted of a bout of three SADs from individual I immediately followed by a bout of five to seven submissive calls from individual II followed again by a single SAD from individual I (Fig. 6 and Supplementary Audio 1). SADs were spaced 2±0.2 s and submissive calls <0.5 s apart. The number of submissive calls varied between five and seven depending on the length of the individual calls in the stimulus. All individual calls were used no more than once within one stimulus and no more than three times within one playback session. We prepared stimuli using PRAAT 5.2.46 (ref. 42) and Adobe Audition CS5.5 software packages for Mac OS X.

Figure 6: Example waveform of a playback stimulus. Playback stimulus simulating an interaction between a dominant bird giving a bout of three SADs (individual I) followed by a bout of submissive vocalizations from a subordinate bird (individual II) and followed again by one SAD from the dominant.

Measures and data analyses

Before the experiments, we analysed the dominance hierarchies in both groups. To this end, we provided the birds with a heap of food that could be monopolized by one individual and scored all unidirectional displacements38. We arranged these data in matrices with actors in rows and recipients in columns. We determined the dominance order most consistent with a linear hierarchy, calculating Landau’s linearity indices (h′) using MatMan 1.1 (ref. 43) and reordered matrices to best fit a linear hierarchy44,45. We found significantly linear hierarchies in both groups (group 1: h′=0.964, n=8, P<0.001, based on 342 interactions and with 0% unknown relationships; group 2: h′=0.774, n=8, P=0.015, based on 403 interactions and 3.57% unknown relationships).
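To make the linearity index concrete, the sketch below computes the uncorrected Landau index h from a displacement matrix with actors in rows and recipients in columns. It is a simplified illustration: it is not the corrected index h′ or the significance test as implemented in MatMan, and counting tied or unobserved dyads as half a dominated individual is an assumption made here.

```python
def landau_h(wins):
    """Uncorrected Landau linearity index h for a square displacement matrix.

    wins[i][j] is the number of times actor i displaced recipient j.
    """
    n = len(wins)
    if n < 2 or any(len(row) != n for row in wins):
        raise ValueError("wins must be a square matrix with n >= 2")
    dominated = []
    for i in range(n):
        v = 0.0
        for j in range(n):
            if i == j:
                continue
            if wins[i][j] > wins[j][i]:
                v += 1.0   # i dominates j
            elif wins[i][j] == wins[j][i]:
                v += 0.5   # assumption: tied or unobserved dyads count as half
        dominated.append(v)
    mean_v = (n - 1) / 2
    return 12.0 / (n ** 3 - n) * sum((v - mean_v) ** 2 for v in dominated)


if __name__ == "__main__":
    # Made-up example: a perfectly linear hierarchy of four individuals.
    linear = [[1 if i < j else 0 for j in range(4)] for i in range(4)]
    print(landau_h(linear))
```

The index ranges from 0 (every individual dominates the same number of others) to 1 (a perfectly linear hierarchy).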

Videos of the experiments were coded with Solomon coder46 by J.S., who was blind to the congruence of the playback and to the sex of the played-back individuals. Per playback, we coded 17 different behavioural variables (see Supplementary Table 4) during the 3.5 min of the playback (three playbacks of 10 s each, 2 min in total between the three playbacks and 1 min post playback) and during the 3.5 min before the playback. A subset of the playbacks (12.5%) was recoded by Kerstin Pölzl. We used Spearman’s ρ-correlations to calculate inter-rater reliability regarding durational behaviours. All durational measures were scored almost identically, with Spearman’s ρ-correlation coefficients ranging between 0.73 and 1, and P≤0.001. Inter-rater reliability regarding point behaviours was calculated using Cohen’s κ. The value of κ was 0.68, which corresponds to a good level of agreement (91.2% agreement).
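For readers unfamiliar with Cohen’s κ, the following minimal sketch shows how it corrects raw agreement for agreement expected by chance. The function name and the example codes are made up for illustration; they are not the study’s data or the software the authors used.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of events")
    n = len(rater_a)
    # Observed proportion of events on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    coder_1 = [1, 1, 0, 0]  # made-up codes: 1 = behaviour scored, 0 = not scored
    coder_2 = [1, 0, 0, 0]
    print(cohens_kappa(coder_1, coder_2))
```

Because κ discounts chance agreement, it can be considerably lower than the raw percentage of agreement. If the 91.2% figure is the raw observed agreement, a κ of 0.68 would correspond to a chance agreement of about 0.73.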

To reduce the number of response variables, we performed a principal component analysis (PCA) on all behaviours coded during and before the playback. Note that if different sets of behaviours occur together before and during the playback, combining the two periods might hinder the PCA. Subtracting the baseline behaviours before the playback from the behaviours found during the playback may lessen this problem. However, such a subtraction presumes an a priori difference between the phases, which would cause a problem for a subsequent PCA if this difference is absent owing to a large number of zeros in the data.

On the basis of eigenvalue (>1) and scree-plot investigation, we extracted five components that in total explained 53.4% of the overall variance of all data. On the basis of the variable loadings, the five components seem to reflect: 1, activity; 2, vocalization; 3, attention; 4, self-directed behaviour; and 5, ‘stress’ (Supplementary Table 4). Subsequently, we obtained individual component scores for the five PCA components using the regression method. These component scores have a mean of zero and a variance equal to the squared multiple correlation between the estimated and the true component values.

To assess whether individuals reacted differently to playbacks with an expected interaction versus playbacks with an unexpected interaction, we first calculated, per component, the difference between an individual’s component score during and before the playback, that is, playback−baseline (delta).
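The delta score is a simple per-component difference. A minimal sketch, assuming the component scores for one bird and one playback are held in dictionaries keyed by hypothetical component labels (the labels follow the interpretation given above; the numbers are invented):

```python
COMPONENTS = ("activity", "vocalization", "attention", "self-directed", "stress")


def delta_scores(playback, baseline):
    """Return playback minus baseline for each PCA component."""
    missing = [c for c in COMPONENTS if c not in playback or c not in baseline]
    if missing:
        raise KeyError(f"missing component scores: {missing}")
    return {c: playback[c] - baseline[c] for c in COMPONENTS}


if __name__ == "__main__":
    # Invented scores for illustration only.
    during = {"activity": 0.5, "vocalization": -0.25, "attention": 1.0,
              "self-directed": 0.0, "stress": 0.75}
    before = {"activity": 0.25, "vocalization": 0.25, "attention": 0.5,
              "self-directed": 0.5, "stress": 0.25}
    print(delta_scores(during, before))
```

A positive delta means the component score was higher during the playback than in the baseline period.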

Per component, we then used generalized linear mixed models (GLMMs) to assess the effect of condition (expected versus unexpected), sex of the subject, sex of the playback and age on the delta score. We ran separate analyses for the responses to in-group and to out-group stimuli. In these models, the delta of the component scores was the response variable, whereas condition, sex, sex of the playback and age were entered as fixed variables. Furthermore, as we dealt with repeated measures, we structured the data to reflect their nesting: observations were nested within individuals, which in turn were nested within one of the two groups. Consequently, we entered subject identity and group as random variables in our models. We ran models including all main effects and two-way interactions of sex and sex of the playback with condition, as well as several reduced models, and selected the best-fitting model using the Akaike Information Criterion. All reported P-values are two tailed, and we considered P≤0.05 to indicate a significant effect. Where appropriate, we ran post hoc analyses using Wilcoxon signed-rank tests.
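Model selection by the Akaike Information Criterion can be sketched as follows, using AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The model names, log-likelihoods and parameter counts below are hypothetical placeholders, not values from this study, and the sketch does not fit any mixed model itself.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def select_best_model(fits):
    """Return the name of the lowest-AIC model and all AIC values.

    fits maps a model name to a (log_likelihood, n_params) tuple.
    """
    if not fits:
        raise ValueError("no fitted models supplied")
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical fits, for illustration only.
    example_fits = {
        "full": (-40.0, 9),          # main effects plus two-way interactions
        "main_effects": (-41.0, 7),
        "condition_only": (-45.5, 4),
    }
    best, scores = select_best_model(example_fits)
    print(best, scores)
```

In this invented example the main-effects model is preferred: the full model fits slightly better, but not by enough to offset the penalty for its two extra parameters.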