Subjects and experimental apparatus

The study was carried out at Buttercups Sanctuary for Goats (http://www.buttercups.org.uk) in Kent, UK. At the sanctuary, goats are released into a large field during the day and are confined indoors at night in either individual or shared pens (average size = 3.5 m²). Goats have ad libitum access to hay, grass, and water and are also fed a commercial concentrate according to their health condition and age. In total, 24 adult goats (12 females and 12 castrated males) of different breeds and ages (Table 1) were tested from May to September 2015. An experimental arena (7 m × 5 m) was set up in one of the fields where the goats are released during the day. The arena consisted of a rectangular area composed of a start pen connected by a gate to a central arena made with a commercial opaque agricultural metal fence (Fig. 4). A loudspeaker was placed outside the perimeter of the arena, on the opposite side to the main gate. The speaker was concealed with camouflage netting and was not visible to the goats.

Table 1 Goats tested and experimental design. PNP indicates a Positive (habituation) - Negative (dishabituation) - Positive (rehabituation) sequence; NPN indicates a Negative (habituation) - Positive (dishabituation) - Negative (rehabituation) sequence. FEFR indicates sequences built with FEeding anticipation and feeding FRustration calls; FRFE indicates sequences built with feeding FRustration and FEeding anticipation calls; FEIS indicates sequences built with FEeding anticipation and ISolation calls; and ISFE indicates sequences built with ISolation and FEeding anticipation calls.

Fig. 4 Experimental enclosure. The experimental apparatus (7 m × 5 m) consisted of a start pen connected by a door to a central arena. The loudspeaker was placed at the far end of the arena (outside the perimeter) and was covered with hunting net and natural vegetation. The experimenter remained inside the start pen during the tests, out of view, behind a PVC garden screening fence.

Sound recordings

The vocalisations used in this study were obtained from a previous study [7] conducted at the same location. The calls selected belonged to goats that did not share a pen with the subjects during the night, or to goats that were no longer at the sanctuary at the time of testing. Calls were recorded at distances of 3–5 m from the focal animal using a Sennheiser MKH-70 directional microphone (frequency response 50–20,000 Hz; max SPL 124 dB at 1 kHz) connected to a Marantz PMD-660 digital recorder (sampling rate: 44.1 kHz with amplitude resolution of 16 bits in WAV format). Three different contexts inducing emotions were considered: 1) food anticipation (positive, high arousal), in which the goats, tested in pairs in two adjacent pens, learned to anticipate a food reward after three days of training and were recorded on the fourth day when the experimenter approached the tested goats with a bucket of food; 2) food frustration (negative, high arousal), in which only one of the goats in a pair received food from the experimenter, and the other one was recorded while its pair mate was eating; 3) isolation (negative, low arousal), in which the tested goats were recorded while isolated in a pen alone for 5 min away from the other goats but within their usual daytime range, after 3 days of habituation to this situation. The changes in the behaviour and physiology of the subjects in these three contexts were examined. The arousal and valence of each recording context were validated using physiological and behavioural indicators of emotions [7]. Food anticipation and food frustration induced higher arousal compared to isolation. Food anticipation and food frustration were also associated with lower heart-rate variability, higher respiration rate, more movements, more calls, more time spent with ears pointing forwards and less time with ears on the side. In the food anticipation condition, goats had their ears oriented backwards less often and spent more time with their tails up compared to the food frustration and isolation conditions [7]. The detailed vocal parameter analysis identified six acoustic parameters affected by arousal. F0 contour over time and energy quartile increased with arousal, whereas the first formant decreased. F0 variation within the call was influenced by valence and decreased from negative to positive valence. The acoustic structure of the calls is described in more detail in Briefer et al. [7].

Playback experiments and exclusion criteria

The habituation-dishabituation-rehabituation paradigm (modified from Charlton et al., [37, 48, 49]) was used to investigate whether goats are able to perceive conspecific vocal expression of emotional valence. The paradigm is based on the repeated presentation of a stimulus, for example a positive call produced while a goat was experiencing a given emotional valence, to a subject (habituation), followed by the presentation of a different stimulus [dishabituation; in our case, calls produced while a goat was experiencing a situation with emotional valence opposite to the situation used during the habituation phase (e.g. negative)]. The response (behavioural and/or physiological) of the subject should indicate whether the element that distinguishes the two stimuli (in our case, change in valence) is conspicuous enough to be detected. A reduction in the response of the subject (habituation) after a repeated presentation of the stimulus, followed by an increment in the response when a new stimulus is presented (dishabituation) would indicate that the two stimuli are perceived as different [48, 49, 71]. After the dishabituation, the stimulus used in the habituation is presented again (rehabituation), in order to ensure that the response occurring during the dishabituation is robust and not a random consequence of a renewal of attention [48, 49].

Twenty-four sessions (six goats in total; playback sequences played: FEFR = 5, FRFE = 7, FEIS = 8, ISFE = 4) were excluded from the final analysis because: 1) subjects did not react to the first habituation call, i.e. individuals did not look towards the source of the playback during the first call of habituation, and/or 2) subjects failed to habituate, defined as sessions where the time spent looking towards the speaker during the last playback of the habituation phase was more than twice as long as during the first playback of the habituation phase [37].
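
The two exclusion criteria above can be expressed as a simple check on the looking durations recorded during the habituation phase. The following is a minimal sketch only; the function name and the list-based input are illustrative assumptions, and "did not look" is assumed here to correspond to a looking duration of zero.

```python
def should_exclude(habituation_looks):
    """Return True if a session meets either exclusion criterion.

    habituation_looks: looking durations (in seconds) for the habituation
    calls, in playback order (hypothetical input format).
    """
    first = habituation_looks[0]
    last = habituation_looks[-1]
    # Criterion 1: no reaction to the first habituation call
    # (assumed to mean a looking duration of zero).
    no_reaction = first == 0
    # Criterion 2: failure to habituate, i.e. looking during the last
    # habituation call lasted more than twice as long as during the first.
    no_habituation = last > 2 * first
    return no_reaction or no_habituation
```

For example, a session with looking durations that fall from 5 s on the first call to 1 s on the last would be retained, whereas one rising from 2 s to 5 s would be excluded.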

Playback sequence and procedure

Each playback sequence consisted of 13 calls, separated by a time interval of 20 s. Only good quality calls with low background noise were selected to prepare the playback sequences as follows: three calls per individual with a signal-to-noise ratio > 10 dB were selected from eight individuals in the food anticipation context, from six individuals in the food frustration context and from five individuals in the isolation context (i.e. 57 calls in total) within the original pool of 180 calls (i.e. 40 calls in food anticipation, 80 calls in food frustration and 60 calls in isolation; [7]). In order to test if the valence of the calls was perceived regardless of context (two contexts of negative valence: frustration and isolation) and order (i.e. which valence was used for the habituation or dishabituation phase), the sequences included the following combinations of valence and context: six sequences included food anticipation (habituation) - food frustration (dishabituation) - food anticipation (rehabituation) calls (hereafter, “FEFR”); six sequences included food frustration (habituation) - food anticipation (dishabituation) - food frustration (rehabituation) calls (hereafter, “FRFE”); five sequences included food anticipation (habituation) - isolation (dishabituation) - food anticipation (rehabituation) calls (hereafter, “FEIS”); and five sequences included isolation (habituation) - food anticipation (dishabituation) - isolation (rehabituation) calls (hereafter, “ISFE”).

Calls within the sequence were emitted by the same individual, but were produced in two different emotional contexts. The first nine calls (three different calls produced in a given context – food anticipation, food frustration or isolation - repeated three times each and combined in random order) constituted the habituation phase (H); the following three calls (three different calls produced in a context of opposite valence compared to the habituation calls, and combined in a random order) constituted the dishabituation phase (D); and the final call (a single call randomly selected from the habituation phase) constituted the rehabituation phase (R).
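
The structure of a 13-call sequence described above can be sketched in code as follows. This is an illustrative reconstruction, not the procedure actually used to prepare the playback files: the function name, the use of string labels for calls and the use of Python's `random` module are all assumptions, and the text does not state whether any constraints (e.g. on consecutive repeats) were applied to the random order.

```python
import random


def build_sequence(habituation_calls, dishabituation_calls, rng=None):
    """Build a 13-call playback sequence (9 H + 3 D + 1 R).

    habituation_calls, dishabituation_calls: three call labels each
    (hypothetical identifiers), from the same caller but opposite valence.
    """
    rng = rng or random.Random()
    if len(habituation_calls) != 3 or len(dishabituation_calls) != 3:
        raise ValueError("expected three calls per phase")
    # Habituation: each of the three calls repeated three times, shuffled.
    habituation = list(habituation_calls) * 3
    rng.shuffle(habituation)
    # Dishabituation: three different calls of opposite valence, shuffled.
    dishabituation = list(dishabituation_calls)
    rng.shuffle(dishabituation)
    # Rehabituation: a single call drawn at random from the habituation calls.
    rehabituation = [rng.choice(list(habituation_calls))]
    return habituation + dishabituation + rehabituation
```

Passing a seeded generator, e.g. `build_sequence(["fe1", "fe2", "fe3"], ["fr1", "fr2", "fr3"], random.Random(1))`, makes the order reproducible.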

Each vocalisation was broadcast from a Mackie Thump TH-12A loudspeaker (LOUD Technologies Inc., Woodinville, WA; frequency response: 57 Hz - 20 kHz ± 3 dB) connected to an active box to boost the sound (Active Box DI-100 Fame) and to an audio player (Technika MP111), at an approximately natural amplitude (88.99 ± 0.93 dB) measured at 1 m using an ASL-8851 sound level meter. The original duration of the calls was maintained, in order not to remove any information contained in their structure (feeding = 0.71 ± 0.02 s; frustration = 0.70 ± 0.03 s and isolation = 0.71 ± 0.02 s). The peak amplitude of each call had been equalised during the preparation of the sequences. The presentation order of the playback sequences was balanced within each group of 12 subjects (tested on the same day), so that half of the subjects first experienced the Positive - Negative - Positive (PNP) sequence and then the opposite Negative - Positive - Negative (NPN) sequence in the following session. The other half of the group experienced NPN first and PNP in the following session. The sex of the goat that produced the calls used in the playback sequence was counterbalanced within and between subjects (half the males and half the females were tested with same-sex playbacks and the other half with opposite-sex playbacks). Overall, each subject was tested on two different days, with one session per day, and a three-day interval between sessions.

Before the experiment started, goats were released twice (i.e. once on each of two consecutive days) for 5 min inside the arena to familiarise themselves with the experimental setup. During the test phase, individuals were gently brought to the start pen, where a familiar experimenter placed the heart rate monitor BioHarness belt around the goat’s thorax. When a clear electrocardiogram (ECG) was obtained, the main gate that provided access to the central arena was opened. After 30 s, the first playback call was played and the session continued until the last call was played.

Behavioural and physiological data collection and analyses

The duration of looking towards the speaker was measured and defined as the time from when the subject directed the head towards the playback location (start) until when the head was turned away and the animal stopped looking (end), within the 20 s following each call. If the subjects were already looking towards the speaker when one of the calls of a sequence was broadcast, then this behaviour was considered to begin at the onset of the playback [48]. When the goat looked away and then looked back to the speaker within the 20 s following each call, the time was scored again. The total duration of looking towards the sound source was calculated for each subject and for each of the 13 calls. All trials were video recorded using a digital video camera placed at the entrance of the arena (Sony HDR-CX190E). The videos were analysed frame by frame using QuickTime player (Apple Inc.). A second observer, blind to the experimental hypothesis, scored 30% of the sessions to test the reliability of the parameters measured by the two observers. Inter-observer agreement for the behaviour scored was high (Spearman rank correlation; r_s = 0.990, p < 0.001).
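
The scoring rule for looking duration can be illustrated with a short sketch. It assumes, hypothetically, that each looking bout is stored as a (start, end) pair in seconds relative to call onset, so that a bout already under way when the call is played has a negative start time; the function and variable names are illustrative and not taken from the study.

```python
WINDOW_S = 20.0  # scoring window following each call, in seconds


def total_looking_time(bouts, window=WINDOW_S):
    """Sum looking bouts that fall within the window after call onset.

    bouts: iterable of (start, end) times in seconds relative to call
    onset (hypothetical format).
    """
    total = 0.0
    for start, end in bouts:
        # A bout already in progress is counted from playback onset (t = 0).
        clipped_start = max(start, 0.0)
        # Looking beyond the end of the window is not counted.
        clipped_end = min(end, window)
        if clipped_end > clipped_start:
            total += clipped_end - clipped_start
    return total
```

With bouts `[(-2.0, 3.0), (10.0, 12.5)]`, the first bout contributes 3.0 s (counted from onset) and the second 2.5 s, giving a total of 5.5 s for that call.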

The physiological parameters were recorded using a non-invasive Bluetooth device (EC38 Type 3, BioHarness Physiology Monitoring System, Zephyr Technology Corporation, Annapolis, MD, USA) fixed to a belt placed around the goat’s chest. A small patch of hair (7 cm × 15 cm) was clipped before the experiment in order to obtain a clearer ECG trace. This procedure took place a week before the testing to avoid any confounding effects of being manipulated. The continuous ECG trace was transmitted in real time to a laptop (ASUS S200E) and registered using the software AcqKnowledge v.4.4 (BIOPAC System Inc.). During the playbacks, we entered visible markers in the ECG trace at the beginning of each call to be able to link the physiological data to the specific calls and phases of the experiments. The time of occurrence of each heart beat identified on the ECG trace was extracted during the 20 s following each call. HR and HRV (measured as root mean square of successive inter-beat interval differences, RMSSD) were further calculated from the extracted heart beats on the longest selection possible within 20 s.
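
As an illustration of how HR and RMSSD follow from the extracted beat times, a minimal sketch is given below. It is not the AcqKnowledge workflow used in the study; the function name is hypothetical, beat times are assumed to be in seconds, and reporting HR in beats per minute and RMSSD in milliseconds is an assumption about units.

```python
import math


def hr_and_rmssd(beat_times):
    """Return (mean HR in beats/min, RMSSD in ms) from beat times in seconds.

    beat_times: ascending times of successive heart beats within the
    selected portion of the trace (hypothetical input format).
    """
    # Inter-beat intervals (IBIs) between consecutive beats, in seconds.
    ibis = [b - a for a, b in zip(beat_times, beat_times[1:])]
    if len(ibis) < 2:
        raise ValueError("at least three beats are needed for RMSSD")
    # Mean heart rate from the mean inter-beat interval.
    hr = 60.0 / (sum(ibis) / len(ibis))
    # Successive differences between adjacent IBIs, converted to ms.
    diffs = [(b - a) * 1000.0 for a, b in zip(ibis, ibis[1:])]
    # Root mean square of the successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return hr, rmssd
```

A perfectly regular trace with beats every 0.5 s gives an HR of 120 beats/min and an RMSSD of 0, since successive inter-beat intervals do not differ.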

Data analysis

Analyses were conducted using Linear and Generalised Mixed-Effects Models (lmer function, lme4 library; Pinheiro 2000) in R v.3.2.2 [72, 73]. First, the occurrence of looking towards the speaker, HR and RMSSD were compared over the nine calls played during the habituation phase (H1-H9) to determine whether goats habituated to the sounds throughout this phase (indicated by a significant decrease in occurrence of looking and in HR throughout the phase). Subsequently, responses to the last habituation call (H9) were compared to those of the first dishabituation call (D10). Responses were also compared between dishabituation calls (D10 vs D11, and D11 vs D12) to investigate the response pattern within the dishabituation phase. Finally, responses to the dishabituation calls (D10, D11, and D12) were compared to those of the rehabituation call (R13). The variables included in the models as fixed effects were call number (1 to 13; or a combination of these for further post-hoc tests) and call valence (positive or negative), as well as their interaction. The duration of the measurement period (9.34 ± 0.17 s) was also included as a control factor in the model carried out on RMSSD, because it could potentially affect this value. The factor “Session” [1 and 2] nested within the identity of the goats (“ID”) nested within “Group” [1 and 2] was included as a random factor, crossed with the identity and the sex of the goat producing the playback calls. Non-significant interactions between call number and valence were removed from the models [74]. The statistical significance of the factors was assessed by comparing the models with and without the factor included using a likelihood-ratio test. When an interaction effect was found, further post-hoc comparisons were performed using a Tukey HSD test.
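
The likelihood-ratio comparison of models with and without a given factor can be illustrated as follows. This sketch is written in Python rather than R and is not the analysis code used in the study. It assumes that the log-likelihoods of the two fitted models are already available and that the factor being tested contributes a single parameter (one degree of freedom, as for a two-level factor such as valence); the function name is hypothetical.

```python
import math


def likelihood_ratio_test_df1(loglik_full, loglik_reduced):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (test statistic, p-value). Valid only under the assumption
    of a single degree of freedom.
    """
    # Test statistic: twice the difference in log-likelihood.
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For a chi-squared distribution with 1 df, the upper-tail
    # probability is P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

For instance, a difference in log-likelihood of about 1.92 gives a statistic of about 3.84 and a p-value of about 0.05. Factors with more than two levels, such as call number, involve more degrees of freedom and would require the general chi-squared distribution instead.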

Q–Q plots and scatterplots of the residuals of the model were checked visually for normal distribution and homoscedasticity. In order to meet the model assumptions, HR was log-transformed. HR (log-transformed) and RMSSD were input into LMMs fit with Gaussian family distribution and identity link function. The occurrence of looking towards the speaker did not meet the assumptions despite log-transformation. It was thus transformed to binary data (looked at the speaker = 1; did not look = 0) and input into a GLMM fit with binomial family distribution and logit link function.
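
The two transformations described above are simple to state in code. The sketch below is illustrative only: the function name and list inputs are hypothetical, the natural logarithm is assumed for the HR transformation (the base is not specified in the text), and any looking duration greater than zero is assumed to count as "looked at the speaker".

```python
import math


def prepare_responses(looking_durations, heart_rates):
    """Return (binary looking scores, log-transformed heart rates)."""
    # Looked at the speaker = 1; did not look = 0.
    looked = [1 if duration > 0 else 0 for duration in looking_durations]
    # Log-transform HR (natural log assumed) to meet model assumptions.
    log_hr = [math.log(hr) for hr in heart_rates]
    return looked, log_hr
```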