These results are in line with another study showing an MMN-like response in a single chimpanzee (Pan troglodytes) [16] using the same two-tone oddball paradigm with scalp-recorded EEG. Together with the current experiment, these studies provide evidence that ERPs and the MMN can be measured in both monkeys and apes.

The results show that physically identical deviant and standard stimuli elicited different responses. The average amplitude of the responses for both monkeys tended to be large in the frontal and central areas, similar to a human MMN [23]. Table 1 shows the mean amplitudes for monkey A and monkey Y, for each condition, stimulus type and electrode position. There was no indication of hemispheric differences.

An MMN-like response was found for the deviant responses as compared to physically identical standards in a time window centered on the absolute maximum of the difference waves (D 500 –S 500, D 1500 –S 1500; see Table 1 and gray-shaded windows in Figure 3).
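The window-based measurement described above can be illustrated with a minimal sketch. It assumes averaged deviant and standard waveforms sampled on a common time axis and a 50 ms window; the window width, the function name, the sampling step, and the toy waveforms are all hypothetical and are not taken from the original analysis.

```python
def window_mean_amplitudes(times_ms, deviant, standard, width_ms=50.0):
    """Mean amplitudes in a window centered on the absolute maximum of D - S."""
    if not (len(times_ms) == len(deviant) == len(standard)):
        raise ValueError("all inputs must have the same length")
    # Difference wave: deviant minus physically identical standard.
    diff = [d - s for d, s in zip(deviant, standard)]
    # Index of the largest absolute deflection of the difference wave.
    peak_index = max(range(len(diff)), key=lambda i: abs(diff[i]))
    peak_ms = times_ms[peak_index]
    # Window of the assumed width, centered on the peak latency.
    lo, hi = peak_ms - width_ms / 2, peak_ms + width_ms / 2
    idx = [i for i, t in enumerate(times_ms) if lo <= t <= hi]
    mean_d = sum(deviant[i] for i in idx) / len(idx)
    mean_s = sum(standard[i] for i in idx) / len(idx)
    return (lo, hi), mean_d, mean_s


# Toy data (hypothetical): 10 ms sampling, flat standard, deviant dips around 90 ms.
times = [float(t) for t in range(0, 200, 10)]
standard = [0.0] * len(times)
deviant = [-10.0 if t == 90.0 else (-4.0 if abs(t - 90.0) <= 20 else 0.0)
           for t in times]
window, mean_d, mean_s = window_mean_amplitudes(times, deviant, standard)
```

The two window means are what would then enter the statistical comparison; the per-trial handling used in the actual study is not specified here and is left out of the sketch.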

For monkey Y the mean negative amplitude for deviant stimuli was significantly greater than that for standard stimuli. An ANOVA with the same factors revealed significant main effects in Type (F (1, 78644) = 206.474, P<0.001, η2 = 0.003) and Electrode (F (2.336, 163892.2) = 181.928, P<0.0001, η2 = 0.002, ε = 0.584). All interactions involving the Electrode factor were significant, namely Electrode × Stimulus (F (2.336, 163892.2) = 3.543, P<0.05, η2 = 0.002, ε = 0.584), Electrode × Type (F (2.336, 163892.2) = 35.920, P<0.0001, η2<0.001, ε = 0.584) and Electrode × Stimulus × Type (F (2.336, 163892.2) = 6.034, P<0.0001, η2<0.001, ε = 0.584). Tukey unequal-N HSD post-hoc tests revealed no difference between F3 and F4 and no effect of Type at Pz.

For monkey A the ANOVA with factors Stimulus (500 Hz vs. 1500 Hz) × Type (Deviant vs. Standard) × Electrode (Fz vs. Cz vs. Pz vs. F3 vs. F4) revealed significant main effects in Type (F (1, 47395) = 104.555, P<0.001, η2 = 0.002), Stimulus (F (1, 47395) = 12.045, P<0.001, η2<0.001) and Electrode (F (3.202, 151750.6) = 151.684, P<0.0001, η2 = 0.002, ε = 0.800) as well as a Stimulus × Type interaction (F (1, 47395) = 31.476, P<0.001, η2 = 0.003). All interactions involving the Electrode factor were significant, namely Electrode × Stimulus (F (3.202, 151750.6) = 2.723, P<0.05, η2 = 0.002, ε = 0.800), Electrode × Type (F (3.202, 151750.6) = 51.294, P<0.0001, η2<0.001, ε = 0.800) and Electrode × Stimulus × Type (F (3.202, 151750.6) = 3.113, P<0.01, η2<0.05, ε = 0.800). Tukey unequal-N HSD post-hoc tests revealed no significant difference between F3 and F4 and no effect of Type at Pz. Additionally, the effect of Type was only marginally significant (df = 47395, P = 0.066) for the 1500 Hz stimuli.
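The fractional degrees of freedom reported for the Electrode effects are consistent with a sphericity correction in which both degrees of freedom are multiplied by ε. The sketch below assumes a Greenhouse-Geisser-style correction (the text reports ε but does not name the method) and uses a hypothetical helper name.

```python
def corrected_df(n_levels, error_df_between, epsilon):
    """Scale within-subject degrees of freedom by a sphericity epsilon.

    Assumption: uncorrected df are (k - 1) for the effect and
    (k - 1) * error_df_between for the error term, with k levels.
    """
    df_effect = (n_levels - 1) * epsilon
    df_error = (n_levels - 1) * error_df_between * epsilon
    return df_effect, df_error


# Monkey A, Electrode main effect: 5 electrodes, error df 47395, epsilon 0.800.
df_effect, df_error = corrected_df(5, 47395, 0.800)
```

With ε rounded to 0.800 this gives approximately 3.2 and 151664, close to the reported F (3.202, 151750.6); the small gap is what one would expect if the analysis used an unrounded ε.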

Figure 3 shows that the electrical brain responses elicited by the standard and deviant stimulus are different for both monkeys, with a morphology comparable to a human MMN, though with a shorter latency (peaks around 90 ms, instead of 150 ms) and slightly larger amplitude as compared to humans (around 10 µV, instead of 5 µV) [23]. These differences in latency and amplitude can be attributed to the anatomical differences between human and monkey brains (e.g., skull size, thickness, and the distribution of musculature [24]).

In Experiment 1 we presented two rhesus monkeys with a sequence of sounds using a two-tone oddball paradigm (see Methods) to see whether an MMN-like response can be elicited.

The ERP responses to the omission (red lines in Figure 4) have a morphology comparable to the human MMN (i.e. negative at early latencies). However, the polarity of the responses differed between the two monkeys, probably due to inter-individual differences. Nevertheless, there is a small but significant amplitude difference between the standard tone and the omission in a time range comparable to the human MMN [21], [26], suggesting that the omission was indeed detected.

Again for both monkeys the average amplitude tended to be large in the frontal and central areas, without any laterality effects.

For monkey A an ANOVA with factors Type (Omission vs. Tone) × Electrode (Fz vs. Cz vs. Pz vs. F3 vs. F4) revealed significant main effects in Type (F (1, 15708) = 32.906, P<0.0001, η2 = 0.002) and Electrode (F (2.894, 45465.48) = 32.049, P<0.001, η2 = 0.002, ε = 0.724). Tukey unequal-N HSD post-hoc tests revealed no significant difference between F3 and F4.

Mean amplitudes of responses elicited by standard and deviant stimuli were measured within a time window centered on the absolute maximum of the D minus S difference waves (see Table 2 and gray-shaded windows in Figure 4).

Figure 4 shows the electrical brain responses elicited by the standard (S) and the deviant (D; an omission). (Note that Figure 4 shows a time window with three repetitions of the standard tone, marked by rectangles at either side of the time line.) This allows for a comparison of the responses to the first and second tone after the omission. To test the effects of the omission we concentrate on the time range closest to the occurrence of the omission (see Methods; Table 2). In both monkeys the standard stimuli elicit a steady-state response with increased amplitude, phase-aligned to the stimuli. At the first tone after the omission (see Figure 4), neural activity increased after the short period of silence, most notably in monkey Y, but returned to near previous levels by the second tone. This could also be interpreted as a response marking the beginning of a rhythmic group [25].

To study whether an MMN can be elicited in response to omissions as well, the same rhesus monkeys were presented with a tone sequence in which tones were omitted (i.e. replaced by silence, see Methods).

Rhesus Monkeys do not Detect ‘Loud Rests’, but are Sensitive to Rhythmic Grouping

In Experiment 3 we presented the same two rhesus monkeys with complex stimuli consisting of sound sequences based on a typical rock drum accompaniment pattern (see Figure 1).

The standard stimuli are four randomly presented and strictly metrical sound patterns (S 1 –S 4), with a deviant pattern (D) in which the ‘downbeat’ is omitted. Human adults perceive the D pattern within the context of standards as if the rhythm was broken, stumbled, or became strongly syncopated for a moment [20]. We refer to the omission at the start of D as a ‘loud rest’ and the omissions in S 2 –S 4 as ‘silent rests’. Music theory suggests that the former sounds ‘syncopated’ (a violation of a metric expectation) and the latter does not [3].

A sequence repeating the D pattern 100% of the time was also presented (‘deviant-control’ or D control) to control for acoustic effects on the ERP.

On the basis of the dissociation hypothesis, and the observation that monkeys apparently cannot synchronize to a beat [7] but are sensitive to auditory timing [12], one might expect that monkeys are sensitive to rhythmic structure (interval-based timing) but not to metric structure (beat-based timing). This hypothesis predicts that omissions that play a role in rhythmic grouping [27] can be detected, as they mark the structure of a rhythmic pattern (as is the case in D control), and consequently do not elicit an MMN, as they are part of the regularity. In contrast, the omissions that do not affect the rhythmic grouping will not be detected as part of a regularity, since they occur irregularly (as is the case in S 2 –S 4 and D), and hence may elicit an MMN.

In humans these differences in salience appear to be related to the coding of an internal representation of the rhythmic structure of a sound pattern [27], with the first sound after a relatively long inter-onset interval determining the rhythmic group structure [25]. If this is the case we expect the first sound of a repeated rhythmic pattern (D control), but not of a randomly inserted pattern (D), to elicit a response marking the beginning of a rhythmic group [25].

An alternative hypothesis is based on the observations made in human adults and newborns using the same stimuli and experimental paradigm [17], [18], [19], [20]. This hypothesis predicts that primates are not only able to sense rhythmic grouping, but are also able to detect the regular beat that is induced by a varying rhythmic stimulus. The perception of a ‘loud rest’ (a violation of a temporal expectation reflected by an MMN-like signal) can serve as evidence for the presence of a strong metric expectation [3]. This hypothesis predicts a large and early MMN for the omission in the deviant (D, containing a ‘loud rest’), but no or a considerably smaller MMN for the omissions in the standards (S 2 –S 4, containing ‘silent rests’). And since the omission in the deviant-control (D control) is expected because the pattern is presented repeatedly, no MMN is predicted there either. If these three aspects are observed (as they were found in human adults and newborns [20]), they suggest that a regular beat is extracted from the auditory stimulus. This could be interpreted as evidence against the vocal learning hypothesis.

Figure 5 shows that the electrical brain responses elicited by omissions in the standards (S 2 –S 4) and deviant-control (D control) are relatively flat, and different from the deviant (D), with the latter eliciting a more pronounced negative peak, most notably in monkey Y. This suggests a result similar to that found in human adults and newborns. However, the ERP response to S 1 (dotted black line in Figure 5) is not different from the response to D (solid red line in Figure 5), even though D contains an omission and S 1 does not. This seriously weakens the interpretation that the monkeys are able to extract the beat from the stimulus.

Mean amplitudes of responses elicited by standard and deviant stimuli were measured within a time window centered on the absolute maximum of the D minus S 2–4 difference waves (see Table 3 and the early gray-shaded windows in Figure 5).

For monkey A an ANOVA with factors Type (S 1 vs. S 2 vs. S 3 vs. S 4 vs. D control vs. D) × Electrode (Fz vs. Cz vs. Pz vs. F3 vs. F4) in the early window (105–155 ms) showed significant main effects in Type (F (5, 37269) = 89.318, P<0.0001, η2 = 0.012) and Electrode (F (3.006, 112063.6) = 11.221, P<0.0001, η2<0.001, ε = 0.752), as well as a significant Electrode × Type (F (15.034, 112063.6) = 7.475, P<0.0001, η2 = 0.001, ε = 0.752) interaction. Tukey unequal-N HSD post-hoc tests were performed. All channels differed from each other (df = 149076, P<0.05), except that Cz, F3 and F4 did not differ from each other. All Types differed from each other (df = 49071, P<0.01), except that D, D control and S 1 did not differ from each other and S 3 did not differ from S 4.
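As a plausibility check on the effect sizes, the sketch below recovers an effect size from an F statistic and its degrees of freedom. It assumes that the reported η2 is a partial eta squared, which the text does not state; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Monkey A, early window, main effect of Type: F (5, 37269) = 89.318.
eta_sq = partial_eta_squared(89.318, 5, 37269)
```

Under this assumption the value rounds to 0.012, which agrees with the η2 reported for this effect.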

For monkey Y an ANOVA with factors Type (S 1 vs. S 2 vs. S 3 vs. S 4 vs. D control vs. D) × Electrode (Fz vs. Cz vs. Pz vs. F3 vs. F4) in the early window (73–123 ms) showed significant main effects in Type (F (5, 49071) = 74.323, P<0.0001, η2 = 0.008) and Electrode (F (2.412, 118344.7) = 48.423, P<0.0001, η2 = 0.001, ε = 0.603), as well as a significant Electrode × Type (F (12.059, 118344.7) = 9.479, P<0.0001, η2 = 0.001, ε = 0.603) interaction. Tukey unequal-N HSD post-hoc tests were performed. All channels differed from each other (df = 196284, P<0.05), except that F3 did not differ from F4 and Fz did not differ from F3. All Types differed from each other (df = 49071, P<0.001), except D from S 1, S 2 from S 3, and S 4 from D control. In addition, the difference between S 3 and S 4 reached a weaker significance level (P<0.05) than the other differences.

In short, while there is a difference between D (containing a ‘loud rest’) and S 2 –S 4 (containing ‘silent rests’), and as such evidence in support of beat perception, there is no difference between D and S 1, that is, between a pattern with and a pattern without an omission. This makes the interpretation that the monkeys are detecting the beat (by distinguishing ‘loud rests’ from ‘silent rests’) less likely and leads to the alternative hypothesis that the monkeys are solely detecting rhythmic groups [21]–[22]: the first note of a rhythmic group (separated by an omission) eliciting an MMN-like response in D control (but not in D).

Mean amplitudes were measured in a late time window just after the first tone (after 200 ms), centered on the absolute maximum of the D minus D control difference waves (see Table 4 and the late gray-shaded windows in Figure 5).

For monkey A the ANOVA with the same factors on the late window (214–264 ms) showed significant main effects in Type (F (5, 49071) = 71.134, P<0.0001, η2 = 0.009) and Electrode (F (2.975, 110879.9) = 35.850, P<0.0001, η2<0.001, ε = 0.744), as well as a significant Electrode × Type (F (14.876, 110879.9) = 19.880, P<0.0001, η2 = 0.003, ε = 0.744) interaction. Tukey unequal-N HSD post-hoc tests were performed showing that D was significantly different from D control (df = 37269, P<0.001) while not differing from S 1 . All channels differed from each other (df = 149076, P<0.001) except for Cz and F4.

For monkey Y the ANOVA with the same factors on the late window (220–270 ms) showed significant main effects in Type (F (5, 49071) = 195.816, P<0.0001, η2 = 0.020) and Electrode (F (2.412, 118344.7) = 283.270, P<0.0001, η2 = 0.006, ε = 0.604), as well as a significant Electrode × Type (F (12.059, 118344.7) = 47.789, P<0.0001, η2 = 0.005, ε = 0.604) interaction. Tukey unequal-N HSD post-hoc tests were performed showing that D was significantly different from D control (df = 49071, P<0.001) while not differing from S 1 . All channels differed from each other (df = 196284, P<0.001) except for Pz and F4.

These results suggest that the monkeys are actually sensing surface-level rhythmic grouping (i.e. detecting the start of a repeating rhythmic group) instead of the induced beat (i.e. detecting a regular pulse in a varying rhythmic pattern). As such, we have to conclude that rhesus monkeys, contrary to what has been shown for human adults and newborns, show no sign of representing the beat in music, but apparently do represent rhythmic groups.