A brain region for social cognition Monkeys recognize social interactions and their meanings quickly and effortlessly. Little is known about the neural circuitry that underlies this understanding. Sliwa and Freiwald scanned monkey brains as the monkeys watched static or moving stimuli. A subset of brain areas was exclusively active during monkey-monkey interactions, as opposed to physical interactions between two objects. This network shares some of its components with the monkey mirror neuron system mapped previously by others and with a possible homolog of the human network involved in the theory of mind. Science, this issue p. 745

Abstract Primate cognition requires interaction processing. Interactions can reveal otherwise hidden properties of intentional agents, such as thoughts and feelings, and of inanimate objects, such as mass and material. Where and how interaction analyses are implemented in the brain is unknown. Using whole-brain functional magnetic resonance imaging in macaque monkeys, we discovered a network centered in the medial and ventrolateral prefrontal cortex that is exclusively engaged in social interaction analysis. Exclusivity of specialization was found for no other function anywhere in the brain. Two additional networks, a parieto-premotor and a temporal one, exhibited both social and physical interaction preference, which, in the temporal lobe, mapped onto a fine-grain pattern of object, body, and face selectivity. Extent and location of a dedicated system for social interaction analysis suggest that this function is an evolutionary forerunner of human mind-reading capabilities.

Recognizing physical objects and intentional agents, their actions, and their interactions is essential for understanding the world around us (Fig. 1A) (1–3). Monkeys recognize social interactions and their meaning quickly and effortlessly; they understand grooming, play, and fight, infer social rank from interactions, and use this knowledge to recruit allies (4, 5). Monkeys also understand that colliding objects exchange forces and make use of gravity and trajectory cues to search for falling food (6, 7). Understanding interactions is a core cognitive component in primates (3, 8). Yet, little is known about the neural circuitry that underlies interaction processing. To chart the brain regions that process social and physical interactions, we presented naturalistic videos during whole-brain functional magnetic resonance imaging (fMRI) to four rhesus monkeys (supplementary materials, materials and methods). We used six main types of videos: (i) social interactions between monkeys, (ii) physical interactions between objects, (iii) monkeys engaged in independent goal-directed behaviors, (iv) objects moving independently, (v) nonacting monkeys, and (vi) stationary objects, along with low-level motion control and natural complex scene videos (Fig. 1B, movies S1 and S2, and materials and methods). Real-world videos were chosen to maximize cognitive engagement. For data analysis, we controlled for eye movements and visual motion energy through nuisance regression in the generalized linear model (GLM) and inspected several other behavioral features (figs. S1 and S2 and materials and methods). Brain activity for agents’ shape, actions, and social interactions and for objects’ interactions was defined as compared with matched control conditions with conjunction analyses (Fig. 1C and materials and methods).
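
The conjunction step can be illustrated with a minimal sketch. It assumes a minimum-statistic conjunction, in which a voxel counts as active only if every component contrast exceeds the threshold; the procedure actually used is described in the materials and methods and may differ. The contrast names, t values, and threshold below are hypothetical.

```python
from typing import Dict, List


def conjunction_mask(t_maps: Dict[str, List[float]], threshold: float) -> List[bool]:
    """Return True for voxels where every contrast's t value exceeds threshold.

    This is equivalent to thresholding the voxel-wise minimum across
    contrasts (an assumed minimum-statistic conjunction).
    """
    maps = list(t_maps.values())
    if not maps:
        raise ValueError("need at least one contrast map")
    n_voxels = len(maps[0])
    if any(len(m) != n_voxels for m in maps):
        raise ValueError("all contrast maps must cover the same voxels")
    return [min(m[v] for m in maps) > threshold for v in range(n_voxels)]


# Hypothetical t values for five voxels in three illustrative contrasts.
example_maps = {
    "social_interaction_vs_action": [4.1, 0.5, 3.3, 2.9, -1.0],
    "social_interaction_vs_landscape": [5.0, 3.8, 3.1, 1.2, 2.5],
    "social_interaction_vs_scramble": [3.6, 4.2, 2.0, 3.5, 3.0],
}
mask = conjunction_mask(example_maps, threshold=3.0)
print(mask)
```

Only the first voxel survives here, because it is the only one that exceeds the threshold in all three contrasts; a voxel that is strongly active in two contrasts but weak in the third is excluded.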

Fig. 1 Task design and hypothesis. (A) Hypothesized cognitive steps occurring when watching social interactions. (i) Processing agents’ shape, (ii) processing action, (iii) processing interaction. (B) Classes of videos used for stimuli and controls that monkeys could freely watch during the experiment. (C) Schematics of the contrasts (arrows between nonblurred images) used in conjunction to identify brain activity related to four main conditions: agents, action, social interaction, and physical interaction. Pictures are presented in the same order as in (B).

Temporal and prefrontal cortices contain areas selective for specific categories of visual shapes such as faces, bodies, or objects (9–11) and for shapes of particular categories set in motion (12–14). Because interactions reveal hidden features of agents and objects (for example, object mass and material during collision), some agent/object category or agent/object-motion selective areas might be specifically engaged by interactions. We therefore first mapped canonical face, body, and object patches with a standard localizer (fig. S3A) (10) and then measured responses to naturalistic videos. Videos of inactive, active, and interacting monkeys all engaged classical face and body areas in the superior temporal sulcus (STS) and prefrontal cortex [P < 0.01, false discovery rate (FDR) corrected for multiple comparisons] (Fig. 2A and figs. S5 and S6) (14), and naturalistic videos of still, moving, and colliding objects activated regions overlapping with object patches. Region of interest (ROI) analysis revealed that category-selective areas differentiate between interaction types: agent-object interactions (agents performing goal-directed actions directed at objects compared with nonacting agents) recruited body patches but not face patches [with the exception of area prefrontal orbital (PO) and face-motion area middle dorsal (MD) (13); area middle fundus (MF) showed even less activation] (Fig. 2B); social interactions recruited both face and body patches (Fig. 2, C and D); and object-object interactions selectively recruited part of object patches only [for all, Student’s t tests, Holm-Bonferroni corrected for multiple comparisons by using family-wise error (FWE) rate at P < 0.05] (Fig. 2D). Thus, interaction-preference followed the fine-grain spatial organization for object categories: Even directly neighboring face and body areas differentiated between interaction types composed of the same category elements, faces and bodies.
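
The Holm-Bonferroni correction applied to the ROI tests can be sketched as follows. This is a generic step-down implementation rather than the authors' analysis code, and the p values are hypothetical.

```python
from typing import List


def holm_bonferroni(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Step-down Holm procedure controlling the family-wise error rate.

    Returns, in the input order, whether each hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the (rank + 1)-th smallest p value with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p values fail too
    return reject


# Hypothetical p values for four ROI comparisons.
roi_p = [0.030, 0.001, 0.012, 0.200]
decisions = holm_bonferroni(roi_p, alpha=0.05)
print(decisions)
```

In this example the comparison with p = 0.030 would pass an uncorrected threshold of 0.05 but is not rejected after correction, because it is compared against 0.05 / 2 = 0.025 once the two smaller p values have been accepted.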

Fig. 2 From agents to actions to interactions. (A) Statistical map of enhanced activation for the Agents contrast. Shown is an inflated F99 cortical model of the right hemisphere, with dark gray regions representing sulci and light gray regions representing gyri (P < 0.01, FDR-corrected for multiple comparisons). The fMRI signal is enhanced in regions of the STS and frontal cortex that overlap with face and body patches independently identified (fig. S3A). (B) Agent-object interactions assessed with the contrast (actions > agents) show enhanced activation of the body but not face patches. Bar plots are normalized signal changes for the contrast. Error bars represent SD (*P < 0.05, **P < 0.01, ***P < 0.001; all other comparisons are not significant; Holm-Bonferroni–corrected for multiple comparisons). (C) Same as (B). Agent-agent interactions assessed with the contrast (social interactions > actions) show enhanced activation of both body and face patches. (D) Same as (B). Object-object interactions assessed with the contrast (physical interactions > social interactions) show that only object patches are more activated by physical than social interactions. (E) Statistical map of enhanced activation for the action contrast [same conventions as in (A)]. Significant activation is found in the parietal and premotor cortices in addition to the STS, in regions that overlap with classical MNS areas independently identified (fig. S3B). (F) Brain activity for social interactions compared with agents and landscapes and scramble controls shows a similar overlap. (G) Brain activity for physical interactions compared with objects moving independently and landscapes and scramble controls shows a similar overlap. (H) Overlaid activations for action, social interaction, and physical interaction contrasts showing MNS recruitment by different types of interactions. (I) Statistical map of enhanced activation for the social interaction contrast [same conventions as in (A)] showing an extended network of regions significantly active for watching social interactions. Significant activation is found in the medial and dorsomedial prefrontal cortices (mPFC and dmPFC), in the ventrolateral prefrontal cortex (VLPFC), in the temporal pole (TP), and in parietal areas (7a), in addition to activation in the STS and in the parietal and premotor cortices. (J) Same map presented on anterior (“A”), dorsal (“D”), medial (“M”), ventral (“V”), and posterior (“P”) views of the flattened F99 cortical model of the right hemisphere, with black regions signifying not-represented noncortical parts of the brain. (K to N) Corresponding activations in subcortical areas (Amg BL/BM, BM, and Ce; Cd, caudate) displayed on coronal slices of the MNI-Paxinos template brain in radiological convention (left in figure is the right side of the brain). Coordinates are relative to interaural line. Cortical results are masked for display purposes.

Social and physical interactions engaged two additional areas, neither of which was activated by noninteracting stimuli (agents or objects). These areas were located outside the regions selective for faces, bodies, or objects discussed above: One area was located in the posterior portion of anterior intraparietal area (pAIP), and one in premotor area F5 (Student’s t test, P < 0.01, FDR-corrected voxel-wise) (Fig. 2E and figs. S5 and S6). Both areas were activated by agent-object interactions (Fig. 2E) but also by object-object interactions, and even more strongly by social interactions (Fig. 2, F and G). Neither area responded to movies of inactive monkeys (Fig. 2A) or to independently moving objects, further emphasizing their interaction-selectivity. Social and physical interaction representations overlapped in F5 but occupied neighboring locations within pAIP (Fig. 2, F to H). The anterior part of pAIP appeared even more activated by physical than any other type of interaction (Fig. 2H). The parieto-frontal cortex contains the mirror neuron system (MNS) (15). The classical MNS can be mapped in monkeys with fMRI by using videos of humans grasping objects (16). When mapping the MNS in this way (fig. S3B and movie S3), it colocalized with pAIP and F5 interaction areas (Fig. 2, E to H). The MNS is thus engaged by three very different kinds of interactions, including the one it had been known for. If the MNS supports understanding of another agent’s actions upon objects (15), then by extension these findings imply a general role of the MNS for social and physical-world understanding.
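
Voxel-wise FDR thresholding of the kind reported above can be sketched as follows. The sketch assumes the Benjamini-Hochberg step-up procedure, which is the most common FDR method but is not named in the text; the p values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], q: float = 0.01) -> List[bool]:
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Returns, in the input order, whether each test is declared significant
    at false discovery rate q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical p values for five voxels.
voxel_p = [0.0055, 0.5, 0.0004, 0.02, 0.0045]
survivors = benjamini_hochberg(voxel_p, q=0.01)
print(survivors)
```

The voxel with p = 0.0045 misses its own rank threshold (0.004) but is still declared significant, because a larger p value (0.0055) meets the threshold at the next rank (0.006); this is the step-up behavior that distinguishes the procedure from Holm-Bonferroni.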

As social-cognitive species, primates understand the social interactions of others (4, 5). Social interactions, but not physical ones, activated a large set of brain regions beyond the category-selective networks and the MNS (Fig. 2, I to N). This social-interaction network (SIN) included parts of the medial prefrontal cortex (mPFC) and anterior cingulate cortex (ACC) (areas 32 and 10mr, and 24b, respectively), dorsomedial prefrontal cortex (dmPFC) (areas F6 or Pre-SMA, 8Bm, and 9m), a temporo-parietal cluster (areas TPOc and 7a), parts of the ventro-lateral prefrontal cortex (vlPFC) (ventral part of F5, ventral part of 44, the posterior part of 47 and 12, and OPro), temporal pole regions (TPOi, STS areas 1 and 2, and TPpro), the perirhinal cortex (36R), dorsal STS areas [social posterior dorsal (sPD) and social anterior dorsal (sAD), located dorsally to posterior lateral (PL) and anterior fundus (AF) face patches, and anterior dorsal (AD), a face patch located dorsally to face patch AF], and cortical and subcortical systems engaged in reward, valence, and emotional processing [caudate; amygdala; and areas 10o, 11l, and 14r of the orbitofrontal cortex (OFC)] (figs. S5, A and D, and S6, A and D).

Large parts of the SIN were exclusively selective for social interactions and did not respond to any other stimulus condition in the context of the present design. Areas of this exclusively social interaction network (ESIN) included a cluster in mPFC, ACC, and dmPFC; a cluster in vlPFC; area 7a in the inferior parietal lobule; and OFC areas 10o and 14r (Fig. 3, A and C, and figs. S5, A and E, and S6, A and E). The ESIN was even deactivated in all but the social interaction conditions (Fig. 3D). We did not find such exclusivity of functional specialization for any other stimulus category anywhere in the brain. However, the joint characteristics of the ESIN (a focus on social cognition and general deactivation during visual stimulation) bear resemblance to the human theory of mind (ToM) network and the human default mode network (DMN) (17, 18). Curiously, ToM and DMN intersect in the human brain at regions of quite plausible homology to ESIN areas of the macaque brain. Thus, the macaque ESIN shares functional and anatomical characteristics of human ToM and DMN (fig. S4).

Fig. 3 A network of regions exclusively active for watching social interactions in the monkey brain. (A) Statistical map of enhanced activation for the social interaction contrast, with same conventions as in Fig. 2. Black lines outline areas activated exclusively by videos of social interactions and deactivated or not activated by any other visual stimulation. (B) Statistical map for regions exclusively activated by videos of social interactions (orange to yellow), overlaid for comparison with the statistical map for the Action contrast (dark to light blue), on flattened and inflated views, with same conventions as in Fig. 2. Activations for these two conditions appear close but are segregated in the parietal cortex and VLPFC. Black circles indicate location of example ROIs defined with the social interaction contrast (7a, 32, and 9m are example areas of the ESIN, and TPv is an example area of the SIN). (C) Bar plots represent percent of fMRI signal change in those ROIs for all stimulus conditions. Conditions’ color code is presented at the bottom; pictures are presented in the same order as in Fig. 1B. Error bars represent SD (*P < 0.05, **P < 0.01, ***P < 0.001; all other comparisons are not significant; Holm-Bonferroni–corrected for multiple comparisons).

We found three networks engaged in interaction analyses, each with distinct functional characteristics and internal organization. Are there overarching principles of organization for all areas processing interactions? We conducted principal component analysis (PCA) across the different movie categories on 43 ROIs (Fig. 4, A and C). The areas in this space could be grouped into object, body, and face patches and classical MNS, SIN (without the ESIN), and ESIN. This analysis, as well as an analysis of correlation distances to the classical MNS and ESIN (Fig. 4B), showed a greater similarity of face patches to the ESIN than to any other ROIs. Because of their functional homogeneity, we then performed PCA in each of the six aforementioned groups of brain areas across the same stimulus conditions (Fig. 4D). SIN and ESIN separated the social interaction condition from all others along PC1. They did not differentiate the other two agency conditions (“acting” and “nonacting”) along this dimension. Both properties were shared by the face patches (Fig. 4D). The body patches, instead, separated all three agency conditions along PC1 and the three object conditions jointly along PC2 and PC1. This functional similarity suggests that face patches are putative entry points to the SIN and ESIN. Body and object patches turned out to be functionally related to the MNS (Fig. 4, A, B, and D). It has been proposed that the MNS might provide inputs to the ToM network in humans (19). Functionally, however, the MNS differed substantially from the SIN and ESIN and was more similar to object and body patches, whereas the SIN and ESIN were closer to face areas (Fig. 4, A, B, and D). Therefore, interaction analysis by two streams with different functions, already segregating inside the STS and feeding into the classical MNS and SIN, respectively, is the most plausible model for the organization of high-level world-processing in the primate brain.
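
The idea behind the PCA of regions across conditions can be illustrated with a small sketch. It extracts only the first principal component by power iteration on the condition covariance matrix; the software and preprocessing used for the actual analysis are not stated here, and a library routine would normally be preferred. The four regions and their response values are hypothetical.

```python
import math
from typing import List, Tuple


def first_pc(data: List[List[float]], n_iter: int = 200) -> Tuple[List[float], List[float]]:
    """Return (loadings, scores) for PC1 of a regions-by-conditions matrix.

    The sign of a principal component is arbitrary; only relative
    positions along it are meaningful.
    """
    n_rows, n_cols = len(data), len(data[0])
    # Center each condition (column) across regions.
    means = [sum(row[j] for row in data) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in data]
    # Sample covariance between conditions.
    cov = [[sum(r[a] * r[b] for r in centered) / (n_rows - 1)
            for b in range(n_cols)] for a in range(n_cols)]
    # Power iteration from a non-uniform start vector.
    vec = [float(j + 1) for j in range(n_cols)]
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(n_cols)) for a in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            raise ValueError("data have no variance")
        vec = [x / norm for x in nxt]
    # Project each centered region profile onto PC1.
    scores = [sum(r[j] * vec[j] for j in range(n_cols)) for r in centered]
    return vec, scores


# Hypothetical mean responses; columns are: social interaction, action,
# nonacting agent, physical interaction, object motion, static object.
profiles = [
    [1.0, -0.2, -0.2, -0.3, -0.3, -0.2],  # ESIN-like region A
    [0.8, -0.1, -0.2, -0.2, -0.3, -0.3],  # ESIN-like region B
    [0.1, 0.1, 0.0, 0.9, 0.6, 0.5],       # object-patch-like region A
    [0.0, 0.2, 0.1, 1.0, 0.7, 0.4],       # object-patch-like region B
]
loadings, scores = first_pc(profiles)
print([round(s, 2) for s in scores])
```

With these made-up profiles, the two ESIN-like regions fall on one side of PC1 and the two object-patch-like regions on the other, and the social and physical interaction conditions load with opposite signs.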

Fig. 4 Multiple social networks in the primate brain. (A) PCA of each area as a function of its mean activity to each condition of the main paradigm. Colors and dashed contours indicate each brain area’s identity. Areas that are defined independently of the main task (object patches, face patches, body patches, or MNS) tend to cluster separately. A gradient of similarity mainly along the first principal component (PC1) goes from the object patches to the MNS to the body patches to the SIN and the ESIN. (B) Correlation distance of the object patches (green), face patches (yellow), and body patches (magenta) to the MNS (from left to right; blue arrow) and to the ESIN (from bottom to top; black arrow) based on their mean activity to the conditions of the task. Object and body patches are more closely related to the MNS, whereas face patches are more closely related to the ESIN. (C) Anatomical distances between areas displayed on a flat map of the monkey brain, provided for comparison with the PCA shown in (A), demonstrate that STS areas that appear interleaved anatomically are demixed functionally in response to the task conditions. (D) PCAs and dendrograms representing each condition (color code and legends to the right) as a function of its mean activity in each patch of the group. Spatial arrangements and tree clustering show a progressive separation of the physical interactions and social interactions respectively moving from the object patches (left) to the ESIN (right).
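
The correlation distance in (B) can be illustrated with a short sketch, taking it to be 1 minus the Pearson correlation between two regions' mean responses across conditions (the usual definition, assumed here). The response profiles are hypothetical.

```python
import math
from typing import Sequence


def correlation_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Return 1 minus the Pearson correlation between two response profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of conditions")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant profile has no defined correlation")
    return 1.0 - sxy / math.sqrt(sxx * syy)


# Hypothetical mean responses; conditions are: social interaction, action,
# nonacting agent, physical interaction, object motion, static object.
esin = [1.0, -0.2, -0.2, -0.3, -0.3, -0.2]
mns = [0.9, 0.7, 0.0, 0.8, 0.1, 0.0]
face_patch = [1.0, 0.4, 0.4, 0.0, 0.0, 0.1]
object_patch = [0.1, 0.1, 0.0, 0.9, 0.6, 0.5]

face_to_esin = correlation_distance(face_patch, esin)
face_to_mns = correlation_distance(face_patch, mns)
object_to_esin = correlation_distance(object_patch, esin)
object_to_mns = correlation_distance(object_patch, mns)
print(face_to_esin < face_to_mns, object_to_mns < object_to_esin)
```

In this toy example the face-patch profile lies closer to the ESIN profile than to the MNS profile, and the object-patch profile shows the reverse ordering, which is the qualitative pattern the panel describes.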

Visual analysis of interactions is a computationally daunting problem: Each interaction generates a complex spatiotemporal flow pattern, each interaction category consists of many different such patterns, and even the smallest change to a pattern can change an interaction’s meaning. Yet, primates understand the meaning of interactions effortlessly. To meet these computational challenges, which are even more demanding than those of invariant object recognition, a neural machinery at least as extensive as that for object recognition seems necessary. Our finding that large parts of shape-selective STS are interaction-selective and that the fine-grain pattern of interaction selectivity closely follows that of shape selectivity provides a possible answer to the puzzle of where visual interaction analysis takes place: The same machinery may perform both shape and interaction analyses, possibly parsing different results into MNS and SIN. This organization is markedly different from how motion activates the same region (13, 14) and reveals how deeply interaction analysis is ingrained in visual circuitry.

The MNS is thought to add depth to the processing of agent-object interactions by uncovering motor intentions behind observed object-directed actions and to do so through a process of simulation (15). Our results of broad MNS involvement across physical and social interactions can be parsimoniously interpreted by extension; the MNS would uncover through causal model simulations the hidden properties of physical objects and intentional agents and automatically reveal the wide set of affordances [action possibilities (20)] they offer for online engagement. The MNS would, according to this scenario, not just function in motor intention processing but play a major role in supporting general core cognitive functions of intuitive physics and psychology.

We report the existence of large regions of the monkey brain exclusively engaged in social interaction analysis. The monkey ESIN parallels properties of DMN (18) and ToM systems in humans (17) and even occupies locations very similar to regions of intersection of human DMN and ToM. Because of the known role of ToM areas in social theory–driven deductions (17, 21), some parts of the monkey ESIN might play a role in elaborating, storing, and comparing species-specific socio-emotional scripts stipulating rules of social conduct (4, 5), whereas other parts might deduce inferences about other agents’ mental, emotional, and intentional states that explain their observed interactions.

The results of this study reveal a new dimension of tuning and functional organization of the STS, redefine the role of the mirror neuron system, and uncover the existence of a new high-level social cognition network with deep evolutionary heritage.

Supplementary Materials www.sciencemag.org/content/356/6339/745/suppl/DC1 Materials and Methods Figs. S1 to S6 References (22–54) Movies S1 to S3