Participants

134 healthy adults were initially considered for DCM analysis (89 female; all but one participant were between 22-35 years old with a mean age of approximately 30.5 years, see Van Essen et al.16 for why reporting of exact ages would endanger anonymity of participants). Two participants were excluded: one because of missing onset files; another participant did not show activation in one of the predicted areas during one of the sessions and thus could not be included in the DCM analysis (see Volume of Interest Selection section below). Thus, data were analysed from 132 participants. The experiments were performed in accordance with relevant guidelines and regulations and all experimental protocols were approved by the Institutional Review Board (IRB) (IRB # 201204036; Title: ‘Mapping the Human Connectome: Structure, Function and Heritability’). Written informed consent was obtained from all participants. Our data analysis was performed in accordance with ethical guidelines of the University College London ethics committee.

fMRI data acquisition

See Ugurbil et al.17 for a detailed description of the HCP fMRI acquisition protocols. The following abbreviated overview is taken from Barch et al.18. Briefly, whole-brain EPI acquisitions were acquired with a 32-channel head coil on a modified 3 T Siemens Skyra with TR = 720 ms, TE = 33.1 ms, flip angle = 52°, BW = 2290 Hz/Px, in-plane FOV = 208 × 180 mm, 72 slices, 2.0 mm isotropic voxels, with a multi-band acceleration factor of 8 (refs. 19,20, as cited in ref. 18). Two runs of the task were acquired, one with a right-to-left and the other with a left-to-right phase encoding18.

fMRI data preprocessing

We used the “minimally processed” Q2 release of the HCP data for this study (Functional Pipeline v2.0; Execution 1). These time series data were preprocessed using tools from FSL and FreeSurfer to implement gradient unwarping, motion correction, fieldmap-based EPI distortion correction, brain-boundary-based registration of EPI to structural T1-weighted scan, non-linear (FNIRT) registration into MNI152 space and grand-mean intensity normalization18. See Glasser et al.21 for a detailed description of fMRI preprocessing of the HCP.

Experimental design

The following abbreviated overview is taken from Barch et al.18. A well-validated task was used to probe animacy and agency detection. The stimuli have been shown to generate robust task-related activations that are reliable across participants in brain regions associated with social cognition (Castelli et al., 2000, Castelli et al., 2002, Wheatley et al., 2007 and White et al., 2011 as cited in ref. 19). Participants viewed short video clips (20 s) of objects (squares, circles, triangles) either interacting in some way (Animate motion), or moving mechanically (Inanimate motion)18. The basic visual characteristics in terms of shape, overall speed and orientation changes were matched between stimulus categories13. After each video clip, participants rated the video by choosing one of three options: Social interaction (an interaction in which the shapes appear to be taking each other's feelings and thoughts into account), Not Sure, or No interaction (i.e., there is no obvious interaction between the shapes and the movement appears random). Each of the two task sessions comprised 5 video blocks (2 Animate and 3 Inanimate in the first session, 3 Animate and 2 Inanimate in the other session). Note that even though there was an unequal number of videos per condition within each session, all our analyses took into account the data of both sessions at once and thus our effects were not influenced by session-specific effects. There were also 5 fixation blocks (15 s each); each video block was followed by a fixation block. Of note, the video clips were shortened to 20 s (the Castelli et al.13 clips were originally 40 s) by either splitting the videos in two or truncating them. A pilot study by Barch et al.18 confirmed that participants rated these shorter videos similarly. Figure 1 shows stills from an example of an Animate motion video.

Figure 1 Example of “Theory of Mind” animation: The Big Triangle coaxing the reluctant Little Triangle to come out of an enclosure (participants do not see captions; stimuli and description adapted from13.)

fMRI Data analysis

fMRI data were further analysed by us, using Statistical Parametric Mapping (SPM12b, www.fil.ion.ucl.ac.uk/spm). The 2 × 2 × 2 mm minimally preprocessed images were spatially smoothed with a 4-mm Gaussian kernel to increase the signal-to-noise ratio, while retaining sufficient anatomical acuity for extracting visual sensory areas. We did not slice-time correct the data, nor did we later specify different acquisition times in the DCM model22, as DCM has been shown with simulated data to cope well with slice timing differences of up to 1 s22 and our TR was 0.72 s. The time series were modelled with boxcar regressors based on two types of task block: Animate motion and Inanimate motion. In order to use the ‘All motion’ contrast as a single input to the DCM and the Animate – Inanimate motion contrast as a modulator of effective connectivity (and not use both animate and inanimate motion separately as inputs), we created appropriate parametric regressors23. These regressors were orthogonal to each other (the first regressor was All Motion – implicit baseline and the second was Animate – Inanimate motion). In other words, the first regressor modelled non-specific motion effects, relative to baseline, while the second modelled animacy effects during motion. In addition, we included constant session effects. Appropriate stimulus functions were convolved with the canonical hemodynamic response function to form regressors for standard SPM analyses. Together with regressors representing residual movement-related artifacts and their derivatives, these regressors comprised the full general linear model (GLM) for each session. A group level ANOVA was performed to identify significant regional effects for the All Motion contrast and a contrast for Animate – Inanimate motion.
All analysis scripts are available online (https://github.com/HaukeHillebrandt/SPM_connectome); this ensures the analyses reported below can be replicated and extended with the openly available HCP data (see discussion).
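The construction of the two orthogonal regressors can be illustrated with a minimal sketch. It is not taken from the analysis scripts: the block onsets, the number of scans and the helper name `boxcar` are hypothetical, convolution with the hemodynamic response function is omitted, and orthogonalisation by projection is assumed here as one simple way of obtaining an Animate – Inanimate regressor that is orthogonal to All motion.

```python
import numpy as np

TR = 0.72  # repetition time in seconds, as reported in the acquisition section


def boxcar(onsets, duration, n_scans, tr=TR):
    """Return a 0/1 vector that equals 1 while any of the given blocks is on."""
    t = np.arange(n_scans) * tr
    u = np.zeros(n_scans)
    for onset in onsets:
        u[(t >= onset) & (t < onset + duration)] = 1.0
    return u


# Hypothetical timings (s): 2 Animate and 3 Inanimate blocks of 20 s each.
animate_onsets = [8.0, 113.0]
inanimate_onsets = [43.0, 78.0, 148.0]
n_scans = 260  # hypothetical session length

animate = boxcar(animate_onsets, 20.0, n_scans)
inanimate = boxcar(inanimate_onsets, 20.0, n_scans)

# Regressor 1: any motion versus the implicit (fixation) baseline.
all_motion = animate + inanimate

# Regressor 2: Animate minus Inanimate, coded +1 / -1 within blocks.
animacy = animate - inanimate

# Remove the part of regressor 2 that is explained by regressor 1, so the
# two regressors are orthogonal even with unequal numbers of blocks.
projection = (all_motion @ animacy) / (all_motion @ all_motion)
animacy_orth = animacy - projection * all_motion

print("dot product after orthogonalisation:", all_motion @ animacy_orth)
```

With unequal numbers of Animate and Inanimate blocks in a session, the raw +1/−1 coding is correlated with the All motion regressor; the projection step removes that shared component while leaving the regressor at zero during fixation.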

Dynamic causal modelling

DCM estimates the experimental modulation of (intrinsic) self-connections or (extrinsic) forward and backward connections between brain regions that are active during a particular task in a directional manner. This enables one to infer whether experimental manipulations affect top-down influences, bottom-up influences or both. We used a novel post-hoc model selection routine10,11 to investigate all possible dynamic causal models and tested the hypothesis that the forward connection, which conveys sensory prediction error signals, is selectively more engaged than the top-down backward connection when people view animate movement compared with inanimate movement. Specifically, we quantified the effective connectivity between V5, which is responsive to any type of motion (animate and inanimate), and the pSTS, which is selectively activated when participants view animate motion13.

Specification of dynamic causal models

We created and estimated DCMs24 with DCM12 (version 5370) as implemented in SPM12b. The DCMs were based on the VOIs (volumes of interest) described above (V5 and the pSTS) and used the main effect of Animate – Inanimate motion to modulate the connections between these two regions (see Figure 2A). All DCMs were deterministic (as opposed to stochastic for DCMs without experimental input, see25), bilinear (as opposed to nonlinear DCMs, where activity between two regions is modulated by a third region, see26), two-state models27, with mean-centred inputs. Two-state DCMs differ from one-state models in that the activity in one brain region is modelled with both excitatory and inhibitory neuronal populations. This allows one to use positivity constraints that enforce extrinsic (between region) connectivity to be excitatory, while self or recurrent (intrinsic) connections are treated as inhibitory27. It is important to note that the hemodynamics in the current DCM are a function of excitatory states only – and the contributions to the BOLD signal from the inhibitory states are expressed indirectly, through interactions, with excitatory populations, at the neuronal level27. Note that the fixed and modulatory parameters were always scale parameters (exponentiated) to ensure positivity as per convention for two-state DCMs, so that the extrinsic connections were always excitatory27. Scale parameters of two-state DCMs are thus higher than parameter estimates from one-state DCMs. Our unexponentiated modulatory parameter estimates ranged from -2.7 to 3.9 Hz, similar to one-state DCM parameter estimates reported in other studies11,28. 
While two-state DCMs use exponentiated scale parameters that introduce positivity constraints and are more plausible to interpret, these values are likely to be heteroscedastic and not normally distributed, because the exponential function is the inverse function of the natural logarithm (which is commonly used to transform data to meet the assumption of a normal distribution, see29,30). Thus, we used the original unexponentiated non-scale parameter estimates for all statistics, but the exponentiated parameter estimates for plots and interpretation.

Figure 2 (A) The winning model: The full model was the winning model, with the highest evidence; in this model all connections were modulated by the Animate – Inanimate motion modulator. The driving input, the ‘All motion’ contrast, entered into V5 and the pSTS. Wider lines represent stronger modulation or input relative to its comparison: V5 received more input (Mean parameter estimate = 0.96) than the pSTS (Mean parameter estimate = −0.06) and the Animate – Inanimate motion contrast modulated the forward connection from V5 to the pSTS significantly more strongly (Mean parameter estimate = 1.22) than the backward connection (Mean parameter estimate = 0.16) and the (inhibitory) self-connection of the pSTS (Mean parameter estimate = −0.19) less strongly than the self-connection of V5 (Mean parameter estimate = −0.03). This means that the (inhibitory) self-connection in the pSTS decreased more than the (inhibitory) self-connection in V5. In other words, since the (inhibitory) self-connection was decreased more towards zero, the pSTS activation is modulated by animacy. (B) VOIs used in the DCM analyses based on the mean of all participants' VOI centre coordinates and illustration of the modulatory connectivity between them. The first VOI, based on the peaks of the All Motion contrast, was in the MT+/V5 (44 −64 4; circled in blue). The other VOI was activated by the conjunction of the All motion contrast and the Animate – Inanimate motion contrast [All Motion & Animate – Inanimate motion] and was located in the pSTS (54 −50 16, circled in green). The colour of the line represents the source of the strongest bidirectional modulatory connection.

Post-hoc Bayesian model selection

Until recently, it was computationally expensive to estimate a large number of models with DCM31, especially with a large number of participants, as in the current study. A model space with n nodes has 2^(n×n) permutations of connections that can be turned on or off, which can be modulated by different experimental manipulations, leading to a combinatorial explosion10. We used a new method to find the model evidence for all possible models by only inverting (estimating) the full model10,11,32 as a prelude to identifying the best (reduced) model. This approach fits the full model – with all free parameters – to the data. The full model generally contains all possible intrinsic forward and backward connections and all inputs and modulations of these connections by experimental factors. One then approximates the evidence for all possible reduced models, which have fewer parameters and are therefore nested within the full model. Each reduced model is obtained by setting the prior variance of a particular combination of free parameters to zero, which switches those parameters off. Based on the posterior density over the parameters of the full model, the approximate evidence for each reduced model can then be obtained using standard analytic results10,11. These post-hoc estimates of model evidence and the (conventional) free energy approximation (following inversion of reduced models) have been shown to yield very similar results with both simulated and real data11.
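For reference, one way of writing such an analytic result is sketched below, assuming Gaussian priors and a Gaussian (Laplace) approximation to the posterior of the full model. The notation is introduced here for illustration and is not taken from the text: η and Π denote prior means and precisions, μ and P denote posterior means and precisions, and the subscripts F and i refer to the full model and a reduced model, respectively.

```latex
\begin{aligned}
\ln p(y \mid m_i) - \ln p(y \mid m_F)
  &= \ln \int p(\theta \mid y, m_F)\,
     \frac{p(\theta \mid m_i)}{p(\theta \mid m_F)}\, d\theta \\
  &= \tfrac{1}{2} \ln \frac{|\Pi_i|\,|P_F|}{|\Pi_F|\,|P_i|}
   - \tfrac{1}{2} \left( \mu_F^{\top} P_F \mu_F
   + \eta_i^{\top} \Pi_i \eta_i
   - \eta_F^{\top} \Pi_F \eta_F
   - \mu_i^{\top} P_i \mu_i \right), \\
P_i &= P_F + \Pi_i - \Pi_F, \qquad
\mu_i = P_i^{-1} \left( P_F \mu_F + \Pi_i \eta_i - \Pi_F \eta_F \right).
\end{aligned}
```

The first line holds because the reduced and full models share the same likelihood and differ only in their priors. Switching a parameter off corresponds to letting its prior variance in the reduced model approach zero, that is, letting the corresponding entry of Π_i become very large.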

First, we used this post hoc model selection procedure11 to identify the best model out of all possible connection architectures with Bayesian model selection (BMS). Second, we looked at family-level inferences over all possible models showing whether fixed connections existed and whether they were modulated. This is done by computing a joint posterior probability density over parameter estimates for a group of participants, by using the posterior from one participant as the prior for the next participant, whose posterior then serves as the prior for the next participant and so on33,34. The posterior probability is the probability that a model (or family of models) provides the best explanation for the measured data across participants35. The probabilities for all analyses were pooled in a fixed effects fashion, because we assumed that the underlying model structure did not vary across participants. The post-hoc optimisation also provides parameter estimates for individual participants that can be compared with conventional frequentist statistics34. Thus we present the simple average parameter estimates for the model with the highest evidence (the winning model) to elucidate the quantitative nature of the connection i.e. how much a connection is modulated24. The software implementation of the post-hoc optimisation for DCM can be found in the SPM function spm_dcm_post_hoc.m.
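The fixed effects pooling of model probabilities can be illustrated with a short sketch. It assumes a flat prior over models and conditionally independent participants, in which case passing each participant's posterior on as the next participant's prior is equivalent to summing log evidences across participants. The function names and the toy log evidences are hypothetical and are not results from this study.

```python
import numpy as np


def fixed_effects_bms(log_evidence):
    """Posterior model probabilities from per-participant log evidences.

    log_evidence has shape (n_participants, n_models). Under a flat prior
    over models, the group log evidence of a model is the sum of the
    participants' log evidences for that model.
    """
    group = np.asarray(log_evidence, dtype=float).sum(axis=0)
    group = group - group.max()  # subtract the maximum for numerical stability
    prob = np.exp(group)
    return prob / prob.sum()


def family_posterior(model_posterior, in_family):
    """Posterior probability of a family: sum over its member models."""
    mask = np.asarray(in_family, dtype=bool)
    return float(np.asarray(model_posterior)[mask].sum())


# Toy example: 3 participants, 3 models (made-up numbers).
log_ev = np.array([
    [-100.0, -102.0, -101.0],
    [-200.0, -203.0, -200.5],
    [-150.0, -151.0, -152.0],
])
posterior = fixed_effects_bms(log_ev)
family = family_posterior(posterior, [True, False, True])
print(posterior, family)
```

Because log evidences add up over participants, even modest but consistent differences per participant produce very high posterior probabilities at the group level, which is a known property of fixed effects inference.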

Volume of Interest Selection

To identify and summarise regional responses for further dynamic causal modelling we used standard procedures37. Timeseries from VOIs associated with the above contrasts were summarised using the SPM12b Eigenvariate toolbox: we extracted each participant's principal eigenvariate around the participant-specific local maxima activations nearest to the peak voxel of the group (between subject) GLM analysis (see Table 1 and 2). The radii of the VOI spheres were 6 mm and the search radii for local maxima from the group analysis were restricted to 20 mm. All voxels contributing to the eigenvariates were significant at p < 0.05 uncorrected and adjusted at p < 0.05 for the effects of interest (i.e. only for those regressors that were used in the DCMs for input or modulation). In order to replicate the results across sessions and hemispheres, we created separate DCMs for each hemisphere and each of the two sessions (four DCMs overall), which were then analysed together with repeated measures ANOVAs (see Figure 2B for a schematic of the model and Figure 4 for the results aggregated across participants, hemispheres and sessions). For each model, the first volume of interest (VOI) was based on maxima in the most active cluster of the All motion contrast (which was Animate and Inanimate motion over the implicit fixation baseline). These maxima were assigned by the SPM anatomy toolbox38 to MT+/V5 (sometimes called human occipital lobe area 5 (hOC5); right: 44 −64 4; left: −44 −74 4; see Table 1 and 2 for GLM results. Figure 3A and 3B show brain maps of the means of the extracted voxels of individual participants.). V5 was the most active region in our All Motion contrast and has been shown to be highly sensitive to visual motion12. The second VOI was extracted at the local maxima of the most active clusters in each hemisphere based on the results of a conjunction analysis39,40. 
A conjunction of activations allows one to infer a co-occurrence of several effects in one area40: an activation map of a conjunction analysis will show those voxels as significant that would be significant in the two conjoined effects. The conjunction used here was the effect of the contrast [All Motion > Fixation Cross] & (logical AND) the contrast [Animate > Inanimate Motion] – i.e. [All motion > Fixation cross & Animate – inanimate motion]. The conjunction was performed to consider areas more active in Animate vs. Inanimate motion, but only in motion sensitive areas (activated by any type of motion). We used the more conservative test, testing against the conjunction null, instead of testing against the global null40. The second VOI was extracted from the pSTS (sometimes called inferior parietal cortex (IPC; more specifically PGa and PFm), right: 54 −50 16; left: −56 −52 10). The pSTS was highly active bilaterally: the peaks were local maxima in the most active cluster of each hemisphere with t-values above 8. The pSTS has been frequently implicated in animate motion processing (see discussion). Note that V5 was not significantly more active in this contrast, which might suggest that the stimuli were indeed well matched in terms of low-level motion properties. Finally, V5 and the pSTS have been shown to have strong (and reciprocal) anatomical connectivity41.
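The principal eigenvariate used above to summarise each VOI can be sketched as the first component of a singular value decomposition of the voxel time series. This is a simplified illustration and not the SPM code itself: the function name, the mean-centring step and the synthetic data are assumptions, and the scaling and sign conventions are chosen to resemble common practice.

```python
import numpy as np


def principal_eigenvariate(Y):
    """Summarise a (n_scans, n_voxels) data matrix by its first eigenvariate."""
    Y = np.asarray(Y, dtype=float)
    Y = Y - Y.mean(axis=0)  # assumption: remove each voxel's mean first
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    # Fix the arbitrary sign so that the voxel weights sum to a positive value.
    sign = 1.0 if v.sum() >= 0 else -1.0
    n_voxels = Y.shape[1]
    # Scale so the result is in units comparable to a typical voxel.
    return sign * u * s[0] / np.sqrt(n_voxels)


# Synthetic VOI: 30 voxels sharing one signal with different positive weights.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, 6.0 * np.pi, 200))
weights = rng.uniform(0.5, 1.5, size=30)
Y_demo = np.outer(signal, weights) + 0.1 * rng.standard_normal((200, 30))

ts = principal_eigenvariate(Y_demo)
print(np.corrcoef(ts, signal)[0, 1])
```

Unlike a simple average over voxels, the eigenvariate weights voxels by how strongly they express the dominant temporal pattern, so it is less affected by voxels within the sphere that do not share that pattern.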

Table 1 GLM results. All motion over implicit baseline contrast

Table 2 GLM results. Conjunction analysis of All motion & Animate – Inanimate motion

Figure 3 (A) Brain map showing the left and right V5 VOIs. The colour gradient bar indicates how many participants had the mean of their extracted voxels at a given location. (B) Brain map showing the left and right pSTS VOIs. The colour gradient bar indicates how many participants had the mean of their extracted voxels at a given location.

Figure 4 (A): Probability density functions of parameter estimates for individual participants show how strongly (self-)connections were modulated by animacy across participants, hemispheres and sessions. Upper Panel: The forward connection from V5 to the pSTS was more strongly modulated by animacy than the homologous backward connection. Lower Panel: The intrinsic self-connection of the pSTS was significantly lower than that of V5 and one can clearly see that the inhibitory self-connection in the pSTS has decreased towards zero consistently more than the inhibitory self-connection in V5. In other words, the pSTS was no longer inhibited, which caused the pSTS activation observed in the Animate > Inanimate contrast. Figure 4 (B): Here the same data as in Figure 4A are plotted, showing all the different data points to highlight the consistency of the results.

Specification of dynamic causal models

Our particular interest was in the effect of animate motion processing on connections among sources in the distributed visual hierarchy. In particular, we wanted to know whether the effect of animate motion processing could be explained by changes – mediated by perceptual set (see for instance ref. 42) – in intrinsic and extrinsic connections. Furthermore, if these changes were in extrinsic connections, were they in the forward or backward connections? To answer these questions, we used Bayesian model selection (BMS) of reduced models following inversion of a full model specified as follows: The full model comprised reciprocal connections between the motion sensitive area V5 and the pSTS. The driving input into the model – represented by the DCM.C matrix24 – was the effect of All motion (Animate and Inanimate motion, modelled as a single regressor for both types of motion). This driving motion input entered either the posterior, lower region V5 (where it modelled extrageniculate input) or the higher cortical node, the pSTS. Our hypothesis was that V5 would be the first region to show sensitivity to the presence of motion and this would result in higher parameter estimates for V5 over the pSTS as the input region. V5 would subsequently influence activity in the pSTS region, but more so in the animate movement condition. These two cortical nodes were reciprocally coupled with extrinsic forward and backward connections, while intrinsic (self or recurrent) connections were treated as inhibitory. The effect of animacy was allowed to modulate all extrinsic and intrinsic connections. This full model was inverted for all participants and the resulting posterior densities over the connection strengths were used to perform family-wise Bayesian model comparisons using post hoc optimisation. The models considered correspond to all possible combinations of the 10 free coupling parameters – corresponding to 2^10 = 1024 reduced models.
The 10 parameters comprised 2 fixed intrinsic parameters, 2 fixed extrinsic parameters, 4 parameters controlling the modulation of fixed connectivity and 2 parameters controlling the driving effect of All motion. To examine the connectivity in quantitative terms, we then analysed the posterior distribution over connections under the model with the highest evidence, using the distribution of estimates over participants.
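The size of this model space can be checked with a small enumeration. The parameter labels below are hypothetical names that follow the breakdown given above (2 fixed intrinsic, 2 fixed extrinsic, 4 modulatory and 2 driving parameters); the A, B and C prefixes are an assumption based on the usual DCM matrix naming, and the sketch only counts models without estimating anything.

```python
from itertools import product

# Hypothetical labels for the 10 free parameters of the full model.
PARAMETERS = [
    "A: V5 self", "A: pSTS self",        # fixed intrinsic connections
    "A: V5 -> pSTS", "A: pSTS -> V5",    # fixed extrinsic connections
    "B: V5 self", "B: pSTS self",        # modulation of intrinsic connections
    "B: V5 -> pSTS", "B: pSTS -> V5",    # modulation of extrinsic connections
    "C: input -> V5", "C: input -> pSTS",  # driving input (All motion)
]

# Each reduced model switches every parameter either on (1) or off (0).
models = list(product([0, 1], repeat=len(PARAMETERS)))

# Example family: all models in which animacy modulates the forward connection.
forward = PARAMETERS.index("B: V5 -> pSTS")
forward_family = [m for m in models if m[forward] == 1]

print(len(models), len(forward_family))
```

A family defined by a single parameter being switched on always contains half of the model space, which is why family-level inference can be read as a statement about that one parameter.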

In summary, the effect of perceptual set (animate motion) was allowed to change the intrinsic and extrinsic connectivity throughout the hierarchy. We then tested a series of reduced models comparing the evidence for (changes in) intrinsic connectivity, extrinsic connectivity or both. The evidence for these different hypotheses or models was assessed using a variational free energy approximation based upon the post hoc optimisation of reduced versions of the full model. Having identified the model with the greatest evidence, we then characterised the effects of motion and animate motion processing quantitatively, by examining the connection strengths and their bilinear modulation (the model can be replicated with scripts available online (https://github.com/HaukeHillebrandt/SPM_connectome) – also see discussion).