Significance

Granger causality analysis is a statistical method for investigating the flow of information between time series. Granger causality has become more widely applied in neuroscience, due to its ability to characterize oscillatory and multivariate data. However, there are ongoing concerns regarding its applicability in neuroscience. When are these methods appropriate? How reliably do they recover the functional structure of the system? And what do they tell us about oscillations in neural systems? In this paper, we analyze fundamental properties of Granger causality and illustrate statistical and conceptual problems that make Granger causality difficult to apply and interpret in neuroscience studies. This work provides important conceptual clarification of Granger causality methods and suggests ways to improve analyses of neuroscience data in the future.

Abstract

Granger causality methods were developed to analyze the flow of information between time series. These methods have become more widely applied in neuroscience. Frequency-domain causality measures, such as those of Geweke, as well as multivariate methods, have particular appeal in neuroscience due to the prevalence of oscillatory phenomena and highly multivariate experimental recordings. Despite its widespread application in many fields, there are ongoing concerns regarding the applicability of Granger causality methods in neuroscience. When are these methods appropriate? How reliably do they recover the system structure underlying the observed data? What do frequency-domain causality measures tell us about the functional properties of oscillatory neural systems? In this paper, we analyze fundamental properties of Granger–Geweke (GG) causality, both computational and conceptual. Specifically, we show that (i) GG causality estimates can be either severely biased or of high variance, both leading to spurious results; (ii) even if estimated correctly, GG causality estimates alone are not interpretable without examining the component behaviors of the system model; and (iii) GG causality ignores critical components of a system’s dynamics. Based on this analysis, we find that the notion of causality quantified is incompatible with the objectives of many neuroscience investigations, leading to highly counterintuitive and potentially misleading results. Through the analysis of these problems, we provide important conceptual clarification of GG causality, with implications for other related causality approaches and for the role of causality analyses in neuroscience as a whole.

Granger causality is a statistical tool developed to analyze the flow of information between time series. Neuroscientists have applied Granger causality methods to diverse sources of data, including electroencephalography (EEG), magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI), and local field potentials (LFP). These studies have investigated functional neural systems at scales of organization from the cellular level (1–3) to whole-brain network activity (4), under a range of conditions, including sensory stimuli (5–7), varying levels of consciousness (8–10), cognitive tasks (11), and pathological states (12, 13). In such analyses, the time series data are interpreted to reflect neural activity from a particular source, and Granger causality is used to characterize the directionality, directness, and dynamics of influence between sources.

Oscillations are a ubiquitous feature of neurophysiological systems and neuroscience data. They are thought to constrain and organize neural activity within and between functional networks across a wide range of temporal and spatial scales (14–17). Oscillations at specific frequencies have been associated with different states of arousal (18), as well as different sensory (5) and cognitive processes (19, 20). The prevalence of these neural oscillations, as well as their frequency-specific functional associations, spurred initial neuroscientific interest in frequency-domain formulations of Granger causality, such as those developed by Geweke (21, 22). In addition, neuroscience data are often highly multivariate, making it difficult to distinguish direct from indirect influences between system components. Geweke’s conditional causality measure (22) suggested a means to assess direct influences within a larger network. Hence, the Granger–Geweke approach seemed to offer neuroscientists precisely what they wanted: an assessment of direct, frequency-dependent, functional influence between time series, in a straightforward extension of standard time series techniques. Current, widely applied tools for analyzing neuroscience data are based on this Granger–Geweke (GG) paradigm (23–25).

Many limitations of the GG approach are well known, including the requirements that the generative system be approximately linear, stationary, and time invariant. When applied to more general processes, such as neural spiking data (2, 3), continuous-time processes (26), and systems with exogenous inputs and latent variables (4, 27), the GG approach has been shown to produce results inconsistent with the known functional structure of the underlying system. These examples illustrate the perils of applying GG causality in situations where the generative system is poorly approximated by the vector autoregressive (VAR) model class. Other problems, such as negative causality estimates (28), have been observed even when the generative system belongs to the finite-order VAR model class. Together, these problems raise several important questions for neuroscientists interested in using GG methods: Under what circumstances are these methods appropriate? How reliably do these methods recover the functional structure underlying the observed data? And what do frequency-domain causality measures tell us about the functional properties of oscillatory neural systems?

In this paper, we perform an analysis of GG causality methods to help address these questions. We show how, due to the practical modeling choices required to compute GG causality, the structure of the VAR model can lead to bias in conditional causality estimates. We also show how, for similar reasons, the GG causality measures can be highly sensitive to variations in the estimated model parameters, particularly in the frequency domain, leading to spurious peaks and valleys and even negative values. Finally, we show how the functional properties of the underlying system are not clearly represented in Granger causality measures. As a result, the GG causality measures are prone to misinterpretation. We conclude by discussing how these findings apply to other related measures of causality and how many neuroscience investigations might be more meaningfully analyzed by characterizing system dynamics rather than causality.

Background

Granger Causality. Granger causality developed in the field of econometric time series analysis. Granger (29) formulated a statistical definition of causality based on the premises that (i) a cause occurs before its effect and (ii) knowledge of a cause improves prediction of its effect. Under this framework, a time series $\{x_{i,t}\}$ is Granger causal of another time series $\{x_{j,t}\}$ if inclusion of the history of $x_i$ improves prediction of $x_j$ over knowledge of the history of $x_j$ alone. Specifically, this is quantified by comparing the prediction error variances of the one-step linear predictor, $\hat{x}_{j,t}$, under two different conditions: one where the full histories of all time series are used for prediction and another where the putatively causal time series is omitted from the set of predictive time series. Thus, $\{x_{i,t}\}$ Granger causes $\{x_{j,t}\}$ if

$$\operatorname{var}(x_{j,t} - \hat{x}_{j,t} \mid x_{j,0:t-1}) > \operatorname{var}(x_{j,t} - \hat{x}_{j,t} \mid x_{j,0:t-1},\, x_{i,0:t-1}).$$

Granger causality can be generalized to multistep predictions, higher-order moments of the prediction distributions, and alternative predictors (30, 31). In practice, the above linear predictors are limited to finite-order VAR models, and Granger causality is assessed by comparing the prediction error variances from separate VAR models, with one model including all components of the vector time series data, which we term the full model, and a second one including only a subset of the components, which we term the reduced model. To investigate the causality from $\{x_{i,t}\}$ to $\{x_{j,t}\}$, let

$$\begin{bmatrix} x_{j,t} \\ x_{i,t} \end{bmatrix} = \sum_{p=1}^{P} \begin{bmatrix} A^f_{j,j}(p) & A^f_{j,i}(p) \\ A^f_{i,j}(p) & A^f_{i,i}(p) \end{bmatrix} \begin{bmatrix} x_{j,t-p} \\ x_{i,t-p} \end{bmatrix} + \begin{bmatrix} w^f_{j,t} \\ w^f_{i,t} \end{bmatrix} \qquad [1]$$

be the full VAR(P) model of all time series components, where the superscript $f$ denotes the full model. This model may be written more compactly as $x_t = \sum_{p=1}^{P} A^f(p)\, x_{t-p} + w^f_t$.
The noise processes $\{w^f_{j,t}\}$ and $\{w^f_{i,t}\}$ are zero mean and temporally uncorrelated, with covariance $E[w^f_{t_1}(w^f_{t_2})^T] = \Sigma^f \delta_{t_1 - t_2}$. Thus, the full one-step predictor of $x_{j,t}$ in the above causality definition is $\hat{x}^f_{j,t} = \sum_{p=1}^{P} \sum_{m \in \{i,j\}} A^f_{j,m}(p)\, x_{m,t-p}$. Similarly, let $x_{j,t} = \sum_{p=1}^{P} A^{r(i)}(p)\, x_{j,t-p} + w^{r(i)}_{j,t}$ be the reduced VAR(P) model of the $x_j$ (putative effect) components of the time series, omitting the $x_i$ (putative cause) components. We use the superscript $r(i)$ to denote this reduced model formed by omitting $x_i$. The noise process $\{w^{r(i)}_{j,t}\}$ is zero mean and temporally uncorrelated with covariance $\Sigma^{r(i)}$. The reduced one-step predictor of $x_{j,t}$ is thus $\hat{x}^{r(i)}_{j,t} = \sum_{p=1}^{P} A^{r(i)}(p)\, x_{j,t-p}$.

Geweke Time-Domain Causality Measures. Building on Granger’s definition, Geweke (21) defined a measure of directed causality (what he referred to as linear feedback) from $\{x_{i,t}\}$ to $\{x_{j,t}\}$ to be

$$F_{x_i \to x_j} = \ln \frac{|\operatorname{var}(x_{j,t} - \hat{x}^{r(i)}_{j,t})|}{|\operatorname{var}(x_{j,t} - \hat{x}^{f}_{j,t})|} = \ln \frac{|\Sigma^{r(i)}_{j,j}|}{|\Sigma^{f}_{j,j}|},$$

where $|\Sigma|$ denotes the determinant of $\Sigma$. Geweke (22) expanded this definition of unconditional (bivariate) causality to include conditional time series components $\{x_{k,t}\}$, purportedly making it possible to distinguish between direct and indirect influences. For example, consider three time series $\{x_{i,t}\}$, $\{x_{j,t}\}$, and $\{x_{k,t}\}$. By conditioning on $x_k$, it is to some extent possible to distinguish direct influences between $x_i$ and $x_j$ from indirect influences that are mediated by $x_k$. The conditional measure, $F_{x_i \to x_j \mid x_k}$, has a form analogous to the unconditional case, except that the predictors, $\hat{x}^f_{j,t}$ and $\hat{x}^{r(i)}_{j,t}$, and the associated prediction-error variances, $\Sigma^f_{j,j}$ and $\Sigma^{r(i)}_{j,j}$, all incorporate the conditional time series $\{x_{k,t}\}$.
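The unconditional time-domain measure can be illustrated numerically. The following sketch (a minimal illustration with hypothetical coefficients, not an implementation from this paper) simulates a bivariate VAR(2) in which $x_i$ drives $x_j$, fits the full and reduced models by least squares, and computes $F_{x_i \to x_j}$ from the residual variances:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bivariate VAR(2): x_i drives x_j through the A[:, 0, 1] lags.
P, T = 2, 20000
A = np.zeros((P, 2, 2))
A[0] = [[0.5, 0.3],   # row 0 = x_j equation: own lag + lag of x_i
        [0.0, 0.6]]   # row 1 = x_i equation: no feedback from x_j
A[1] = [[-0.2, 0.1],
        [0.0, -0.3]]

x = np.zeros((T, 2))
for t in range(P, T):
    x[t] = sum(A[p] @ x[t - 1 - p] for p in range(P)) + rng.standard_normal(2)

def ar_residual_var(y, regressors, P):
    """Least-squares fit of y_t on P lags of `regressors`; returns residual variance."""
    n = len(y)
    X = np.hstack([regressors[P - 1 - p: n - 1 - p] for p in range(P)])
    beta, *_ = np.linalg.lstsq(X, y[P:], rcond=None)
    resid = y[P:] - X @ beta
    return resid.var()

# Full model: predict x_j from lags of both series; reduced model: lags of x_j only.
var_full = ar_residual_var(x[:, 0], x, P)
var_reduced = ar_residual_var(x[:, 0], x[:, [0]], P)

F_i_to_j = np.log(var_reduced / var_full)
print(F_i_to_j)  # expected to be positive here, since x_i genuinely drives x_j
```

Note that the reduced process is, strictly speaking, a VARMA process, so fitting it with the same finite order $P$ is itself an approximation; this point becomes central in the Discussion.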
These time-domain causality measures have a number of important properties. First, they are theoretically nonnegative, equaling zero in the case of noncausality. Second, the total linear dependence between two time series can be represented as the sum of the directed causalities and an instantaneous causality (details in S2. Instantaneous Causality and Total Linear Dependence). Finally, these time-domain causality measures can further be decomposed by frequency, providing a frequency-domain measure of causality.

Geweke Frequency-Domain Causality Measures. The above time-domain measure of causality affords a spectral decomposition, which allowed Geweke (21) to also define an unconditional frequency-domain measure of causality. Let $X(\lambda) = H^f(\lambda) W^f(\lambda)$ be the frequency-domain representation of the moving-average (MA) form of the full model in Eq. 1. $X(\lambda)$ and $W^f(\lambda)$ are the Fourier representations of the vector time series $\{x_t\}$ and noise process $\{w^f_t\}$, respectively, and $H^f(\lambda)$ is the transfer function given by $H^f(\lambda) = \left(I - \sum_{p=1}^{P} A^f(p)\, e^{-ip\lambda}\right)^{-1}$. As alluded to above, the model can contain instantaneously causal components. The frequency-domain definition requires removal of the instantaneous causality components by transforming the system with a rotation matrix, as described in S3. Removal of Instantaneous Causality for Frequency-Domain Computation and ref. 21. For clarity, we omit this rotation from the present overview of frequency-domain causality, but fully implement the transformation in our computational studies that follow. The spectrum of $\{x_{j,t}\}$ is then

$$S_{x_j,x_j}(\lambda) = H^f_{j,j}(\lambda)\,\Sigma^f_{j,j}\,H^{f*}_{j,j}(\lambda) + H^f_{j,i}(\lambda)\,\Sigma^f_{i,i}\,H^{f*}_{j,i}(\lambda),$$

where $*$ denotes conjugate transpose. The first term is the component of the spectrum of $x_j$ due to its own input noise process, whereas the second term represents the components introduced by the $x_i$ time series.
The unconditional frequency-domain causality from $\{x_{i,t}\}$ to $\{x_{j,t}\}$ at frequency $\lambda$ is defined as

$$f_{x_i \to x_j}(\lambda) = \ln \frac{|S_{x_j,x_j}(\lambda)|}{|H^f_{j,j}(\lambda)\,\Sigma^f_{j,j}\,H^{f*}_{j,j}(\lambda)|} = \ln \frac{|H^f_{j,j}(\lambda)\,\Sigma^f_{j,j}\,H^{f*}_{j,j}(\lambda) + H^f_{j,i}(\lambda)\,\Sigma^f_{i,i}\,H^{f*}_{j,i}(\lambda)|}{|H^f_{j,j}(\lambda)\,\Sigma^f_{j,j}\,H^{f*}_{j,j}(\lambda)|}. \qquad [2]$$

If $\{x_{i,t}\}$ does not contribute to the spectrum of $\{x_{j,t}\}$ at a given frequency, the second term in the numerator of Eq. 2 is zero at that frequency, resulting in zero causality. Thus, the unconditional frequency-domain causality reflects the components of the spectrum originating from the input noise of the putatively causal time series. As with the time-domain measure, Geweke (22) expanded the frequency-domain measure to include conditional time series. And like the conditional time-domain measure, the conditional frequency-domain definition requires separate full and reduced models. Let $H^f(\lambda)$ and $\Sigma^f$ be the system function and noise covariance of the full model, and let $H^{r(i)}(\lambda)$ and $\Sigma^{r(i)}$ be the system function and noise covariance of the reduced model. As in the unconditional case, instantaneous causality components must be removed, this time from each model, as described in S3. Removal of Instantaneous Causality for Frequency-Domain Computation and ref. 22. Again, for clarity, we omit this rotation in the equations below, but fully implement this transformation in our computational studies that follow. The derivation for the conditional form of frequency-domain causality relies on the time-domain equivalence demonstrated by Geweke (22), $F_{x_i \to x_j \mid x_k} = F_{x_i w^{r(i)}_k \to w^{r(i)}_j}$. The conditional frequency-domain causality is then defined by $f_{x_i \to x_j \mid x_k}(\lambda) = f_{x_i w^{r(i)}_k \to w^{r(i)}_j}(\lambda)$. Thus, we must rewrite our original VAR model in terms of the time series $\{w^{r(i)}_{j,t}\}$, $\{x_{i,t}\}$, and $\{w^{r(i)}_{k,t}\}$.
Geweke (22) did this by cascading the full model with the inverse of an augmented form of the reduced model,

$$\begin{bmatrix} W^{r(i)}_j(\lambda) \\ X_i(\lambda) \\ W^{r(i)}_k(\lambda) \end{bmatrix} = \begin{bmatrix} H^{r(i)}_{j,j}(\lambda) & 0 & H^{r(i)}_{j,k}(\lambda) \\ 0 & I & 0 \\ H^{r(i)}_{k,j}(\lambda) & 0 & H^{r(i)}_{k,k}(\lambda) \end{bmatrix}^{-1} \begin{bmatrix} H^f_{j,j}(\lambda) & H^f_{j,i}(\lambda) & H^f_{j,k}(\lambda) \\ H^f_{i,j}(\lambda) & H^f_{i,i}(\lambda) & H^f_{i,k}(\lambda) \\ H^f_{k,j}(\lambda) & H^f_{k,i}(\lambda) & H^f_{k,k}(\lambda) \end{bmatrix} \begin{bmatrix} W^f_j(\lambda) \\ W^f_i(\lambda) \\ W^f_k(\lambda) \end{bmatrix} \qquad [3]$$

$$= \begin{bmatrix} G_{j,j}(\lambda) & G_{j,i}(\lambda) & G_{j,k}(\lambda) \\ G_{i,j}(\lambda) & G_{i,i}(\lambda) & G_{i,k}(\lambda) \\ G_{k,j}(\lambda) & G_{k,i}(\lambda) & G_{k,k}(\lambda) \end{bmatrix} \begin{bmatrix} W^f_j(\lambda) \\ W^f_i(\lambda) \\ W^f_k(\lambda) \end{bmatrix}. \qquad [4]$$

This combined system relates the reduced-model noise processes and the putatively causal time series to the full-model noise process. It can be viewed as the frequency-domain representation of a VARMA model of the time series $\{w^{r(i)}_{j,t}\}$, $\{x_{i,t}\}$, and $\{w^{r(i)}_{k,t}\}$. The spectrum of $\{w^{r(i)}_{j,t}\}$ is then

$$S_{w^{r(i)}_j, w^{r(i)}_j}(\lambda) = G_{j,j}(\lambda)\,\Sigma^f_{j,j}\,G^*_{j,j}(\lambda) + G_{j,i}(\lambda)\,\Sigma^f_{i,i}\,G^*_{j,i}(\lambda) + G_{j,k}(\lambda)\,\Sigma^f_{k,k}\,G^*_{j,k}(\lambda). \qquad [5]$$

Hence, the conditional frequency-domain causality is

$$f_{x_i \to x_j \mid x_k}(\lambda) = \ln \frac{|S_{w^{r(i)}_j, w^{r(i)}_j}(\lambda)|}{|G_{j,j}(\lambda)\,\Sigma^f_{j,j}\,G^*_{j,j}(\lambda)|} = \ln \frac{|\Sigma^{r(i)}_{j,j}|}{|G_{j,j}(\lambda)\,\Sigma^f_{j,j}\,G^*_{j,j}(\lambda)|}.$$

The last equality follows from the assumed whiteness of $\{w^{r(i)}_{j,t}\}$ with covariance $\Sigma^{r(i)}_{j,j}$. This equation is similar in form to the expression for unconditional frequency-domain causality in Eq. 2, suggesting, at least superficially, an analogous interpretation. However, the transformations detailed in Eq. 3 suggest a more nuanced interpretation.

Overview of Analysis. In the next section, we explore several issues that arise in the application of the conditional and unconditional GG causality measures.
We demonstrate these issues with simulated examples and discuss the practical implications of using such methods to analyze real data, especially neuroscience data. The simulations use finite-order VAR systems to ensure that the problems demonstrated are not due to inaccurate modeling of the generative process, but rather are inherent to GG causality. We place particular emphasis on the Geweke conditional approach (22) for several reasons. Although the unconditional approach (21) is more commonly used, the conditional approach enables some degree of distinction between direct and indirect influence and is therefore more relevant to the highly multivariate datasets common in neuroscience. Furthermore, the issues presented here have partly limited the use of the conditional approach and have motivated the proposal of related alternative causality measures. We also emphasize the Geweke frequency-domain measures, because they have motivated the adoption of Granger causality analysis and related methods in the neurosciences.
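As a concrete illustration of the unconditional frequency-domain measure, Eq. 2 can be evaluated directly when the VAR coefficients are known. The sketch below uses hypothetical VAR(2) coefficients and an identity noise covariance, chosen so that the instantaneous-causality rotation described above is unnecessary; it is an illustration of the definition, not an implementation from this paper:

```python
import numpy as np

# Hypothetical VAR(2) with known coefficients; component 0 is x_j, component 1 is x_i.
# Sigma is the identity, so no instantaneous-causality rotation is needed.
P = 2
A = np.zeros((P, 2, 2))
A[0] = [[0.5, 0.3], [0.0, 0.6]]
A[1] = [[-0.2, 0.1], [0.0, -0.3]]
Sigma = np.eye(2)

def transfer_function(A, lam):
    """H^f(lambda) = (I - sum_p A(p) e^{-i p lambda})^{-1}."""
    Abar = sum(A[p] * np.exp(-1j * (p + 1) * lam) for p in range(len(A)))
    return np.linalg.inv(np.eye(2) - Abar)

lams = np.linspace(0, np.pi, 256)
f_i_to_j = np.empty_like(lams)
for n, lam in enumerate(lams):
    H = transfer_function(A, lam)
    # Scalar components of Eq. 2: |.| reduces to squared magnitudes.
    intrinsic = np.abs(H[0, 0]) ** 2 * Sigma[0, 0]   # x_j's own-noise component
    causal = np.abs(H[0, 1]) ** 2 * Sigma[1, 1]      # component injected by x_i
    f_i_to_j[n] = np.log((intrinsic + causal) / intrinsic)

# Nonnegative at every frequency; zero would indicate no contribution from x_i.
print(f_i_to_j.max())
```

Because the causal term enters the numerator additively, the measure is nonnegative by construction at every frequency, consistent with the theoretical properties noted above.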

Discussion

Summary of Results. In this paper we analyzed several problems encountered in Granger causality analysis. We focused primarily on the Geweke frequency-domain measure and the case where the generative processes were finite VARs of known order. We found the following:

i) There is a troublesome bias–variance trade-off as a function of the VAR model order that can produce erroneous conditional GG causality estimates, including spurious peaks and valleys in the frequency domain, at both low and high model orders. At low model orders, the GG causality can be biased, because subsets of VAR processes (e.g., the putative effect and conditional nodes represented by the reduced model) are VARMA processes, which in general can be represented only by infinite-order VAR models. Crucially, significant biases occur even when the data-generating process is VAR and the true underlying model order is used. As the model order increases, the bias decreases, but each model order introduces new oscillatory modes, whose variance can then introduce spurious peaks in the frequency-domain GG causality.

ii) GG causality estimates are independent of the receiver dynamics, which can run counter to intuitive notions of causality intended to explain observed effects. Instead, GG causality estimates reflect a combination of the transmitter and channel dynamics, whose relative contributions can be understood only by examining the component dynamics of the estimated model. As a result, causality analyses, even for the simplest of examples, can be difficult to interpret.

Our results illustrate fundamental problems associated with GG causality, in terms of both its statistical properties and its conceptual interpretation. These problems have important practical consequences for neuroscience investigations using GG causality and related methods. Moreover, as we discuss below, these problems can be avoided in large part by paying more careful attention to the modeling and system identification steps that are fundamental to data analysis (Fig. 4).

Fig. 4. System identification framework. The processes of modeling and data analysis, sometimes referred to as system identification, involve a number of steps shown here (54, 55). Models of one form or another underlie all data analyses, and model selection involves several crucial choices: the class of models from which a model structure is selected, the method(s) by which candidate models are estimated, and the criteria by which model fits are optimized and a structure selected. Subsequent analysis steps also involve important choices, including the properties and statistics to be investigated and the methods by which they are computed. These choices are shaped by the scientific question of interest. Unfortunately, the use of causality analysis methods often hides this process and the associated choices from the investigator (1). Consequently, the many possible points of failure for causality methods are also hidden, such as inappropriate model structure (2), poor model estimation (3), poor representation of the features of interest (4), and poor statistical properties (5). More fundamentally, application of causality methods can obscure the scientific question of interest, resulting in misinterpretation (6). In addition, different causality measures with different properties or interpretations are often conflated (7).

Alternative Approaches for Estimating GG Causality. The potential for negative causality estimates and spurious peaks in frequency-domain GG causality has been recognized previously (28). The problem has been framed primarily as one in which conditional causality estimates are sensitive to mismatches between the separate full and reduced models; i.e., spurious values occur when the spectral features of the full and reduced models do not “line up” properly. Our work identifies the bias–variance trade-off in the VAR model order as a more fundamental difficulty. Model order selection is problematic even in the context of single-model parametric spectral estimation. AR models have the property that as model order is increased, the number of potential oscillatory modes increases, which can alter estimated peak locations. This problem is compounded when the full model must be compared with an approximate reduced model. Alternative approaches have been proposed for estimating conditional frequency-domain GG causality to eliminate the need for separate model fits. Chen et al. (28) suggested using a reduced model featuring a colored noise term formed from the components of the estimated full model. Whereas this method successfully eliminates negative causality values, it does not correctly partition the variance of the putative causal nodes within the reduced model (34). Thus, it is unclear whether the method described in Chen et al.
(28) can accurately estimate conditional GG causality. Barnett and Seth (25) have proposed fitting the reduced model and using it to directly compute the spectral components of Eq. 5. However, for the same reasons described above, these reduced-model estimates will generally require high model order representations that would be susceptible to spurious peaks. Moreover, this method would remain sensitive to mismatches between the spectral estimates of the components in Eqs. 3–5.

Potential Problems in Interpreting GG Causality Analyses in Neuroscience Studies. The statistical problems illustrated in this paper raise obvious concerns about the reliability of Granger causality estimates. However, the problems of interpreting causality values, even if they are estimated accurately, are more fundamental. Receiver independence for a scalar unconditional frequency-domain example was previously noted in ref. 35. In this work, we have shown that this property holds more generally, for the vector unconditional frequency-domain, the unconditional time-domain, and the conditional time-domain cases. We have also shown evidence for receiver independence in the conditional frequency-domain case. Here we also discuss the conceptual difficulty this poses in neuroscience. That GG causality is unaffected by and not reflective of the receiver dynamics is entirely understandable from the original principle underlying the definition of Granger causality, that of improved prediction. The use of improvement of prediction makes sense in the context of econometrics, from which GG causality originated, where the objective is often to construct parsimonious models to make predictions. The functional interpretation of the models is considered less important in this context than the accuracy of the predictions.
However, in neuroscience investigations, the opposite is generally true: Fundamentally, a mechanistic understanding of the neurophysiological system is sought, and as a result the structure and functional interpretation of the models are most important. In particular, neuroscience investigations seek to determine the mechanisms that produce “effects” observed at a particular site within a neural system or circuit as a function of inputs or “causes” observed at other locations. The effect node dynamics are central to this question, as they determine how the effect node responds to different inputs. In contrast, the predictive notion of causality that underlies Granger causality methods explicitly ignores the effect node dynamics and effect size. There are numerous examples in neuroscience that illustrate the importance of the effect or receiver node dynamics and effect size. For example, the dynamics of high-threshold thalamocortical neurons are governed by the degree of membrane hyperpolarization, producing tonic firing at the lowest levels of hyperpolarization, burst firing at alpha (∼10 Hz) frequencies at intermediate levels of hyperpolarization, and burst firing at low frequencies (<1 Hz) at even greater levels of hyperpolarization (36). If we viewed the firing behavior or postsynaptic fields generated by these neurons as the effect to be explained, the Granger methods would be blind to their frequency response, which is fundamental to their neurophysiology. As another example, studies of cognition and memory using fMRI have demonstrated that the amplitude of blood–oxygen-level–dependent responses can be modulated by stimulus novelty and cognitive complexity (37). Characterizations of these effects with Granger causality methods would be similarly indeterminate in explaining these amplitude-dependent effects that are central to the cognitive processes being studied. As a final example, we note a recent study by Epstein et al.
(38) in which Granger causality was used to analyze electrocorticogram recordings of seizure activity in epilepsy patients, with the goal of informing epilepsy surgery planning. Epileptic seizures are observed electrophysiologically as the localized onset and propagation of characteristic oscillations, often of large amplitude. The seizure effect is therefore defined by both the system dynamics and the amplitude of the electrophysiological response. Unfortunately, because GG causality is indifferent to the receiver dynamics and the amplitude of the measured effects, it cannot specify which influences, for instance at specific foci and frequencies, contribute to the seizure effect. In particular, an epileptogenic focus would not necessarily have large GG causality values, because the seizure onset and propagation may be due to the effect node’s internal dynamics. At the same time, large GG causality values do not necessarily indicate a focus, because they relate to prediction error variance and not to the effect at the recording site. These examples illustrate how Granger causality methods, due to the receiver-independence property, can fail to characterize essential neurophysiological effects of interest and lead to misinterpretation of the causes for those effects. These examples are representative of typical neuroscience problems seeking the “cause” of an “effect,” implying that the misapplication of Granger methods is likely widespread within neuroscience.

Implications for Other Causality Methods. The conceptual disagreement between GG causality and the mechanistic interpretation of data from neuroscience studies such as Epstein et al. (38) stems fundamentally from a disconnect between the system properties of interest, given the scientific question, and those represented by GG causality.
Beyond GG causality, numerous alternative causality measures have been proposed, including the directed transfer function (DTF) (39, 40), the partial directed coherence (PDC) (41), and the direct directed transfer function (dDTF) (42, 43). Like GG causality, these causality measures are based on a VAR model, but each of these methods represents a different functional combination of the individual system components and thus a conceptually distinct notion of causality. The DTF represents the transfer function, not from the transmitter to the receiver, but instead from an exogenous white noise input at the transmitter to the receiver. Depending on the normalization used, it may actually include the dynamics of all system components. The DTF essentially captures all information passing from the transmitter directed to the receiver, but this information flow is not direct (42, 43) (Fig. S5). The dDTF claims to characterize direct influences by normalizing the DTF by the partial coherence. The PDC, on the other hand, represents the actual channel between two nodes (44) but, depending on how it is normalized, may contain the frequency content of other components (Fig. S5). These measures all differ from GG causality, which represents a combination of the transmitter and direct channel dynamics, as well as the conditional node variances and their indirect channel dynamics.

Fig. S5. Diagrammatic comparison of components of several causality measures. The directed graph represents a VAR system of nodes $x_j$, $x_i$, and $x_k$ with white noise inputs $w_j$, $w_i$, and $w_k$. The system components reflected in several causality measures from $x_i$ to $x_j$ are compared. The ARMA relation from $x_i$ to $x_j$ is shown at the top. The transfer function (TF) from $x_i$ to $x_j$ is highlighted in blue and comprises the channel and receiver dynamics. The unnormalized PDC is highlighted in red and contains just the channel. The one-step detection test, noted in Lütkepohl (30), is highlighted in orange and assesses whether the AR coefficients of the channel are all zero. The unnormalized DTF is highlighted in purple; it is the cross-component of the right-sided MA form of the system, or equivalently of the inverse of the left-sided VAR form, and, due to the inverse, possibly contains information from all components. GG causality is highlighted in green and, as we have shown, primarily reflects the channel and transmitter dynamics, but also includes indirectly connected channels and the input noise variances.

Each of the above measures, with the exception of the dDTF, is based on a single-model estimate and thus does not suffer from the model subset or sensitivity issues we have described for GG causality. However, similar to GG causality, all of these methods can be problematic to interpret, because each measure represents a combination of different system components whose contributions cannot be disentangled by examining the causality values alone. Thus, studies using these methods may be subject to a mismatch between the system properties represented by the measures and the properties of interest to the investigator. Despite the clear fundamental differences in how these and other causality measures are defined, these methods are often discussed as interchangeable alternatives or referred to generically as “Granger causality” (35, 44, 45), making it even more difficult to interpret studies using these methods. In many cases, investigators may wish to detect connections between nodes in a system and may not be concerned with interpreting the form or dynamics of such connections. In such cases, under a VAR model, the direct connection between nodes is represented by the corresponding cross-components of the VAR parameter matrices, which indicate the absence of a connection if all of these parameters are equal to zero.
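A connection-detection test of this kind can be sketched as a standard single-equation Wald test on the cross-coefficients. The example below is a hypothetical illustration (our own construction, not code from the cited literature) using a simulated VAR(2) with a genuine $x_i \to x_j$ connection:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulate a hypothetical bivariate VAR(2) with a genuine x_i -> x_j connection.
P, T = 2, 5000
A = np.zeros((P, 2, 2))
A[0] = [[0.5, 0.3], [0.0, 0.6]]
A[1] = [[-0.2, 0.1], [0.0, -0.3]]
x = np.zeros((T, 2))
for t in range(P, T):
    x[t] = sum(A[p] @ x[t - 1 - p] for p in range(P)) + rng.standard_normal(2)

# OLS fit of the x_j equation on P lags of both series.
# Columns of X: [x_j lag 1, x_i lag 1, x_j lag 2, x_i lag 2].
X = np.hstack([x[P - 1 - p: T - 1 - p] for p in range(P)])
y = x[P:, 0]
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
s2 = resid @ resid / (len(y) - X.shape[1])
cov_beta = s2 * np.linalg.inv(X.T @ X)

# Wald test of H0: all cross-coefficients A_{j,i}(p) = 0.
cross = [1, 3]                       # columns corresponding to the x_i lags
b = beta[cross]
W = b @ np.linalg.solve(cov_beta[np.ix_(cross, cross)], b)
p_value = stats.chi2.sf(W, df=len(cross))
print(W, p_value)  # H0 should be rejected for this system
```

Such a test addresses presence or absence of a direct connection without requiring a reduced-model fit, which is why it avoids the model-subset issues discussed above.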
Hypothesis tests for such connections have been described in the time series analysis literature (30, 46). This approach is analogous to that proposed by Kim et al. (1) for point process models of neural spiking data. To our knowledge, the VAR versions (30, 46) of this approach have not been used in neuroscience applications, but they may offer advantages in computational and statistical efficiency compared with the more widely applied permutation-based tests on causality measures, such as those outlined in ref. 25. A number of causality approaches have been developed that purportedly handle a wider range of generative dynamic systems, beyond those that might be well approximated by VAR models. Transfer entropy methods (31, 47, 48) can be viewed as a generalization of the Granger approach and are applicable to nonlinear systems. However, such methods require significant amounts of data, are computationally intensive, and present their own model identification issues, such as selection of the embedding order. These approaches are strictly time-domain measures, and their similarity to GG causality (31) suggests that they would face similar interpretational problems, such as independence from the receiver dynamics. Dynamic causal modeling (DCM) uses neurophysiologically plausible dynamic models to characterize causal relationships (49). The models, estimation methods, and notions of causality involved in DCM are different from the time series-based causality approaches discussed here. The interpretation and statistical properties of DCM are the subject of ongoing analysis (50–53).

Causality Analysis in the Framework of System Identification. The problems in causality methods that we have discussed arise when these methods are applied without clear reference to the underlying models, their estimation methods and statistical measures, or the scientific question of interest, that is, without clear reference to the process of system identification (54, 55).
System identification comprises the set of steps undertaken to study a scientific question (Fig. 4), including the processes of model formulation and model selection. Although these steps can be challenging and complex, they are essentially a quantitative formalization of the scientific method. Unfortunately, causality methods have been used, in effect, to circumvent many of these painstaking steps (Fig. 4, annotation 1). As we have seen in our analysis, this can mask many potential points of failure, including poor representation of the system by the model, and computational and interpretational problems with statistics computed from the model. Perhaps more importantly, the use of causality methods supplants a clear statement of the scientific question of interest with an ambiguous statistic and a loaded concept that carries philosophical connotations, detracting from the question at hand. In many cases, the scientific question of interest is simply to determine the presence or absence of a direct connection from one region to another. This question can often be answered directly from the estimated model. In the case of VAR models, as discussed earlier, a hypothesis test could be performed on the off-diagonal VAR components (30). In the transfer function case, this is given by the unnormalized PDC. In DCM, this is explicitly represented in the selected model’s connections (49). In other cases, the scientific question centers on characterizing the mechanistic interactions between components of a system. As we have argued, this can be ascertained by examining the component dynamics from the estimated model. For instance, in the analysis of electrocorticographic oscillations in nonhuman primate somatosensory cortex described in ref. 5, one could examine the component node and channel dynamics to characterize more specific interactions between system components. 
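As a concrete illustration of this kind of connection test, the sketch below (our own construction, with arbitrary node labels and coefficients) fits a bivariate VAR(1) by ordinary least squares and applies a Wald test to a single cross-coefficient, following the standard VAR coefficient test described in Lütkepohl (30):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a bivariate VAR(1): x1 is AR(1); x1 drives x2 (A21 = 0.5).
A = np.array([[0.8, 0.0],
              [0.5, 0.3]])
n = 3000
x = np.zeros((n, 2))
for t in range(1, n):
    x[t] = A @ x[t - 1] + rng.standard_normal(2)

def wald_cross_coefficient(x, effect, cause):
    """Wald statistic for H0: the VAR(1) coefficient from `cause` to `effect` is 0."""
    X = x[:-1]                      # regressors: x_{t-1}
    y = x[1:, effect]               # response: x_{effect, t}
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y        # OLS estimate of row `effect` of A
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    var_beta = sigma2 * XtX_inv     # OLS coefficient covariance
    return beta[cause] ** 2 / var_beta[cause, cause]

w_1_to_2 = wald_cross_coefficient(x, effect=1, cause=0)  # true connection
w_2_to_1 = wald_cross_coefficient(x, effect=0, cause=1)  # absent connection
print(w_1_to_2, w_2_to_1)
```

With a single restriction, each statistic is compared with the chi-squared critical value with one degree of freedom (3.84 at the 5% level); the true connection yields a very large statistic, while the absent connection typically does not.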
Yet in other cases, the scientific question may relate to predicting the effects that structural changes might have on system dynamics. For instance, Epstein et al. (38) wanted to evaluate whether surgical resections could alleviate seizures. This question could also be investigated from the estimated model, by removing connections or nodes, simulating data using the model, and then evaluating the influence of these interventions on the generation of seizure oscillations. The validity of such statements would be contingent on the applicability of the model and would require experimental validation. However, by structuring the problem in terms of the dynamic systems model, this validation is more straightforward than with causality measures. In the end, causality measures, however elaborate in construction, are simply statistics estimated from the model (Fig. 4, model statistics). If the model does not adequately represent the system properties of interest, subsequent analyses based on the model will fail to address the scientific question. The VAR model class underlying GG causality and other causality measures represents linear relationships within stationary time series data. Although VAR models can, in many cases, adequately approximate properties of nonlinear systems, in general this structure limits the system behaviors that can be modeled. For example, VAR models cannot explicitly represent cross-frequency coupling. Thus, although a model structure may fit some features of the data, its representation of other features and applicability to the system as a whole may not be adequate and must be considered carefully (Fig. 4, postulate models). The inability of the model to represent key features of interest can in turn lead to erroneous assessments of causality (2, 3), independent of the computational and interpretational problems that we describe in this paper.
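As a sketch of this kind of model-based intervention (our own illustration with arbitrary parameters, not the system analyzed in ref. 38), one can compute the theoretical spectrum of a VAR model before and after severing a connection and compare the power at the downstream node:

```python
import numpy as np

# Hypothetical bivariate VAR(1): node 0 drives node 1.
A_full = np.array([[0.9, 0.0],
                   [0.6, 0.5]])
A_cut = A_full.copy()
A_cut[1, 0] = 0.0          # "resect" the 0 -> 1 connection
Sigma_w = np.eye(2)

def spectrum(A, Sigma, lam):
    """VAR(1) spectral matrix S(lam) = H(lam) Sigma H(lam)*, H = (I - A e^{-i lam})^{-1}."""
    H = np.linalg.inv(np.eye(2) - A * np.exp(-1j * lam))
    return H @ Sigma @ H.conj().T

lams = np.linspace(0, np.pi, 512)
power_full = sum(spectrum(A_full, Sigma_w, l)[1, 1].real for l in lams)
power_cut = sum(spectrum(A_cut, Sigma_w, l)[1, 1].real for l in lams)
print(power_full, power_cut)  # downstream power drops once the connection is removed
```

The same machinery extends directly to fitted models: replace the hypothetical matrices with estimated VAR parameters and compare the simulated or theoretical dynamics under each structural modification.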
We note that such failures of causality analysis are frequently discussed in the literature, often without identifying model misspecification as the key point of failure. Overall, the problems of GG causality analyzed in this paper, the related problems of other causality methods, and the ongoing confusion surrounding these methods can, in most cases, be resolved by a careful consideration of system identification principles (Fig. 4). Moreover, careful modeling and analysis likely obviate the need for causality measures and their many potential pitfalls.

S1. Relationships Between Different Causality Measures and the System Components of the Underlying VAR Model

As discussed in the main text, many of the causality measures used in neuroscience data analysis are derived from an underlying VAR model. The construction of these causality measures incorporates different components of the VAR system, as illustrated in Fig. S5. To correctly interpret analyses using these methods, it is important to understand the relationships between the causality measures and the components of the underlying VAR model. To make the interrelationships between the different causality measures shown in Fig. S5 more clear, we derive here an expression for the transfer function between the putative causal node $x_i$ and the putative effect node $x_j$. From the full VAR model, we can write the model for the single component $x_j$ in autoregressive moving-average (ARMA) form,
$$A_{j,j}(\lambda) X_j(\lambda) = -A_{j,i}(\lambda) X_i(\lambda) + \sum_{k \notin \{j,i\}} -A_{j,k}(\lambda) X_k(\lambda) + W_j(\lambda).$$
Inverting the left-hand AR transfer function,
$$X_j(\lambda) = -\frac{A_{j,i}(\lambda)}{A_{j,j}(\lambda)} X_i(\lambda) + \sum_{k \notin \{j,i\}} -\frac{A_{j,k}(\lambda)}{A_{j,j}(\lambda)} X_k(\lambda) + \frac{1}{A_{j,j}(\lambda)} W_j(\lambda),$$
we obtain the transfer function from $x_i$ to $x_j$, i.e., $-A_{j,i}(\lambda)/A_{j,j}(\lambda)$ (and analogously for the other nodes $x_k$). It is clear from this expression that the transfer function between $x_i$ and $x_j$, the (unnormalized) PDC, and the one-step detection test (30) share in common the channel dynamics represented in $\{A_{j,i}(p)\}$. In contrast, the (unnormalized) DTF, whose expression is shown in the diagram in Fig. S5 and derived in S3 (Removal of Instantaneous Causality for Frequency-Domain Computation), represents the relationship between the white noise input $w_i$ and the putative effect node $x_j$.

S2. Instantaneous Causality and Total Linear Dependence

In addition to the unconditional and conditional directed causality measures, $F_{i \to j}$ and $F_{i \to j \mid k}$, Geweke (21, 22) also defined associated (undirected) instantaneous causality measures. The unconditional instantaneous causality is
$$F_{i \cdot j} = \ln \frac{|\mathrm{var}(x_{j,t} - \hat{x}^{f}_{j,t})| \cdot |\mathrm{var}(x_{i,t} - \hat{x}^{f}_{i,t})|}{|\mathrm{var}(x_{\cdot,t} - \hat{x}^{f}_{\cdot,t})|} = \ln \frac{|\Sigma^{f}_{j,j}| \, |\Sigma^{f}_{i,i}|}{|\Sigma^{f}_{\cdot,\cdot}|}, \tag{S1}$$
where the subscript "$\cdot$" denotes all components of the system, i.e., $\{j, i\}$ in the unconditional case. The expression for the instantaneous causality in the conditional case, $F_{i \cdot j \mid k}$, is identical, with the conditional components $k$ included in the models. An alternative measure quantifying the dependence between two time series, e.g., between $\{x_{i,t}\}$ and $\{x_{j,t}\}$, is the total linear dependence,
$$F_{i,j} = \ln \frac{|\mathrm{var}(x_{j,t} - \hat{x}^{r(i)}_{j,t})| \cdot |\mathrm{var}(x_{i,t} - \hat{x}^{r(j)}_{i,t})|}{|\mathrm{var}(x_{\cdot,t} - \hat{x}^{f}_{\cdot,t})|} = \ln \frac{|\Sigma^{r(i)}_{j,j}| \, |\Sigma^{r(j)}_{i,i}|}{|\Sigma^{f}_{\cdot,\cdot}|}.$$
Hence, one additional motivation for Geweke's definitions of directed and instantaneous causality (21) is that they compose a time-domain decomposition of the total linear dependence, such that
$$F_{i,j} = F_{i \to j} + F_{j \to i} + F_{i \cdot j}.$$
The expression for the total linear dependence in the conditional case, $F_{i,j \mid k}$, is identical, such that the conditional definitions of directed and instantaneous causality (22) similarly compose a time-domain decomposition,
$$F_{i,j \mid k} = F_{i \to j \mid k} + F_{j \to i \mid k} + F_{i \cdot j \mid k}.$$

S3. Removal of Instantaneous Causality for Frequency-Domain Computation

The definition of the unconditional frequency-domain causality as a directed decomposition of the spectrum requires any instantaneous correlation between the noise processes, i.e., the instantaneous causality $F_{i \cdot j}$ in Eq. S1, to be removed. This can be achieved by applying a transformation to the system that block diagonalizes the noise covariance $\Sigma^{f}$ while preserving the model spectrum $S(\lambda) = H^{f}(\lambda) \Sigma^{f} (H^{f}(\lambda))^{*}$. Postmultiplying the system function by $D^{-1}$ and premultiplying the noise process by $D$, such that
$$S(\lambda) = (H^{f}(\lambda) D^{-1})(D \Sigma^{f} D^{T})(D^{-T} (H^{f}(\lambda))^{*}),$$
where
$$D = \begin{bmatrix} I & 0 \\ -\Sigma^{f}_{ij} (\Sigma^{f}_{jj})^{-1} & I \end{bmatrix},$$
leaves the output spectrum invariant, but rotates the system to a model with block-diagonal input noise covariance $\tilde{\Sigma}^{f} = D \Sigma^{f} D^{T}$ and transformed system function $\tilde{H}^{f}(\lambda) = H^{f}(\lambda) D^{-1}$. Here we follow our ordering convention from the main text: when investigating the causality from $x_i$ to $x_j$, the effect components are ordered before the causal components, i.e., $x_t = [x_{j,t}^{T} \; x_{i,t}^{T}]^{T}$. In the conditional case, the frequency-domain causality computation requires removal of the instantaneous causality from both the full model and the reduced model. For the reduced model, the system function $H^{r(i)}(\lambda)$ and noise covariance $\Sigma^{r(i)}$ are transformed, as above, to $\tilde{H}^{r(i)}(\lambda)$ and $\tilde{\Sigma}^{r(i)}$, using
$$D^{r(i)} = \begin{bmatrix} I & 0 \\ -\Sigma^{r(i)}_{kj} (\Sigma^{r(i)}_{jj})^{-1} & I \end{bmatrix}.$$
Similarly, the system function $H^{f}(\lambda)$ and noise covariance matrix $\Sigma^{f}$ of the full model are transformed to $\tilde{H}^{f}(\lambda)$ and $\tilde{\Sigma}^{f}$, but in this case, the transformation matrix for the full model [appendix in Chen et al. (28)] is given by
$$D^{f} = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ 0 & -\left(\Sigma^{f}_{ki} - \Sigma^{f}_{kj} (\Sigma^{f}_{jj})^{-1} \Sigma^{f}_{jk}\right)\left(\Sigma^{f}_{ii} - \Sigma^{f}_{ij} (\Sigma^{f}_{jj})^{-1} \Sigma^{f}_{ji}\right)^{-1} & I \end{bmatrix} \begin{bmatrix} I & 0 & 0 \\ -\Sigma^{f}_{ij} (\Sigma^{f}_{jj})^{-1} & I & 0 \\ -\Sigma^{f}_{kj} (\Sigma^{f}_{jj})^{-1} & 0 & I \end{bmatrix}.$$
Note that the nature of the transformations now makes the causality computation dependent on the order of the components in the vector time series. Again, we follow our ordering convention from the main text: when investigating the causality from $x_i$ to $x_j$ conditional on $x_k$, the effect components are ordered before the causal components in both the full and reduced models, and the conditional components are appended to the end of the full model, i.e., $\{j, i\}$ and $\{j, i, k\}$.
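The effect of this rotation can be checked numerically. The sketch below (our own, with an arbitrary 2 × 2 example) verifies that $D$ block diagonalizes the noise covariance while leaving the spectral matrix unchanged:

```python
import numpy as np

# Ordering convention: effect component j first, then cause i.
Sigma_f = np.array([[1.0, 0.4],   # Sigma_jj, Sigma_ji
                    [0.4, 1.0]])  # Sigma_ij, Sigma_ii

# D = [[I, 0], [-Sigma_ij Sigma_jj^{-1}, I]] (scalar blocks here).
D = np.array([[1.0, 0.0],
              [-Sigma_f[1, 0] / Sigma_f[0, 0], 1.0]])
Sigma_tilde = D @ Sigma_f @ D.T       # transformed noise covariance

# Transfer function of an arbitrary stable VAR(1) at one frequency.
A = np.array([[0.5, 0.1],
              [0.3, 0.4]])
lam = 0.7
H = np.linalg.inv(np.eye(2) - A * np.exp(-1j * lam))
H_tilde = H @ np.linalg.inv(D)        # transformed system function

S = H @ Sigma_f @ H.conj().T          # original spectrum
S_tilde = H_tilde @ Sigma_tilde @ H_tilde.conj().T
print(Sigma_tilde[0, 1], np.max(np.abs(S - S_tilde)))
```

The off-diagonal element of the transformed noise covariance is zero, and the two spectral matrices agree to machine precision, as the algebra requires.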

S4. MA Matrix Recursions for Invertible VAR Systems

In the computation of GG causality, the full and reduced models are assumed to be stable and invertible so that the transfer functions can be computed. The VAR(P) system
$$x_t - \sum_{p=1}^{P} A(p) x_{t-p} = w_t$$
is invertible if $\det\left(I - \sum_{p=1}^{P} A(p) z^{p}\right) \neq 0$ for $|z| \leq 1$. The transfer function is then
$$H(\lambda) = \left(I - \sum_{p=1}^{P} A(p) e^{-ip\lambda}\right)^{-1}.$$
The transfer function can also be expressed as the Fourier transform of the MA matrices. In MA form, the system is
$$x_t = \sum_{p=0}^{\infty} \Phi(p) w_{t-p},$$
giving the following alternative expression for the transfer function,
$$H(\lambda) = \sum_{p=0}^{\infty} \Phi(p) e^{-ip\lambda}.$$
Although we do not make use of the MA matrices in the main text, for convenience we note here that the MA matrices can be computed recursively from the AR matrices [Lütkepohl (30)]:
$$\Phi(0) = I_M, \qquad \Phi(p) = \sum_{j=1}^{p} \Phi(p-j) A(j) \quad \text{for } p = 1, 2, \ldots,$$
with the convention $A(j) = 0$ for $j > P$.
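A direct implementation of this recursion (our own sketch, with an arbitrary stable bivariate example) and a numerical check against the inverse-form transfer function:

```python
import numpy as np

def ma_matrices(A_list, n_terms):
    """MA matrices via Phi(0) = I, Phi(p) = sum_{j=1}^{p} Phi(p-j) A(j), A(j) = 0 for j > P."""
    M = A_list[0].shape[0]
    Phi = [np.eye(M)]
    for p in range(1, n_terms):
        Phi.append(sum(Phi[p - j] @ A_list[j - 1]
                       for j in range(1, min(p, len(A_list)) + 1)))
    return Phi

# Stable bivariate VAR(1) example.
A_list = [np.array([[0.5, 0.0],
                    [0.3, 0.4]])]
Phi = ma_matrices(A_list, 60)

# H(lam) from the AR inverse vs. the (truncated) MA Fourier sum.
lam = 1.1
H_inv = np.linalg.inv(np.eye(2) - A_list[0] * np.exp(-1j * lam))
H_ma = sum(Phi_p * np.exp(-1j * p * lam) for p, Phi_p in enumerate(Phi))
print(np.max(np.abs(H_inv - H_ma)))  # small truncation error for a stable system
```

For a VAR(1), the recursion reduces to $\Phi(p) = A^{p}$, so the truncated MA sum converges geometrically to the inverse-form transfer function.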

S5. Construction of Simulated Examples

In the examples presented in the main text, we used simulated VAR systems with spectral peaks at distinct frequencies. Within the VAR model class, these peaks can be generated by the univariate AR behavior, what we refer to as the internal dynamics, of the individual nodes. For a univariate AR system, a spectral peak requires a model order of two, i.e.,
$$x_t = A(1) x_{t-1} + A(2) x_{t-2} + w_t,$$
and every increase of the order by two adds a peak. The modes of the system are given by the roots of the characteristic polynomial, or equivalently the eigenvalues of the VAR(1)-form system matrix. In the AR(2) case, this is
$$\det\left(\lambda I - \begin{bmatrix} A(1) & A(2) \\ I & 0 \end{bmatrix}\right) = \lambda^{2} - A(1)\lambda - A(2) = 0. \tag{S2}$$
To have an oscillatory peak, the roots must be a complex conjugate pair, such that the characteristic equation can be factored as
$$(\lambda - re^{i\theta})(\lambda - re^{-i\theta}) = 0. \tag{S3}$$
Expanding the left-hand side of Eq. S3 and using Euler identities reduces the expression to
$$\lambda^{2} - r(e^{i\theta} + e^{-i\theta})\lambda + r^{2} = \lambda^{2} - 2r\cos(\theta)\lambda + r^{2}.$$
Matching terms with Eq. S2 gives expressions for the AR(2) parameters in terms of the desired pole amplitude and frequency,
$$A(2) = -r^{2}, \qquad A(1) = 2r\cos(\theta).$$
Here, $\theta$ is the pole angle in radians. It is related to the frequency $f$ in hertz by $\theta = 2\pi f \Delta t$, where the assumed sampling rate is $1/\Delta t$ Hz. These expressions were used to set the resonance frequencies of the individual nodes in the simulated examples in the main text. For single nodes, i.e., univariate AR, $r < 1$ ensures stability. In the multivariate case, the overall stability of the system is guaranteed only if all eigenvalues of the VAR(1)-form system matrix are within the unit circle. Note that whereas the pole location is given by $\theta$, this frequency is not the exact maximum of the spectrum, only an approximation.
It is possible to derive an analytical expression for the spectral peak frequency in terms of r and θ [for example, Priestley (56)], but the expression is quite cumbersome. For ease of discussion, in the main text, we have referred to the pole location θ as the spectral peak location. Although this is not strictly true, the approximation in our case is reasonable.
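These relations can be verified numerically. The sketch below (our own, with arbitrary choices of $r$, target frequency, and sampling rate) builds an AR(2) with a pole at 10 Hz and locates the resulting spectral peak, which falls close to, but not exactly at, the pole frequency:

```python
import numpy as np

fs = 1000.0            # sampling rate in Hz (so dt = 1/fs)
f_target = 10.0        # desired resonance frequency in Hz
r = 0.98               # pole magnitude (< 1 for stability)

theta = 2 * np.pi * f_target / fs   # pole angle in radians
a1 = 2 * r * np.cos(theta)          # A(1) = 2 r cos(theta)
a2 = -r ** 2                        # A(2) = -r^2

# Theoretical AR(2) spectrum (up to the constant noise variance):
lams = np.linspace(1e-4, np.pi, 8192)
denom = np.abs(1 - a1 * np.exp(-1j * lams) - a2 * np.exp(-2j * lams)) ** 2
f_peak = lams[np.argmin(denom)] * fs / (2 * np.pi)
print(f_peak)  # near, but slightly below, the 10 Hz pole frequency
```

For $r$ close to 1, the discrepancy between the pole angle and the spectral maximum is small, consistent with the approximation used in the main text.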

S6. Receiver Independence of Unconditional Frequency-Domain GG Causality

In this section, we show that the unconditional frequency-domain GG causality is independent of the receiver dynamics. Without loss of generality, we demonstrate that the causality from transmitter $x_1$ to receiver $x_2$ is independent of the parameters governing the internal receiver dynamics $\{A_{22}(p)\}$. We begin by writing the full-model transfer function $H(\lambda)$ in terms of block components. The transfer function is the inverse of the Fourier transform of the VAR, $A(\lambda) = I - \sum_{p=1}^{P} A(p) e^{-ip\lambda}$, so from the matrix inversion lemma
$$\begin{bmatrix} H_{11}(\lambda) & H_{12}(\lambda) \\ H_{21}(\lambda) & H_{22}(\lambda) \end{bmatrix} = \begin{bmatrix} A_{11}(\lambda) & A_{12}(\lambda) \\ A_{21}(\lambda) & A_{22}(\lambda) \end{bmatrix}^{-1} = \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1} A_{12} (A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} A_{21} A_{11}^{-1} & -A_{11}^{-1} A_{12} (A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} \\ -(A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} A_{21} A_{11}^{-1} & (A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} \end{bmatrix}.$$
Writing the causality in terms of the transfer function and noise covariance components,
$$f_{x_1 \to x_2}(\lambda) = \ln \frac{|H_{22}(\lambda) \Sigma_{22} H_{22}^{*}(\lambda) + H_{21}(\lambda) \Sigma_{11} H_{21}^{*}(\lambda)|}{|H_{22}(\lambda) \Sigma_{22} H_{22}^{*}(\lambda)|}.$$
Substituting in terms of the $A$ components from above,
$$f_{x_1 \to x_2}(\lambda) = \ln \frac{\left|(A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} \left\{\Sigma_{22} + A_{21} A_{11}^{-1} \Sigma_{11} (A_{11}^{-1})^{*} A_{21}^{*}\right\} \left((A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1}\right)^{*}\right|}{\left|(A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} \Sigma_{22} \left((A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1}\right)^{*}\right|},$$
where $|\Sigma|$ denotes the determinant of $\Sigma$. Because for square matrices the determinant of a product is equal to the product of the determinants, the determinants of all factors in both the numerator and the denominator separate. Consequently, the determinants of the first factors, $(A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1}$, and of the last factors, $\left((A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1}\right)^{*}$, cancel between the numerator and denominator.
This results in the final equality,
$$f_{x_1 \to x_2}(\lambda) = \ln \frac{\left|\Sigma_{22} + A_{21}(\lambda) A_{11}^{-1}(\lambda) \Sigma_{11} (A_{11}^{-1}(\lambda))^{*} A_{21}^{*}(\lambda)\right|}{|\Sigma_{22}|}.$$
We thus see that the receiver dynamics $A_{22}(\lambda)$ do not appear. Because the time-domain causality is the integral of the frequency-domain causality, the time-domain unconditional GG causality is independent of the receiver dynamics as well.
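This independence is easy to confirm numerically. In the sketch below (our own, with arbitrary stable parameters and diagonal noise covariance), the frequency-domain causality is computed through the full transfer function for two different receiver coefficients $A_{22}$, with identical results:

```python
import numpy as np

def gg_freq_causality(a11, a21, a22, s11, s22, lams):
    """Unconditional f_{x1 -> x2}(lam) for a bivariate VAR(1), diagonal noise covariance."""
    f = []
    for lam in lams:
        A_lam = np.eye(2) - np.array([[a11, 0.0], [a21, a22]]) * np.exp(-1j * lam)
        H = np.linalg.inv(A_lam)
        s2_total = s22 * np.abs(H[1, 1]) ** 2 + s11 * np.abs(H[1, 0]) ** 2
        f.append(np.log(s2_total / (s22 * np.abs(H[1, 1]) ** 2)))
    return np.array(f)

lams = np.linspace(0, np.pi, 256)
f_a = gg_freq_causality(0.7, 0.5, a22=0.2, s11=1.0, s22=1.0, lams=lams)
f_b = gg_freq_causality(0.7, 0.5, a22=0.9, s11=1.0, s22=1.0, lams=lams)
print(np.max(np.abs(f_a - f_b)))  # ~0: changing the receiver dynamics has no effect
```

The two causality spectra agree to machine precision even though the receiver's internal dynamics differ substantially.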

S7. Receiver Independence of Time-Domain GG Causality

The GG causality from node $i$ to node $j$ compares the one-step prediction-error variance of the full model, $\Sigma^{f}_{jj}$, with the one-step prediction-error variance of the reduced model, $\Sigma^{r(i)}_{jj}$,
$$F_{x_i \to x_j} = \ln \frac{|\mathrm{var}(x_{j,t} - \hat{x}^{r(i)}_{j,t})|}{|\mathrm{var}(x_{j,t} - \hat{x}^{f}_{j,t})|} = \ln \frac{|\Sigma^{r(i)}_{jj}|}{|\Sigma^{f}_{jj}|}.$$
The prediction-error covariance of the full model is just the input noise covariance, $\Sigma^{f} = \Sigma_w$, so the dependence of the GG causality on the VAR parameters arises through the prediction-error covariance of the reduced model. We do not have a closed-form expression for the reduced model with respect to the full-model parameters. In general, the reduced model can only be obtained numerically (25), which obscures the form of its dependence on the full-model VAR parameters. Instead, we use a state-space representation of the VAR to derive an expression for the reduced-model prediction-error variance in terms of the VAR parameters. From this, we will see that the receiver node VAR parameters do not enter into the equation for the reduced-model prediction-error variance, and thus the time-domain GG causality is independent of the receiver dynamics. This is true for both the unconditional and conditional cases. To put the model in state-space form, we write the VAR(P) full model as the augmented VAR(1) state equation,
$$\bar{x}_t = F \bar{x}_{t-1} + G w_t = \begin{bmatrix} A^{f}(1) & \cdots & A^{f}(P-1) & A^{f}(P) \\ & I & & 0 \end{bmatrix} \begin{bmatrix} x_{t-1} \\ \vdots \\ x_{t-P} \end{bmatrix} + \begin{bmatrix} I \\ 0 \end{bmatrix} w^{f}_t, \tag{S4}$$
where $\bar{x}_t = [x_t^{T} \; x_{t-1}^{T} \; \ldots \; x_{t-(P-1)}^{T}]^{T}$ is the augmented state. The observation equation is
$$y_t = H \bar{x}_t + v_t = \begin{bmatrix} (I \; 0) & 0 & \ldots & 0 \end{bmatrix} \begin{bmatrix} x_t \\ x_{t-1} \\ \vdots \\ x_{t-(P-1)} \end{bmatrix} = x_{j,t}, \tag{S5}$$
where $H$ is a selection matrix that selects the observed components from the top block of the state, the observation noise is zero, $v_t = 0$, and our convention is that the observed/effect components are ordered first in each block, $x_t = [x_{j,t}^{T} \; x_{i,t}^{T}]^{T}$.
The reduced-model prediction-error covariance is simply the observed components of the state prediction-error covariance,
$$\Sigma^{r(i)} = H \Sigma_{t+1 \mid t} H^{T}. \tag{S6}$$
The state prediction-error covariance and state (filtered)-error covariance are given recursively by the standard Kalman filter equations,
$$\Sigma_{t+1 \mid t} = F \Sigma_{t \mid t} F^{T} + G \Sigma_w G^{T} \tag{S7}$$
and
$$\Sigma_{t \mid t} = \Sigma_{t \mid t-1} - \Sigma_{t \mid t-1} H^{T} \left(H \Sigma_{t \mid t-1} H^{T} + R\right)^{-1} H \Sigma_{t \mid t-1}, \tag{S8}$$
where in this case the observation noise covariance is zero, $R = 0$. We designate the blocks of the prediction-error and state covariances by
$$\Sigma_{t+1 \mid t} = \begin{bmatrix} K_{t+1} & L_{t+1} \\ L_{t+1}^{T} & N_{t+1} \end{bmatrix} \quad \text{and} \quad \Sigma_{t \mid t} = \begin{bmatrix} B_t & C_t \\ C_t^{T} & D_t \end{bmatrix},$$
respectively. To see the VAR parameter dependence, we write out the state prediction-error equation, Eq. S7, by components. To demonstrate without loss of generality, we use VAR(2),
$$\Sigma_{t+1 \mid t} = \begin{bmatrix} A^{(1)} & A^{(2)} \\ I & 0 \end{bmatrix} \begin{bmatrix} B_t & C_t \\ C_t^{T} & D_t \end{bmatrix} \begin{bmatrix} A^{(1)T} & I \\ A^{(2)T} & 0 \end{bmatrix} + \begin{bmatrix} I \\ 0 \end{bmatrix} \Sigma_w \begin{bmatrix} I & 0 \end{bmatrix}. \tag{S9}$$
In this case, the blocks $K$, $L$, $N$, $B$, $C$, and $D$ all have dimensions $M \times M$, the same dimensions as the VAR matrices $A^{(p)}$, where $M$ is the number of system components. For notational simplicity, we have removed the superscript $f$ from the VAR matrices and moved the AR lag index to a superscript. Because the reduced components are directly observed, $y_t = x_{j,t}$, the associated state-error variances are zero, so the only nonzero components of the blocks are the covariances of the unobserved/omitted components,
$$B_t = \begin{pmatrix} 0 & 0 \\ 0 & \sigma^{2}_{(i)t,t \mid t} \end{pmatrix}, \quad C_t = \begin{pmatrix} 0 & 0 \\ 0 & \sigma^{2}_{(i)t,t-1 \mid t} \end{pmatrix}, \quad D_t = \begin{pmatrix} 0 & 0 \\ 0 & \sigma^{2}_{(i)t-1,t-1 \mid t} \end{pmatrix}.$$
Carrying out the matrix multiplication of Eq. S9 for the prediction-error covariance, we have
$$\Sigma_{t+1 \mid t} = \begin{bmatrix} K_{t+1} & L_{t+1} \\ L_{t+1}^{T} & N_{t+1} \end{bmatrix} = \begin{bmatrix} \Sigma_w + A^{(1)} B_t A^{(1)T} + A^{(2)} C_t^{T} A^{(1)T} + A^{(1)} C_t A^{(2)T} + A^{(2)} D_t A^{(2)T} & A^{(1)} B_t + A^{(2)} C_t^{T} \\ B_t A^{(1)T} + C_t A^{(2)T} & B_t \end{bmatrix}.$$
The top left block of $\Sigma_{t+1 \mid t}$ is
$$K_{t+1} = \Sigma_w + A^{(1)} B_t A^{(1)T} + A^{(2)} C_t^{T} A^{(1)T} + A^{(1)} C_t A^{(2)T} + A^{(2)} D_t A^{(2)T}$$
$$= \begin{pmatrix} \Sigma_{w_j} + A^{(1)}_{ji} \sigma^{2}_{t,t \mid t} A^{(1)T}_{ji} + A^{(2)}_{ji} \sigma^{2}_{t-1,t \mid t} A^{(1)T}_{ji} + A^{(1)}_{ji} \sigma^{2}_{t,t-1 \mid t} A^{(2)T}_{ji} + A^{(2)}_{ji} \sigma^{2}_{t-1,t-1 \mid t} A^{(2)T}_{ji} & A^{(1)}_{ji} \sigma^{2}_{t,t \mid t} A^{(1)T}_{ii} + A^{(2)}_{ji} \sigma^{2}_{t-1,t \mid t} A^{(1)T}_{ii} + A^{(1)}_{ji} \sigma^{2}_{t,t-1 \mid t} A^{(2)T}_{ii} + A^{(2)}_{ji} \sigma^{2}_{t-1,t-1 \mid t} A^{(2)T}_{ii} \\ A^{(1)}_{ii} \sigma^{2}_{t,t \mid t} A^{(1)T}_{ji} + A^{(2)}_{ii} \sigma^{2}_{t-1,t \mid t} A^{(1)T}_{ji} + A^{(1)}_{ii} \sigma^{2}_{t,t-1 \mid t} A^{(2)T}_{ji} + A^{(2)}_{ii} \sigma^{2}_{t-1,t-1 \mid t} A^{(2)T}_{ji} & \Sigma_{w_i} + A^{(1)}_{ii} \sigma^{2}_{t,t \mid t} A^{(1)T}_{ii} + A^{(2)}_{ii} \sigma^{2}_{t-1,t \mid t} A^{(1)T}_{ii} + A^{(1)}_{ii} \sigma^{2}_{t,t-1 \mid t} A^{(2)T}_{ii} + A^{(2)}_{ii} \sigma^{2}_{t-1,t-1 \mid t} A^{(2)T}_{ii} \end{pmatrix}. \tag{S10}$$
The upper right (and transpose of the lower left) block of $\Sigma_{t+1 \mid t}$ is
$$L_{t+1} = A^{(1)} B_t + A^{(2)} C_t^{T} = \begin{pmatrix} 0 & A^{(1)}_{ji} \sigma^{2}_{t,t \mid t} + A^{(2)}_{ji} \sigma^{2}_{t-1,t \mid t} \\ 0 & A^{(1)}_{ii} \sigma^{2}_{t,t \mid t} + A^{(2)}_{ii} \sigma^{2}_{t-1,t \mid t} \end{pmatrix}. \tag{S11}$$
The lower right block of $\Sigma_{t+1 \mid t}$ is
$$N_{t+1} = B_t = \begin{pmatrix} 0 & 0 \\ 0 & \sigma^{2}_{t,t \mid t} \end{pmatrix}. \tag{S12}$$
Stepping forward the Kalman filter with Eq. S8, the next state-error covariance is
$$\Sigma_{t+1 \mid t+1} = \Sigma_{t+1 \mid t} - \Sigma_{t+1 \mid t} H^{T} \left(H \Sigma_{t+1 \mid t} H^{T}\right)^{-1} H \Sigma_{t+1 \mid t}. \tag{S13}$$
Expanding, with $H \Sigma_{t+1 \mid t} H^{T} = K_{t+1,jj}$,
$$\begin{bmatrix} B_{t+1} & C_{t+1} \\ C_{t+1}^{T} & D_{t+1} \end{bmatrix} = \begin{bmatrix} K_{t+1} & L_{t+1} \\ L_{t+1}^{T} & N_{t+1} \end{bmatrix} - \begin{bmatrix} K_{t+1} & L_{t+1} \\ L_{t+1}^{T} & N_{t+1} \end{bmatrix} H^{T} \left(K_{t+1,jj}\right)^{-1} H \begin{bmatrix} K_{t+1} & L_{t+1} \\ L_{t+1}^{T} & N_{t+1} \end{bmatrix}$$
$$= \begin{bmatrix} \begin{pmatrix} 0 & 0 \\ 0 & K_{t+1,ii} - K_{t+1,ij} K_{t+1,jj}^{-1} K_{t+1,ji} \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & L_{t+1,ii} - K_{t+1,ij} K_{t+1,jj}^{-1} L_{t+1,ji} \end{pmatrix} \\ \begin{pmatrix} 0 & 0 \\ 0 & L_{t+1,ii}^{T} - L_{t+1,ji}^{T} K_{t+1,jj}^{-1} K_{t+1,ji} \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & N_{t+1,ii} - L_{t+1,ji}^{T} K_{t+1,jj}^{-1} L_{t+1,ji} \end{pmatrix} \end{bmatrix}.$$
We see that the state-covariance structure is preserved, as expected, and that the covariances of the unobserved components are given iteratively by
$$\begin{aligned} \sigma^{2}_{(i)t+1,t+1 \mid t+1} &= K_{t+1,ii} - K_{t+1,ij} K_{t+1,jj}^{-1} K_{t+1,ji} \\ \sigma^{2}_{(i)t+1,t \mid t+1} &= L_{t+1,ii} - K_{t+1,ij} K_{t+1,jj}^{-1} L_{t+1,ji} \\ \sigma^{2}_{(i)t,t \mid t+1} &= N_{t+1,ii} - L_{t+1,ji}^{T} K_{t+1,jj}^{-1} L_{t+1,ji}. \end{aligned} \tag{S14}$$
The state prediction- and filtered-error covariances of Eqs. S7 and S8 have thus been consolidated into Eqs. S10–S12 and S14. From this set of equations, we see that the parameters determining the receiver dynamics, $\{A^{(p)}_{jj}\}$, do not appear. These equations can be computed recursively, and in steady state they form a set of nonlinear equations whose solution determines the reduced-model prediction-error variance. Hence, because the $\{A^{(p)}_{jj}\}$ parameters do not appear, the reduced-model prediction-error variance and the time-domain causality are, as seen before in the unconditional frequency domain, independent of the receiver dynamics. In fact, we see that receiver independence holds for the conditional case as well and that the causality is further independent of the dynamics of the conditional nodes. This is because any conditional time series $x_k$ would be included with $x_j$ as directly observed components of the reduced model. We thus find that the causality is only a function, albeit a complicated one, of the channel parameters $\{A^{(p)}_{ji}\}$, the transmitter parameters $\{A^{(p)}_{ii}\}$, and the input noise covariance $\Sigma_w$.
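The recursion above can be implemented directly. The sketch below (our own, for a bivariate VAR(1) with hypothetical parameters) iterates the Kalman equations with zero observation noise to a steady state, forms the time-domain causality, and confirms numerically that changing the receiver coefficient leaves it unchanged:

```python
import numpy as np

def gg_time_causality(a_jj, a_ji, a_ii, Sigma_w, n_iter=500):
    """Time-domain F_{i -> j} for a bivariate VAR(1), receiver x_j ordered first.

    The reduced-model prediction-error variance is obtained from the Kalman
    recursion (Eqs. S6-S8) with R = 0, observing only x_j.
    """
    F = np.array([[a_jj, a_ji],
                  [0.0,  a_ii]])          # state matrix (VAR(1), so G = I)
    H = np.array([[1.0, 0.0]])            # observe x_j only
    P_filt = np.zeros((2, 2))
    for _ in range(n_iter):
        P_pred = F @ P_filt @ F.T + Sigma_w          # Eq. S7
        s = (H @ P_pred @ H.T)[0, 0]                 # reduced one-step variance (Eq. S6)
        K = P_pred @ H.T / s
        P_filt = P_pred - K @ H @ P_pred             # Eq. S8 with R = 0
    return np.log(s / Sigma_w[0, 0])

Sigma_w = np.eye(2)
F1 = gg_time_causality(a_jj=0.2, a_ji=0.5, a_ii=0.8, Sigma_w=Sigma_w)
F2 = gg_time_causality(a_jj=0.9, a_ji=0.5, a_ii=0.8, Sigma_w=Sigma_w)
F0 = gg_time_causality(a_jj=0.5, a_ji=0.0, a_ii=0.8, Sigma_w=Sigma_w)
print(F1, F2, F0)  # F1 and F2 agree (receiver independence); F0 is 0 (no connection)
```

Varying the receiver coefficient from 0.2 to 0.9 leaves the causality unchanged, while removing the channel drives it to zero, matching the analytical result.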

S8. Additional Figures and Table for Example 1

Table S1 and Figs. S1–S4 provide additional information about the VAR parameter and conditional frequency-domain causality estimates of the VAR(3) three-node series system in example 1. Table S1 shows the mean and SD of the full-model VAR parameter estimates for orders 3, 6, and 20. The data show that the parameters are estimated appropriately, suggesting that the bias–variance trade-off and the separate model mismatch problems described in the main text are not artifacts of poor estimation of the VAR parameters. Fig. S1 shows the mean values of several order selection criteria (the Akaike information criterion, red; the Hannan–Quinn criterion, blue; and the Schwarz criterion, green) for various model orders. On average, all three criteria were minimized at the true model order 3. This shows that, for the system in example 1, the true system model order would have been identified by any of these selection criteria. Despite this, as we demonstrate in the main text, the resulting causality estimates are biased. Fig. S2 shows the causality estimates for all cause–effect pairs from example 1, using the true model order 3. Figs. S3 and S4 show the estimates obtained using the increased orders 6 and 20, respectively, and complement Fig. 2 in the main text by showing the results for all possible connections between nodes. Comparing Figs. S3 and S4 again shows the change in the distributions as the order is increased, reflecting the bias associated with using the true model order. At the same time, increasing the order increases the "noisiness" of the individual-realization estimates, owing to the additional spectral peaks in the higher-order models.

S9. Permutation Significance for Noncausality

Significance levels for the conditional causality estimates are estimated by permutation (32). In example 1, we generate 1,000 realizations of the system. For each cause–effect pair (from $i$ to $j$ conditional on $k$), 1,000 permuted realizations are formed by randomly selecting realization indexes $I_i$ and $I_{jk}$ from a uniform distribution and recombining the $I_i$th realization of $\{x_{i,t}\}$ with the $I_{jk}$th realization of $\{x_{\{j,k\},t}\}$ to generate a permuted realization. The conditional frequency-domain causality is estimated for each permuted realization, and the maximal value over all frequencies is identified. The 95% significance level over all frequencies is taken as the 95th percentile of the 1,000 maximal values; that is, it is the level below which the maxima of 95% of the null-causality permutations fall. This significance level is against the null hypothesis of no difference between the causality spectrum and that of a noncausal system. It tests the whole causality waveform against the null hypothesis of no causal connection, and no assessment of significance for particular frequencies, despite the presence of peaks or valleys, can be made. The procedure can be applied to particular frequency bands of interest by identifying the 95th-percentile level within the band, but if multiple bands are tested, the appropriate corrections for multiple comparisons must be applied.
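The max-over-frequency permutation scheme can be sketched generically. The toy example below is our own: it uses segment-averaged magnitude-squared coherence as a stand-in for the conditional causality spectrum (the thresholding logic is the same for any frequency-resolved statistic), and the coupling strength, segment length, and number of permutations are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def max_coherence(x, y, n_seg=8):
    """Max over frequency of segment-averaged magnitude-squared coherence.

    Stand-in for the 'maximal causality value over all frequencies'.
    """
    xs = np.fft.rfft(x.reshape(n_seg, -1), axis=1)
    ys = np.fft.rfft(y.reshape(n_seg, -1), axis=1)
    num = np.abs((xs * ys.conj()).sum(axis=0)) ** 2
    den = (np.abs(xs) ** 2).sum(axis=0) * (np.abs(ys) ** 2).sum(axis=0)
    return np.max(num / den)

# 100 realizations of 1,024 samples; y is strongly driven by x.
n_real, n_time = 100, 1024
X = rng.standard_normal((n_real, n_time))
Y = X + 0.1 * rng.standard_normal((n_real, n_time))

# Observed statistic for one matched cause-effect pair.
observed = max_coherence(X[0], Y[0])

# Null distribution: recombine mismatched realizations (indexes forced unequal).
null_max = []
for _ in range(200):
    a = rng.integers(n_real)
    b = (a + 1 + rng.integers(n_real - 1)) % n_real   # guarantees b != a
    null_max.append(max_coherence(X[a], Y[b]))
threshold = np.quantile(np.array(null_max), 0.95)     # 95th percentile of maxima
print(observed, threshold)
```

The observed maximum for the truly coupled pair far exceeds the permutation threshold, while mismatched pairings produce only the chance-level maxima that define the null distribution.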

Acknowledgments P.A.S. was supported by Training Grant T32 HG002295 from the National Human Genome Research Institute. P.L.P. was supported by the National Institutes of Health Director’s New Innovator Award DP2-OD006454.

Footnotes Author contributions: P.A.S. and P.L.P. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. A.P.G. is a guest editor invited by the Editorial Board.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1704663114/-/DCSupplemental.