Causal relationship based on instantaneous phase dependency

We define the cause–effect relationship between Time Series A and Time Series B according to the fundamental criterion of causal assessment proposed by Galilei1: cause is that which put, the effect follows; and removed, the effect is removed; thus, variable A causes variable B if the instantaneous phase dependency between A and B is diminished when the intrinsic component in B that is causally related to A is removed from B itself, but not vice versa.

$${\mathrm{Coh}}\left( {A,B\prime } \right) < {\mathrm{Coh}}\left( {A,B} \right)\sim {\mathrm{Coh}}\left( {A\prime ,B} \right)$$ (1)

where Coh denotes the instantaneous phase dependency (i.e., coherence) between the intrinsic components of two time series, and the accent mark represents the time series where the intrinsic components relevant to cause effect dynamics were removed. The realisation of this definition requires two key treatments of the time series. First, the time series must be decomposed into intrinsic components to recover the cause–effect relationship at a specific time scale and instantaneous phase. Second, a phase coherence measurement is required to measure the instantaneous phase dependency between the intrinsic components decomposed from cause–effect time series.

Empirical mode decomposition

To achieve this, we decompose a time series into a finite number of IMFs by using the ensemble EMD14,15,16 technique. Ensemble EMD is an adaptive decomposition method originated from EMD (i.e., the core of Hilbert–Huang Transform) for separating different modes of frequency and amplitude modulations in the time domain14,15.

Briefly, EMD is implemented through a sifting process to decompose the original time-series data into a finite set of IMFs. The sifting process comprises the following steps: (1) connecting the local maxima or minima of a targeted signal to form the upper and lower envelopes by natural cubic spline lines; (2) extracting the first prototype IMF by estimating the difference between the targeted signal and the mean of the upper and lower envelopes; and (3) repeating these procedures to produce a set of IMFs that were represented by a certain frequency–amplitude modulation at a characteristic time scale. The decomposition process is completed when no more IMFs could be extracted, and the residual component is treated as the overall trend of the raw data. Although IMFs are empirically determined, they remain orthogonal to one another, and may therefore contain independent physical meanings15,37.

The IMF decomposed from EMD enables us to use Hilbert transform to derive physically meaningful instantaneous phase and frequency14,29. For each IMF, they represent narrow-band amplitude and frequency-modulated signal S(t), and can be expressed as

$$S\left( t \right) = A\left( t \right){{\cos}}\emptyset \left( t \right)$$ (2)

where instantaneous amplitude A and phase ∅ can be calculated by applying the Hilbert transform, defined as S H = \(\frac{1}{\pi }{\int} {\frac{{S(t\prime )}}{{t - t\prime }}{\mathrm{d}}t\prime }\); A(t) = \(\sqrt {S^2\left( t \right) + S_H^2(t)}\); and ∅(t) = \({\mathrm{arctan}}\left( {{\textstyle{{S_H\left( t \right)} \over {S\left( t \right)}}}} \right)\). The instantaneous frequency is then calculated as the derivative of the phase function ω(t) = d∅(t)/dt.

Thus, the original signal X can be expressed as the summation of all IMFs and residual r,

$$X(t) = \mathop {\sum}

olimits_{j = 1}^k {A_j\left( t \right) \, {{\exp}}\left( {i\mathop {\scriptstyle\int }\omega_{j}(t){\mathrm{d}}t} \right) + r}$$ (3)

where k is the total number of IMFs, A j (t) is the instantaneous amplitude of each IMF; and ω j (t) is the instantaneous frequency of each IMF. Previous literature have shown that IMFs derived with EMD can be used to delineate time dependency38 or phase dependency37,39,40,41,42 in nonlinear and non-stationary data.

The ensemble EMD15,16,43 is a noise-assisted data analysis method to further improve the separability of IMFs during the decomposition and defines the true IMF components S j (t) as the mean of an ensemble of trials, each consisting of the signal plus white noise of a finite amplitude.

$$S_j\left( t \right) = \lim _{{\mathrm{N}} \to \infty }\mathop {\sum}

olimits_{k = 1}^N {\left\{ {S_j\left( t \right) + r \times w_k(t)} \right\}} $$ (4)

where w k (t) is the added white noise, and k is the kth trial of the jth IMF in the noise-added signal. The magnitude of the added noise r is critical to determining the separability of the IMFs (i.e., r is a fraction of a standard deviation of the original signal). The number of trials in the ensemble N must be large so that the added noise in each trial is cancelled out in the ensemble mean of large trials (N = 1000 in this study). The purpose of the added noise in the ensemble EMD is to provide a uniform reference frame in the time–frequency space by projecting the decomposed IMFs onto comparable scales that are independent of the nature of the original signals. With the ensemble EMD method, the intrinsic oscillations of various time scales can be separated from nonlinear and non-stationary data with no priori criterion on the time–frequency characteristics of the signal. Hence, the use of ensemble EMD could complement the constraints of separability in Granger’s paradigm44 and potentially capture simultaneous causal relationships not accounted for by predictive causality methods.

Orthogonality and separability of IMFs

Because r is the only parameter involved in the causal-decomposition analysis, the strategy of selecting r is to maximise the separability while maintaining the orthogonality of the IMFs, thereby avoiding spurious causal detection resulting from poor separation of a given signal. We calculated the nonorthogonal leakage14 and root-mean-square (RMS) of the pairwise correlations of the IMFs for each r with an increment of 0.05 in the uniform space between 0.05 and 1. A general guideline for selecting r in this study is to minimise the RMS of the pairwise correlations of the IMFs (ideally under 0.05) while maintaining the nonorthogonal leakage also under 0.05.

Phase coherence

Next, the Hilbert transform is applied to calculate the instantaneous phase of each IMF and to determine the phase coherence between the corresponding IMFs of two time series18. For each corresponding pair of IMFs from the two time series, denoted as S 1j (t) and S 2j (t), and can be expressed as

$${S}_{1j} (t) = {A}_{1j} (t){{\cos}}{\emptyset}_{1j}(t) \, {\mathrm{and}} \, {S}_{2j}(t) = {A}_{2j}(t) {{\cos}}{\emptyset}_{2j} {(t)},$$ (5)

where A 1j , ∅ 1j can be calculated by applying the Hilbert transform, defined as \(S_{1jH} = \frac{1}{\pi }{\int} {\frac{{S_{1j}(t\prime )}}{{t - t^\prime }}{\mathrm{d}}t\prime }\), and \(A_{1j}\left( t \right) = \sqrt {S_{1j}^2(t) + S_{1jH}^2(t)} \), and ∅ 1j (t) = arctan\(\left( {\frac{{S_{1{\mathrm{j}}H}\left( t \right)}}{{S_{1j}\left( t \right)}}} \right)\); and similarly applied for S 2jH , A 2j , and ∅ 2j . The instantaneous phase difference is simply expressed as ∆ ∅ 12j (t) = ∅ 2j (t)∅ 1j (t). If two signals are highly coherent, then the phase difference is constant; otherwise, it fluctuates considerably with time. Therefore, the instantaneous phase coherence Coh measurement can be defined as

$${\mathrm{Coh}}\left( {S_{1j},S_{2j}} \right) = \frac{1}{T}\left| {{\int}_{\hskip-5pt 0}^T {e^{i\Delta \emptyset _{12j}(t)}{\mathrm{d}}t} } \right|$$ (6)

Note that the integrand (i.e., \(e^{i\Delta \emptyset _{12{\mathrm{j}}}(t)}\)) is a vector of unit length on the complex plane, pointing toward the direction which forms an angle of \(\Delta \emptyset _{12j}(t)\) with the +x axis. If the instantaneous phase difference varies little over the entire signal, then the phase coherence is close to 1. If the instantaneous phase difference changes markedly over the time, then the coherence is close to 0, resulting from adding a set of vectors pointing in all possible directions. This phase coherence definition allows the instantaneous phase dependency to be calculated without being subjected to the effect of time lag between cause and effect (i.e., the time precedence principle), thus avoiding the constraints of time lag in predictive causality methods10.

Causal decomposition between two time series

With the decomposition of the signals by ensemble EMD and measurement of the instantaneous phase coherence between the IMFs, the most critical step in the causal-decomposition analysis is again based on Galilei’s principle: the removal of an IMF followed by redecomposition of the time series (i.e., the decomposition and redecomposition procedure). If the phase dynamic of an IMF in a target time series is influenced by the source time series, removing this IMF in the target time series (i.e., subtract an IMF from the original target time series) with redecomposition into a new set of IMFs results in the redistribution of phase dynamics into the emptied space of the corresponding IMF. Furthermore, because the causal-related IMF is removed, redistribution of the phase dynamics into the corresponding IMF would be exclusively from the intrinsic dynamics of the target time series, which is irrelevant to the dynamics of the source time series, thus reducing the instantaneous phase coherence between the paired IMFs of the source time series and redecomposed target time series. By contrast, this phenomenon does not occur when a corresponding IMF is removed from the source time series because the dynamics of that IMF are intrinsic to the source time series and removal of that IMF with redecomposition would still preserve the original phase dynamics from the other IMFs. Therefore, this decomposition and redecomposition procedure enables quantifying the differential causality between the corresponding IMFs of two time series.

Because each IMF represents a dynamic process operating at a distinct time scale, we treat the phase coherence between the paired IMFs as the coordinates in a multidimensional space, and quantify the variance-weighted Euclidean distance between the phase coherence of the paired IMFs decomposed from the original signals as well as the paired original and redecomposed IMFs, which are expressed as follows:

$$\begin{array}{l}D\left( {S_{1j} \to S_{2j}} \right) = \left\{ {\mathop {\sum}

olimits_{j = 1}^m {W_j\left[ {{\mathrm{Coh}}\left( {S_{1j},S_{2j}} \right) - {\mathrm{Coh}}\left( {S_{1j},S_{2j}^\prime } \right)} \right]^2} } \right\}^{\frac{1}{2}}\\ D\left( {S_{2j} \to S_{1j}} \right) = \left\{ {\mathop {\sum}

olimits_{j = 1}^m {W_j\left[ {{\mathrm{Coh}}\left( {S_{1j},S_{2j}} \right) - {\mathrm{Coh}}\left( {S_{1j}^\prime ,S_{2j}} \right)} \right]} ^2} \right\}^{\frac{1}{2}}\\ W_j = \left( {{\mathrm{Var}}_{1j} \times {\mathrm{Var}}_{2j}} \right){\mathrm{/}}\mathop {\sum}

olimits_{j = 1}^m {\left( {{\mathrm{Var}}_{1j} \times Var_{2j}} \right)} \end{array}$$ (7)

The range of D represents the level of absolute causal strength and is between 0 and 1. The relative causal strength between IMF S 1j and S 2j can be quantified as the relative ratio of absolute cause strength \(D\left( {S_{1j} \to S_{2j}} \right)\) and \(D\left( {S_{2j} \to S_{2j}} \right)\), expressed as follows:

$$\begin{array}{l}C\left( {S_{1j} \to S_{2j}} \right) = D\left( {S_{1j} \to S_{2j}} \right) \Big/ \left[ {D\left( {S_{1j} \to S_{2j}} \right) + D\left( {S_{2j} \to S_{1j}} \right)} \right]\\ C\left( {S_{2j} \to S_{1j}} \right) = D\left( {S_{2j} \to S_{1j}} \right)\Big/\left[ {D\left( {S_{1j} \to S_{2j}} \right) + D\left( {S_{2j} \to S_{1j}} \right)} \right].\end{array}$$ (8)

This decomposition and redecomposition procedure is repeated for each paired IMF to obtain the relative causal strengths at each time scale, where a ratio of 0.5 indicates either that there is no causal relationship or equal causal strength in the case of reciprocal causation, and a ratio toward 1 or 0 indicates a strong differential causal influence from one time series to another. To avoid a singularity when both \(D\left( {S_{1j} \to S_{2j}} \right)\) and \(D\left( {S_{2j} \to S_{1j}} \right)\) approach zero (i.e., no causal change in phase coherence with the redecomposition procedure), D + 1 is used to calculate the relative causal strength when both absolute causal strength D values are <0.05.

In summary, causal decomposition comprises the following three key steps: (1) decomposition of a pair of time series A and B into two sets of IMFs (e.g., IMFs A and IMFs B) and determining the instantaneous phase coherence between each paired IMFs; (2) removing an IMF in a given time series (e.g., time series A), performing the redecomposition procedure to generate a new set of IMFs (IMF A′) and recalculating the instantaneous phase coherence between the original IMFs (IMFs B) and redecomposed IMFs (IMFs A′); and (3) determining the absolute and relative causal strength by estimating the deviation of phase coherence from the phase coherence of the original time series (IMFs A vs. IMFs B) to either of the redecomposed time series (e.g., IMFs A′ vs. IMF B).

Validation of causal strength

To validate the causal strength, a leave-one-sample-out cross-validation is performed for each causal-decomposition test. Briefly, we delete a time point for each leave-one-out test and obtain a distribution of causal strength for all runs where the total number of time points is <100, or a maximum of 100 random leave-one-out tests where the total number of time points was higher than 100. A median value of causal strength is observed.

Deterministic and stochastic model data

The deterministic model was used in accordance with Sugihara et al.5 based on a coupled two-species nonlinear logistic difference system, expressed as follows (initial value x(1) = 0.2, and y(1) = 0.4):

$$\begin{array}{l}x\left( {t + 1} \right) = x\left( t \right)\left[ {3 . 8 - 3 . 8x\left( t \right) - 0 . 02y\left( t \right)} \right]\\ y\left( {t + 1} \right) = y\left( t \right)\left[ {3 . 5 - 3 . 5y\left( t \right) - 0 . 1x\left( t \right)} \right]\end{array}$$ (9)

For the stochastic model, we used part of the example shown in Ding et al.10 for Granger causality, which is expressed as follows (using a random number as the initial value).

$$\begin{array}{l}x\left( {t + 1} \right) = 0 . 95\sqrt 2 x\left( t \right) - 0 . 9025x\left( {t - 1} \right) + w_1(t)\\ y\left( {t + 1} \right) = 0 . 5x\left( {t - 1} \right) + w_2(t)\end{array}$$ (10)

Ecological data and validation

We assessed the causality measures in both modelled and actual predator and prey systems. The Lotka Volterra predator–prey model21,22 is expressed as follows:

$${\mathrm{d}}x/{\mathrm{d}}t = \alpha x - \beta xy \\ {\mathrm{d}}y/{\mathrm{d}}t = \delta xy - \gamma y$$ (11)

where x and y denote the prey and the predator, respectively (α = 1, β = 0.05, δ = 0.02, γ = 0.5 were used in this study).

Experimental data on Paramecium and Didinium are available online45, and these were obtained by scanning the graphics in Veilleux17 and digitising the time series. Wolf and moose field data are available online at the United States Isle Royale National Park23. The lynx and hare data were reconstructed from fur trading records obtained from Hudson’s Bay Company24. The benchmark time series46 was reconstructed from various sources in two periods (the 1844–1904 data were reconstructed from fur records, whereas the 1905–1935 data were derived from questionnaires)24. We used the fur-record time series between the year 1900 and 1920 for illustrative purposes.

Comparison with other causality methods

We compared causal decomposition with CCM, Granger’s causality, and MIME method20. The detail of the calculation of CCM5, Granger causality10, and MIME20 has been documented in the literature. Of note, both the CCM and Granger causality involve the selection of lag order. In this paper, the lag order (i.e., embedding dimension) of 3 was chosen for the application of CCM method to the ecosystem data5, and the lag order in the Granger causality was selected by the Bayesian information Criterion. The MIME is an entropy-based causality method which also employs the time precedence principle47 and is equivalent to Granger’s causality in certain conditions48.

Code availability

The source code for the causal-decomposition analysis (including ensemble EMD (http://rcada.ncu.edu.tw/research1.htm)) is implemented in Matlab (Mathworks Inc., Natick, MA, USA), and the current version (causal-decomposition-analysis-v1.0) or any future versions of the codes will be available at GitHub.

Data availability

The Didinium and Paramecium data that support the findings of this study are available in http://robjhyndman.com/tsdldata/data/veilleux.dat. Wolf and moose field data are available online at the United States Isle Royale National Park. Lynx and hare data are available online at https://github.com/bblais/Systems-Modeling-Spring-2015-Notebooks/tree/master/data/Lynx%20and%20Hare%20Data.