Auto-correlations

Before we decide which model to use, we need to look at auto-correlations.

Autocorrelation correlogram. Seasonal patterns in a time series can be examined via correlograms, which display the autocorrelation function (ACF) both graphically and numerically. The auto-correlation plots in pandas plotting and statsmodels graphics standardize the data before computing the auto-correlation: they subtract the mean and divide by the standard deviation of the data.

By standardizing, these libraries assume that your data were generated by a Gaussian law (with a certain mean and standard deviation). This may not be the case in reality.
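To make the standardization step concrete, here is a minimal, dependency-free sketch of a sample autocorrelation. It is not the pandas or statsmodels implementation; the helper name `sample_acf` and the toy series are assumptions made for illustration only.

```python
def sample_acf(series, max_lag):
    """Autocorrelation of `series` for lags 0..max_lag (hypothetical helper)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]          # subtract the mean
    variance = sum(c * c for c in centered) / n    # squared standard deviation
    acf = []
    for lag in range(max_lag + 1):
        # average product of centered values that are `lag` steps apart
        cov = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        acf.append(cov / variance)                 # scale so that lag 0 equals 1
    return acf


# Toy series with a period of 4, chosen only to make the seasonal pattern visible.
series = [1.0, 3.0, 5.0, 3.0] * 6
for lag, value in enumerate(sample_acf(series, 8)):
    print(f"lag {lag}: {value:+.3f}")
```

For this toy series the values at lags 4 and 8 are positive and the values at lags 2 and 6 are negative, which is the kind of seasonal signature a correlogram makes visible.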

Correlation is sensitive. Both of these functions (matplotlib and pandas plotting) have their drawbacks. The figure generated by the following code using matplotlib will be identical to the figure generated by pandas plotting or statsmodels graphics.

Partial autocorrelations. Another useful way to examine serial dependencies is the partial autocorrelation function (PACF), an extension of autocorrelation in which the dependence on the intermediate elements (those within the lag) is removed.
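One standard way to obtain partial autocorrelations from autocorrelations is the Durbin-Levinson recursion. The sketch below is an illustration under that assumption, not the method used by any particular library; the function name `pacf_from_acf` is hypothetical. As input it uses the theoretical ACF of an AR(1) process with coefficient 0.6, which is 0.6 raised to the power of the lag.

```python
def pacf_from_acf(acf, max_lag):
    """Partial autocorrelations for lags 0..max_lag via Durbin-Levinson."""
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        # part of the lag-k correlation already explained by shorter lags
        explained = sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        scale = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = (acf[k] - explained) / scale
        # update the AR(k) coefficients for the next iteration
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.6 (illustrative input).
ar1_acf = [0.6 ** lag for lag in range(6)]
ar1_pacf = pacf_from_acf(ar1_acf, 5)
for lag, (r, p) in enumerate(zip(ar1_acf, ar1_pacf)):
    print(f"lag {lag}: ACF {r:.3f}  PACF {p:.3f}")
```

The ACF decays gradually while the PACF is 0.6 at lag 1 and numerically zero afterwards: once the lag-1 dependence is removed, nothing is left at longer lags.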

Once we determine the nature of the auto-correlations we use the following rules of thumb.

Rule 1: If the ACF shows exponential decay, and the PACF has a spike at lag 1 and no correlation for other lags, then use one autoregressive (p) parameter.

Rule 2: If the ACF shows a sine-wave shape pattern or a set of exponential decays, and the PACF has spikes at lags 1 and 2 and no correlation for other lags, then use two autoregressive (p) parameters.

Rule 3: If the ACF has a spike at lag 1 and no correlation for other lags, and the PACF damps out exponentially, then use one moving average (q) parameter.

Rule 4: If the ACF has spikes at lags 1 and 2 and no correlation for other lags, and the PACF has a sine-wave shape pattern or a set of exponential decays, then use two moving average (q) parameters.

Rule 5: If the ACF shows exponential decay starting at lag 1, and the PACF shows exponential decay starting at lag 1, then use one autoregressive (p) and one moving average (q) parameter.
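The rules speak of "spikes" and "no correlation" without saying how to tell them apart. One common convention, which is an assumption here and not something the rules themselves specify, is to compare each value with the approximate 95% band of plus or minus 1.96 divided by the square root of the number of observations, which is what sample autocorrelations of white noise stay within about 95% of the time. The helper name `significant_lags` and the input values below are made up for illustration.

```python
import math


def significant_lags(correlations, n_obs, z=1.96):
    """Lags (excluding lag 0) whose correlation falls outside +/- z / sqrt(n_obs)."""
    band = z / math.sqrt(n_obs)
    return [lag for lag, r in enumerate(correlations)
            if lag > 0 and abs(r) > band]


# Made-up values for lags 0..4, as if computed from 400 observations.
acf_values = [1.0, 0.45, 0.03, -0.02, 0.01]
print(significant_lags(acf_values, n_obs=400))  # band is 1.96 / 20 = 0.098
```

With these illustrative values only lag 1 falls outside the band, which is the "spike at lag 1, no correlation for other lags" pattern that Rules 1 and 3 refer to.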

Removing serial dependency. Serial dependency for a particular lag can be removed by differencing the series. There are two major reasons for such transformations.