1. Determining a smooth latent variable

SFA (Slow Feature Analysis) is an unsupervised learning method that extracts the smoothest (slowest) underlying functions, or features, from a time series. It can be used for dimensionality reduction, regression, and classification. For example, a highly erratic series may be driven by a much better-behaved latent variable.

Let’s start by generating time series D and S:
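The original code is not shown here, so the following is a sketch under stated assumptions: a slowly varying driving force D (the sine-squared form below is an assumption) modulates the parameter of a logistic map, producing the erratic series S.

```python
import numpy as np

n = 300
t = np.linspace(0, 2 * np.pi, n)

# Slow driving force D -- the exact functional form in the original
# example is an assumption here; any slowly varying signal in [0, 1] works.
D = np.sin(t) ** 2

# Erratic series S: a logistic map whose parameter r is modulated by D,
# keeping r inside the chaotic regime [3.6, 4.0].
S = np.empty(n)
S[0] = 0.6
for i in range(1, n):
    r = 3.6 + 0.4 * D[i - 1]
    S[i] = r * S[i - 1] * (1 - S[i - 1])
```

Because r stays at or below 4 and S starts inside (0, 1), the series remains bounded in [0, 1] while jumping around chaotically.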

This is known as the logistic map. Plotting the series S reveals its chaotic nature. The underlying time series D that drives the behavior of the curve above is much simpler:

How can we determine the simple, underlying driving force from the erratic time series?

We can use SFA to determine the most slowly varying features of a function. In our case, we would start off with data like S and end up with D, without necessarily knowing beforehand how S is generated.

Standard implementations of SFA find features that are linear in the input. But as our example shows, the dependence on the driving force D is highly non-linear! This can be remedied by first applying a non-linear expansion to the time series S and then finding linear features of the expanded data. In doing so, we find non-linear features of the original data.

Let’s create a new multivariate time series by stacking time delayed copies of S on it:
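One way to build this stacked series is a sliding-window (time-delay) embedding; the window length of 4 below matches the 4-dimensional vectors used in the cubic expansion that follows. The helper name is my own:

```python
import numpy as np

def time_delay_embed(x, dim):
    """Stack dim - 1 time-delayed copies of a 1-D series onto it.

    Returns an array of shape (len(x) - dim + 1, dim), where each
    row holds dim consecutive values of x.
    """
    n = len(x) - dim + 1
    return np.column_stack([x[i:i + n] for i in range(dim)])
```

Applied to a series of length 300 with `dim=4`, this yields a multivariate series of shape (297, 4), which is why three samples are lost.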

Next we do a cubic expansion of the data and extract the SFA features. A cubic expansion maps a 4-dimensional vector [a, b, c, d]ᵀ to the 34-dimensional vector of all monomials of degree one, two, and three in its components: terms of the form t, t², t³, tu, t²u, and tuv, where t, u, v ∈ {a, b, c, d} are distinct.
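The article does not name a particular library, so as a sketch: scikit-learn's `PolynomialFeatures` with degree 3 and no bias term produces exactly the 34 monomials for 4-dimensional input, and a minimal linear SFA can then whiten the expanded data and pick the directions whose discrete time derivative has the smallest variance.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

def sfa(X, n_features=1):
    """Minimal linear SFA sketch: whiten X, then return the
    projections whose discrete time derivative varies least."""
    X = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    keep = vals > 1e-8 * vals.max()                   # drop near-singular directions
    Z = X @ (vecs[:, keep] / np.sqrt(vals[keep]))     # whitened data
    dZ = np.diff(Z, axis=0)                           # discrete time derivative
    _, dvecs = np.linalg.eigh(np.cov(dZ, rowvar=False))
    return Z @ dvecs[:, :n_features]                  # slowest features first

# Usage on the (297, 4) embedded series from the previous step
# (variable names are assumptions):
# expanded = PolynomialFeatures(3, include_bias=False).fit_transform(windows)
# slow = sfa(expanded, n_features=1)                  # shape (297, 1)
```

Because `np.linalg.eigh` returns eigenvalues in ascending order, the first columns of `dvecs` correspond to the smallest derivative variance, i.e. the slowest features.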

Keep in mind that the best number of time-delayed copies to add varies from problem to problem. Conversely, if the original data is too high-dimensional, dimensionality reduction is needed instead, for example with Principal Component Analysis.

The hyperparameters of the method are thus: the method of dimensionality expansion (or reduction), the dimension after expansion (or reduction), and the number of slow features to extract.

Now, after adding the time-delayed copies, the length of the time series dropped from 300 to 297, so the slow feature time series has length 297 as well. For nicer visualization, we bring it back to length 300 by prepending the first value once and appending the last value twice. The features found by SFA have zero mean and unit variance, so we normalize D the same way before visualizing the results.
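As a sketch, the padding and the normalization of D might look like this (the helper names are my own):

```python
import numpy as np

def pad_to_input_length(y):
    """Bring a length-(n - 3) feature series back to length n by
    prepending the first value once and appending the last value twice."""
    return np.concatenate([y[:1], y, y[-1:], y[-1:]])

def normalize(x):
    """Zero mean, unit variance -- matching the scaling of SFA outputs."""
    return (x - x.mean()) / x.std()
```

Edge padding is purely cosmetic here; it only aligns the two curves on the same x-axis for plotting.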

Even with only 300 data points, the SFA features manage to almost completely recover the underlying source, which is quite impressive!