Weekend of a Data Scientist is a series of articles about cool stuff I care about. The idea is to spend the weekend learning something new, reading, and coding.

Hi! I just want to share some thoughts about evaluating Machine Learning models when you are working with Time Series data.

When it comes to evaluating your model, there are straightforward steps you can take; sklearn.model_selection has pretty much everything you may need (train-test split, k-fold CV, etc.).

There is a video from PyCon 2016 that I really like, Jake Vanderplas — Statistics for Hackers; it covers the model evaluation problem with great examples.

But, just a few words about cross-validation.

When you build your model (or just import it from sklearn), you need to evaluate its performance. Cross-validation is a statistical method that can help you with that. For example, in K-fold cross-validation, you split your dataset into several folds, then train your model on all folds except one and test it on the remaining fold. You repeat these steps until your model has been tested on each of the folds, and your final metric is the average of the scores obtained on every fold. This helps you prevent overfitting and evaluate model performance in a more robust way than a simple train-test split.
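The procedure above can be sketched with sklearn's KFold and cross_val_score; the toy dataset and model here are illustrative, not from the article:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Toy regression data; shapes and noise level are arbitrary choices.
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# 5-fold cross-validation: each fold serves exactly once as the test set.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")

print(len(scores))    # one R^2 score per fold
print(scores.mean())  # final metric: the average across folds
```

Note that shuffle=True is fine here because the samples are independent; as discussed below, it is exactly what you must not do with time series.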

Example of data splitting in cross-validation

Evaluating Time Series models

So far so good! But when it comes to time series data, we cannot apply plain cross-validation, because we want to predict the future based on the past, and with k-fold CV we may end up training on data from the future to predict the past. In other words, we want to avoid looking into the future when we train our model (a big no-no for Time Series).

So I like to use the following strategy:

Split the data into train and test sets, where the test set contains only one data point. Then I train my model, predict on that test point, and collect my prediction and the real answer. After that, I move one step forward, append the previous test point to the training dataset, and repeat until I have collected a sufficient amount of statistics. I also retrain my model on each step.
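This walk-forward strategy can be sketched as a small helper function; the model and the noisy-trend series below are stand-ins I chose for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def walk_forward_backtest(X, y, model, n_test):
    """One-step-ahead backtest: retrain at every step, never look ahead."""
    preds, truth = [], []
    start = len(y) - n_test
    for i in range(start, len(y)):
        model.fit(X[:i], y[:i])                   # train only on the past
        preds.append(model.predict(X[i:i + 1])[0])
        truth.append(y[i])                        # collect prediction and answer
    return np.array(preds), np.array(truth)

# Illustrative series: a noisy linear trend (not the article's data).
rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
y = 2.0 * t + rng.normal(scale=1.0, size=100)
X = t.reshape(-1, 1)

preds, truth = walk_forward_backtest(X, y, LinearRegression(), n_test=20)
print(len(preds))  # 20 one-step-ahead predictions
```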

Example of data splitting during backtesting of a Time Series model

There is an implementation of a similar approach in sklearn — TimeSeriesSplit.
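A minimal sketch of how TimeSeriesSplit behaves: every split trains only on the past and tests on the points immediately after it.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(10).reshape(-1, 1)  # ten time-ordered samples

tscv = TimeSeriesSplit(n_splits=3)
for train_idx, test_idx in tscv.split(X):
    # Train indices always precede test indices, so there is no future-looking.
    print("train:", train_idx, "test:", test_idx)
```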

Example with BTC price prediction

Let’s build and backtest our model for predicting time series data.

For the sake of the example, I will use a simple linear model — Bayesian Ridge — to predict the next day's BTC/USD Low price, with a custom-made backtest.

1. Load data from the cryptocompare API
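A hedged sketch of this step: the endpoint, parameter names (fsym, tsym, limit), and response shape below follow the public cryptocompare histoday API as I understand it, so check the current API docs before relying on them.

```python
import requests

URL = "https://min-api.cryptocompare.com/data/histoday"
params = {"fsym": "BTC", "tsym": "USD", "limit": 200}

def fetch_btc_daily(url=URL, query=params):
    """Fetch daily BTC/USD candles; each row should include a 'low' field."""
    resp = requests.get(url, params=query, timeout=10)
    resp.raise_for_status()
    return resp.json()["Data"]

# Usage (requires network access):
# candles = fetch_btc_daily()
# lows = [c["low"] for c in candles]
```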

2. Backtest my model
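A sketch of the backtest step with BayesianRidge. The article does not show its feature engineering, so I assume here that the previous seven Low values predict the next one, and I use a synthetic random walk in place of the real BTC series:

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Illustrative stand-in for the BTC Low series: a synthetic random walk.
rng = np.random.default_rng(42)
prices = 8000 + np.cumsum(rng.normal(scale=50.0, size=120))

def make_lag_features(series, n_lags=7):
    """Use the previous n_lags values as features for the next value."""
    X, y = [], []
    for i in range(n_lags, len(series)):
        X.append(series[i - n_lags:i])
        y.append(series[i])
    return np.array(X), np.array(y)

X, y = make_lag_features(prices)

# One-step-ahead backtest over the last 20 points, retraining at each step.
preds, truth = [], []
for i in range(len(y) - 20, len(y)):
    model = BayesianRidge()
    model.fit(X[:i], y[:i])
    preds.append(model.predict(X[i:i + 1])[0])
    truth.append(y[i])
preds, truth = np.array(preds), np.array(truth)
print(len(preds))  # 20 collected predictions
```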

3. Visualize our predictions
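One way to sketch the visualization step with matplotlib; the two arrays here are placeholders for the backtest outputs, and the filename is my own choice:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in arrays; in the article these come from the backtest step.
truth = np.linspace(8000, 8500, 20)
preds = truth + np.random.default_rng(0).normal(scale=40.0, size=20)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(truth, label="actual Low")
ax.plot(preds, label="predicted Low", linestyle="--")
ax.set_xlabel("backtest step (day)")
ax.set_ylabel("BTC/USD Low")
ax.legend()
fig.savefig("backtest_predictions.png")
```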

visualization of the results

MAPE: 2.22 % — mean absolute percentage error on 20 days
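MAPE itself is a one-liner over the collected predictions and answers; the numbers in the usage line are made up for illustration:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

print(round(mape([100, 200, 400], [110, 190, 400]), 2))  # → 5.0
```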

In conclusion:

It is important to properly validate your model, and cross-validation generally can give you robust results. For time series data it's important to avoid shuffling and future-looking, so depending on your data you can use TimeSeriesSplit from sklearn or even build your own backtest. Use nested cross-validation if you are working with multiple time series.


So how do you evaluate and backtest your models?