
Finance has long interested statisticians, psychologists, data miners, and people from many other disciplines, for reasons such as its profitability, its chaos, and the psychology behind it. To be honest, financial markets are very difficult to predict. This unpredictability comes from price fluctuations, which depend on many factors such as political decisions by governments and local and global news. Despite this complexity, one thing remains predictable to some degree: “market psychology”. According to various studies and principles such as the Elliott wave principle, financial markets move in cyclic waves. In engineering terms this is a cyclic signal, and the cycles arise from the mindset behind the market. As a simple example, if cryptocurrency X becomes very expensive, everybody wants to take a profit and starts selling. At some point almost everybody is a seller and hardly anyone wants to buy. Many sellers and few buyers cause the price of cryptocurrency X to drop. This mindset repeats in various forms, producing a cyclic pattern in the data.

A number of studies indicate that trends can be predicted with machine learning, whereas exact values and exceptional market behaviour remain hard to predict. On the other hand, many investors are happy with trend predictions, because what they want to know is when to buy or sell. The majority of investors are looking for profit, and exact values are not their key interest. I believe fuzzy time series (FTS) are a good tool for trend prediction, because the fuzzy engine at their core can handle the uncertainty that financial markets exhibit.

A bit of theory should be enough to start trying things out! Having said this, let’s follow the CRISP-DM methodology and start coding.

Let’s recap the problem we are solving

As the first step in CRISP-DM, we need to understand the business problem. Here the problem is “predicting crypto prices or trends using historical data”.

Let’s read and understand the data

CryptoCompare provides a rich set of REST APIs for reading cryptocurrency data. Using their service, we can read the daily bitcoin (BTC) price against the US dollar (USD):

    library(jsonlite)

    dataset.btc <- fromJSON("https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&limit=2000&aggregate=1&e=CCCAGG")

    # head is the old data and tail the new data
    tail(dataset.btc$Data)
    head(dataset.btc$Data)
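The query string of that URL carries the request parameters, so switching to another coin only means changing `fsym`. The sketch below assembles the same URL from its parts. The helper name `build_histoday_url` and its default values are illustrative assumptions rather than part of CryptoCompare's API, and the parameter meanings in the comments are inferred from the names in the URL above.

```r
# Hypothetical helper (not from the original article): rebuilds the histoday
# URL used above for an arbitrary currency pair. No network call is made here.
build_histoday_url <- function(fsym, tsym = "USD", limit = 2000,
                               aggregate = 1, exchange = "CCCAGG") {
  sprintf(
    "https://min-api.cryptocompare.com/data/histoday?fsym=%s&tsym=%s&limit=%d&aggregate=%d&e=%s",
    fsym,                    # symbol of the coin being priced, e.g. "BTC"
    tsym,                    # symbol of the currency it is priced in, e.g. "USD"
    as.integer(limit),       # number of daily points requested
    as.integer(aggregate),   # aggregation setting, kept at 1 as in the article
    exchange                 # value of the `e` parameter, as in the article
  )
}

btc_url <- build_histoday_url("BTC")
ltc_url <- build_histoday_url("LTC")
print(btc_url)
```

The resulting string can be passed to `fromJSON()` exactly as in the call above.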

Let’s understand the data

Here is the description of the data:

Time: UNIX time of the data (it is daily data)

Close: the price of bitcoin at the end of the day

High: the highest price of bitcoin on that specific day

Low: the lowest price of bitcoin on that specific day

Open: the price of bitcoin at opening of the day

Volumefrom: the trading volume expressed in the base currency (here BTC)

Volumeto: the trading volume expressed in the quote currency (here USD)

Let’s prepare the data

This is perhaps the most important step: the better the data is prepared, the better the quality and accuracy of the prediction. We look at some possible ways to prepare the data:

Trend calculation

As mentioned earlier, trends are smoother and more stable than raw prices, so in most cases they are easier to predict.

R provides various packages and techniques for calculating trends. I chose the “movavg” function from the “pracma” package to smooth the signal and calculate the trend, as below:

    library(pracma)

    # 7-day weighted moving average of the daily high price
    dataset.btc$Data$high_avg <- movavg(dataset.btc$Data$high, 7, "w")

In the accompanying plot, the daily high price of Litecoin is shown in red, and the smoothed signal, which is the price trend, in green.
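To make the smoothing step less of a black box, here is a minimal base-R sketch of a trailing, linearly weighted moving average. It is an illustration of the idea only: the function name `weighted_trend`, the sample prices, and the handling of the first few points are assumptions, and the result may differ in detail from what `pracma::movavg(x, n, "w")` returns.

```r
# Sketch of a trailing, linearly weighted moving average in base R.
# Assumption: within each window the newest point gets the largest weight
# and the oldest gets weight 1; the first n - 1 points use a shorter window.
weighted_trend <- function(x, n) {
  stopifnot(is.numeric(x), n >= 1)
  y <- numeric(length(x))
  for (k in seq_along(x)) {
    m <- min(k, n)                 # window length available at position k
    window <- x[(k - m + 1):k]     # values ordered oldest to newest
    weights <- seq_len(m)          # 1 for the oldest, m for the newest
    y[k] <- sum(weights * window) / sum(weights)
  }
  y
}

prices <- c(10, 12, 11, 15, 14, 18, 17)   # made-up daily highs
trend <- weighted_trend(prices, 3)
print(round(trend, 2))
```

Because recent days weigh more, the trend reacts faster to a turn in the price than a simple average over the same window would.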

Fixing UNIX Times

The function below converts UNIX timestamps to human-readable dates.

    library(dplyr)
    library(lubridate)

    convertUnix <- function(datast) {
      # This function inputs the dataframe discussed above and adds a new column
      # which is the conversion of the Unix date to a human-readable date
      # Args:
      #   datast: cryptocurrency dataframe with the aforementioned format
      #
      # Return:
      #   datast: the same dataset + newly added column
      datast <- data.frame(datast) %>%
        mutate(date = as.Date(as.POSIXct(time, origin = "1970-01-01")))
      return(datast)
    }
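The core of that conversion can be checked on a couple of values without any packages. This is a small sketch with hand-picked timestamps; the explicit `tz = "UTC"` is an addition here so that the result does not depend on the machine's time zone.

```r
# 1514764800 seconds after 1970-01-01 00:00:00 UTC is midnight on 2018-01-01 (UTC);
# adding 86400 seconds moves one day forward.
unix_time <- c(1514764800, 1514851200)
dates <- as.Date(as.POSIXct(unix_time, origin = "1970-01-01", tz = "UTC"))
print(dates)
```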

Deltas vs. Actual Values

It is common practice in time series prediction to predict the level of change rather than the actual values. This gives better behavioural insight, especially when the actual values move beyond their historical range and reach levels never seen before (such as when bitcoin’s price reached 18k USD in 2017, a level that had never appeared in the historical data). Such extreme values can still be predictable if deltas (the levels of change) are used instead. The function below calculates these deltas as percentage changes.

    convertPercent <- function(data) {
      # This function inputs a vector of numbers and returns the percentage change
      # between each element and the one before it
      # Args:
      #   data: the numeric input vector (not a dataframe)
      # Return:
      #   data: the percentage changes (one element shorter than the input)
      data <- diff(data) / data[-NROW(data)] * 100
      return(data)
    }
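A tiny worked example may help here. The sketch below applies the same percentage-change formula to three made-up prices and then rebuilds the prices from the starting value and the deltas, which is the step needed to turn a predicted delta back into a price. The names `pct_change` and `rebuild_prices` are hypothetical and exist only for this illustration.

```r
# Stand-alone version of the percentage-change formula used above.
pct_change <- function(x) diff(x) / x[-length(x)] * 100

# Inverse step: rebuild prices from a starting price and percentage changes.
rebuild_prices <- function(start, pct) start * cumprod(1 + pct / 100)

prices  <- c(100, 110, 99)                       # made-up prices
changes <- pct_change(prices)                    # about +10% then -10%
rebuilt <- rebuild_prices(prices[1], changes)    # recovers 110 and 99
print(changes)
print(rebuilt)
```

Note that the deltas vector is one element shorter than the price vector, so the first price has to be kept if the original scale is needed later.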

Let’s Model the Data

The fourth phase of the CRISP-DM methodology is about modelling and prediction. We use the function below, passing it either actual values or trend values, to predict the price of a cryptocurrency and analyze the accuracy of the prediction.

    library(AnalyzeTS)  # assumed source of DOC() and fuzzy.ts2()

    FTS_Predict <- function(data, year, month, day, freq) {
      # This function builds a time series and predicts 5 steps ahead
      # Args:
      #   data: the values that should be predicted (trend or actual data)
      #   year: starting point of the data in terms of year
      #   month: starting point of the data in terms of month
      #   day: starting point of the data in terms of day
      #   freq: frequency to build the time series
      #
      # Return:
      #   crypto_predict: the prediction and all attached information (such as accuracy)

      # Changing format to TS
      # (ts() documents `start` as one or two numbers, so the day component may be ignored)
      crypto <- ts(data, start = c(year, month, day), frequency = freq)

      # Finding the best C value by DOC function
      # Abbasov-Mamedova model
      str.C1 <- DOC(crypto, n = 7, w = 7, D1 = 0, D2 = 0, CEF = "MAPE", type = "Abbasov-Mamedova")
      C1 <- as.numeric(str.C1[1])

      crypto_predict <- fuzzy.ts2(crypto, n = 7, w = 7, D1 = 0, D2 = 0, C = C1, forecast = 5,
                                  type = "Abbasov-Mamedova", trace = TRUE, plot = TRUE)
      return(crypto_predict)
    }

Let’s Evaluate the Results

In the fifth phase of the CRISP-DM methodology, we do some (cross-)validation. The following cryptocurrencies were tested, and the results are shown. The conclusion we can draw is that trend prediction produces lower errors than prediction of the actual values.

Where:

ME (Mean error): sum(et)/n

MAE (Mean absolute error): sum(|et|)/n

MPE (Mean percentage error): sum((et/Yt)*100)/n

MAPE (Mean absolute percentage error): sum((|et|/Yt)*100)/n

MSE (Mean squared error): sum(et*et)/n

RMSE (Root mean squared error): sqrt(sum(et*et)/n)

U (Theil’s U statistic): RMSE of the forecast/RMSE of the naive forecast

Here Yt is the observed series, Ft is the forecast series, et is the residual series, and n is the sample size.
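The measures above are straightforward to compute by hand, which can be useful for checking a model's reported accuracy. The following is a sketch under two assumptions that the text does not spell out: residuals are taken as et = Yt - Ft, and the naive forecast behind Theil's U is "the next value equals the current one", with both RMSEs computed over the same points. The function name `forecast_errors` and the sample numbers are made up for illustration.

```r
# Sketch of the error measures listed above.
# Assumptions: e_t = Y_t - F_t, and the naive forecast for Theil's U is the
# previous observation (F_t = Y_{t-1}), with both RMSEs taken over t = 2..n.
forecast_errors <- function(y, f) {
  stopifnot(length(y) == length(f), length(y) >= 2)
  e <- y - f
  n <- length(y)
  rmse <- function(err) sqrt(mean(err^2))
  naive_err <- y[-1] - y[-n]          # error of "tomorrow equals today"
  c(
    ME   = mean(e),
    MAE  = mean(abs(e)),
    MPE  = mean(e / y * 100),
    MAPE = mean(abs(e) / y * 100),
    MSE  = mean(e^2),
    RMSE = rmse(e),
    U    = rmse(e[-1]) / rmse(naive_err)
  )
}

observed <- c(100, 110, 120, 130)     # made-up observations
forecast <- c(98, 112, 118, 133)      # made-up forecasts
metrics <- forecast_errors(observed, forecast)
print(round(metrics, 3))
```

A value of U below 1 means the forecast beats the naive "no change" guess on these points, which is a handy sanity check next to MAPE.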

Conclusion and Future Work

In this article, I used Fuzzy Time Series (FTS) to predict cryptocurrency prices. As the evaluation section showed, FTS is a promising technique, especially for predicting trends in cryptocurrencies.

The goal of this article was to introduce FTS into the realm of crypto prediction. The following steps remain as future work:

Parameter tuning:

There is still room for parameter tuning in both data preparation and modelling.

Fusion and deltas:

We discussed delta calculation and percentage changes but did not use them later on. This is because they are part of a bigger approach in which I want to combine more sources of data for prediction. This will be discussed in my next article.