Support Vector Regression

In this article I will show you how to create your own stock prediction Python program using a machine learning algorithm called Support Vector Regression (SVR). The program will read in Facebook (FB) stock data and predict the price from the day of the month.

Support Vector Regression (SVR) is a type of Support Vector Machine (SVM) and a supervised learning algorithm used for regression analysis. This version of SVM for regression was proposed in 1996 by Christopher J. C. Burges, Vladimir N. Vapnik, Harris Drucker, Alexander J. Smola and Linda Kaufman. The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training points whose targets lie close to the model prediction (within a tolerance usually called epsilon).
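To make the idea of "ignoring points close to the prediction" concrete, here is a minimal sketch of the epsilon-insensitive loss that SVR is built around. The function name and the sample numbers are made up for illustration; the default of 0.1 mirrors the default epsilon in scikit-learn's SVR.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Return the per-sample loss max(0, |y - prediction| - epsilon)."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical targets and predictions, chosen only to show the three cases
actual = [10.0, 10.0, 10.0]
predicted = [10.05, 10.5, 8.0]

losses = epsilon_insensitive_loss(actual, predicted, epsilon=0.1)
print(losses)
```

The first prediction is within 0.1 of its target, so it costs nothing and would not influence the fitted model. Only the other two points, which fall outside the tolerance, contribute to the cost.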

Support Vector Machine Pros:

It is effective in high-dimensional spaces. It works well when there is a clear margin of separation. It remains effective when the number of dimensions is greater than the number of samples.

Support Vector Machine Cons:

It does not perform well on large data sets. It also performs poorly when the data set is noisy (contains a large amount of additional meaningless information).

Types Of Kernel:

linear, polynomial, radial basis function (rbf), sigmoid
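As a rough sketch of what these four kernels compute, the functions below follow the formulas given in the scikit-learn documentation. The helper names and the default parameter values are my own choices for illustration, not the library's API.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def linear_kernel(a, b):
    return dot(a, b)


def polynomial_kernel(a, b, gamma=1.0, coef0=0.0, degree=3):
    return (gamma * dot(a, b) + coef0) ** degree


def rbf_kernel(a, b, gamma=1.0):
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(a, b, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(a, b) + coef0)


# Two one-dimensional inputs, e.g. day 1 and day 3 of a month
day_a, day_b = [1.0], [3.0]
print('linear :', linear_kernel(day_a, day_b))
print('poly   :', polynomial_kernel(day_a, day_b, degree=2))
print('rbf    :', rbf_kernel(day_a, day_b, gamma=0.15))
print('sigmoid:', sigmoid_kernel(day_a, day_b, gamma=0.1))
```

Each kernel is a different way of measuring how similar two inputs are. The RBF kernel, for example, returns 1 for identical inputs and shrinks toward 0 as the inputs move apart.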


Start Programming:

The first thing that I like to do before writing a single line of code is to put in a description in comments of what the code does. This way I can look back on my code and know exactly what it does.

#Description: This program predicts the price of FB stock for a specific day
#             using the Machine Learning algorithm called
#             Support Vector Regression (SVR) Model

Now import the packages to make it easier to write the program.

#import the packages
from sklearn.svm import SVR
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

Next I will load the Facebook (FB) stock data that I got from finance.yahoo.com into a variable called ‘df’, short for data frame. Then I will display the data.

NOTE: This is one month of data from Yahoo, covering 5–1–2019 to 5–31–2019.

Remember the market is open only on weekdays.

#Load the data
#from google.colab import files # Use to load data on Google Colab
#uploaded = files.upload() # Use to load data on Google Colab
df = pd.read_csv('FB_Stock.csv')
df

Get the number of rows and columns in the data set. There are 22 rows and 7 columns of data.

#Get the number of rows and columns in the data set
df.shape

Print the last row of data (this will be the data that we test on). Notice the date is 05–31–2019, so the day is 31. This will be the input to the model, which should predict the adjusted close price of $177.470001.

#Print the last row of data (this will be the data that we test on)
actual_price = df.tail(1)
actual_price

Recreate the data frame by getting all of the data except for the last row, which I will use to test the models later, and store the result back into ‘df’. Then print the new data set along with its new count of rows and columns. The new data set has one less row: 21 rows and 7 columns.

#Get all of the data except for the last row
df = df.head(len(df)-1)
print(df)
print(df.shape)

Create the variables that will be used as the independent and dependent data sets by setting them equal to empty lists.

#Create the lists / X and y data set
days = list()
adj_close_prices = list()

Get all of the rows from the Date column and store them in a variable called ‘df_days’, then get all of the rows from the Adj Close Price column and store them in a variable called ‘df_adj_close’.

df_days = df.loc[:,'Date']
df_adj_close = df.loc[:,'Adj Close Price']

Create the independent data set ‘X’ and store the data in the variable ‘days’.

Create the dependent data set ‘y’ and store the data in the variable ‘adj_close_prices’.

Both can be done by appending the data to each of the lists.

NOTE: For the independent data set we want only the day from the date. The dates in this file are written as month/day/year, so I split each date on '/' to get just the day (the second piece) and cast it to an integer while appending it to the list.

#Create the independent data set 'X' as days
for day in df_days:
    days.append([int(day.split('/')[1])])

#Create the dependent data set 'y' as prices
for adj_close_price in df_adj_close:
    adj_close_prices.append(float(adj_close_price))
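Splitting on '/' works as long as every date really is written month/day/year. As an alternative sketch, the standard library's datetime module can parse the date and will raise an error if a value does not match the expected format. The helper name day_of_month and the sample dates are hypothetical, and the format string assumes the same month/day/year layout.

```python
from datetime import datetime


def day_of_month(date_text, date_format='%m/%d/%Y'):
    """Parse a date string and return only the day as an integer."""
    return datetime.strptime(date_text, date_format).day


# Illustrative date strings in the assumed month/day/year layout
sample_dates = ['5/1/2019', '5/2/2019', '5/31/2019']

# Wrap each day in its own list, matching the shape of 'days' in the article
sample_days = [[day_of_month(d)] for d in sample_dates]
print(sample_days)  # [[1], [2], [31]]
```

Each day is wrapped in its own list because scikit-learn expects the independent data as a two-dimensional structure, with one row per sample and one column per feature.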

Look at which days were recorded in the data set.

print(days)

Next, I will create and train three Support Vector Regression (SVR) models, each with a different kernel, to see which one performs best.

#Create and train an SVR model using a linear kernel
lin_svr = SVR(kernel='linear', C=1000.0)
lin_svr.fit(days, adj_close_prices)

#Create and train an SVR model using a polynomial kernel
poly_svr = SVR(kernel='poly', C=1000.0, degree=2)
poly_svr.fit(days, adj_close_prices)

#Create and train an SVR model using a RBF kernel
rbf_svr = SVR(kernel='rbf', C=1000.0, gamma=0.15)
rbf_svr.fit(days, adj_close_prices)

Last but not least, I will plot the models on a graph to see which has the best fit, and then predict the price for the day that was held out.

#Plot the models on a graph to see which has the best fit
plt.figure(figsize=(16,8))
plt.scatter(days, adj_close_prices, color='black', label='Data')
plt.plot(days, rbf_svr.predict(days), color='green', label='RBF Model')
plt.plot(days, poly_svr.predict(days), color='orange', label='Polynomial Model')
plt.plot(days, lin_svr.predict(days), color='blue', label='Linear Model')
plt.xlabel('Days')
plt.ylabel('Adj Close Price')
plt.title('Support Vector Regression')
plt.legend()
plt.show()

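From the graph, the best model seems to be the RBF model, which is a Support Vector Regression model that uses a radial basis function kernel. However, this graph can be misleading, because it only shows how closely each model fits the data it was trained on, not how well it predicts a day it has not seen.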
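One way to check whether a good-looking fit carries over to unseen days is to compare each model's error on its training points with its error on points that were held back. The sketch below is only an illustration: it uses synthetic data (a noisy straight line I made up, not the FB prices), an arbitrary 15/5 split, and mean absolute error as the measure, while reusing the three kernel settings from this article.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

# Synthetic data for illustration only: 20 "days" with a noisy upward trend
rng = np.random.default_rng(0)
X = np.arange(1, 21, dtype=float).reshape(-1, 1)
y = 180.0 + 0.5 * X.ravel() + rng.normal(0.0, 2.0, size=20)

# Train on the first 15 days, hold back the last 5 (an assumed split)
X_train, y_train = X[:15], y[:15]
X_test, y_test = X[15:], y[15:]

models = {
    'linear': SVR(kernel='linear', C=1000.0),
    'poly': SVR(kernel='poly', C=1000.0, degree=2),
    'rbf': SVR(kernel='rbf', C=1000.0, gamma=0.15),
}

errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_mae = mean_absolute_error(y_train, model.predict(X_train))
    test_mae = mean_absolute_error(y_test, model.predict(X_test))
    errors[name] = (train_mae, test_mae)
    print(f'{name}: train MAE = {train_mae:.2f}, held-out MAE = {test_mae:.2f}')
```

If a model has a small training error but a much larger held-out error, its close fit on the graph is not a reliable sign that it will predict new days well.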

Now I can start making my FB price prediction. Recall that the last row of data, which was left out of the data set, has the date 05–31–2019, so the day is 31. This will be the input to the models, and the adjusted close price they should predict is $177.470001.

So now I will predict the price by giving the models a value of 31.

day = [[31]]
print('The RBF SVR predicted price:', rbf_svr.predict(day))
print('The linear SVR predicted price:', lin_svr.predict(day))
print('The polynomial SVR predicted price:', poly_svr.predict(day))

The polynomial SVR model predicted the price for day 31 to be $180.39533267, which is pretty close to the actual price of $177.470001 (about $2.93 too high). In this case the best model seems to be the polynomial SVR. That's it, you are done creating your SVR program to predict FB stock!
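To put a number on "pretty close", here is a small sketch that computes the absolute and percent error from the two prices quoted above. The variable names are my own.

```python
# Prices quoted in the article
actual_adj_close = 177.470001
poly_prediction = 180.39533267

absolute_error = abs(poly_prediction - actual_adj_close)
percent_error = absolute_error / actual_adj_close * 100

print(f'Absolute error: ${absolute_error:.2f}')  # Absolute error: $2.93
print(f'Percent error: {percent_error:.2f}%')    # Percent error: 1.65%
```

The same two lines of arithmetic can be applied to the linear and RBF predictions to compare all three models on the held-out day.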

If you are interested in reading more on machine learning and want to get started right away with problems and examples, then I strongly recommend you check out Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. It is a great book for helping beginners learn how to write machine learning programs and understand machine learning concepts.

Thanks for reading this article, I hope it's helpful to you all! If you enjoyed this article and found it helpful, please leave some claps to show your appreciation. Keep up the learning, and if you like machine learning, mathematics, computer science, programming or algorithm analysis, please visit and subscribe to my YouTube channels (randerson112358 & compsci112358).