A support vector machine (SVM) is a fairly straightforward classification algorithm to wrap our heads around. We will look at the simplest case, a linear SVM with two classes, by working through a problem:

Objective: Predict the location (city) of a house from two attributes: house size in square feet and price per square foot in dollars.

Data: We have the price and size of houses in two cities, “Arroyo Grande” (marked in red) and “Lompoc” (marked in blue).

Solution using Intuition: Before we apply machine learning, let’s use some intuition. To make this prediction, all we need is a line (a hyperplane) in the graph above that demarcates the red dots from the blue dots, and we can easily judge by eye what slope and intercept that line should have. Our prediction rule is then: if a point falls on one side of the line, it is a red (Arroyo Grande) house; if it falls on the other side, it is a blue (Lompoc) house.
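The side-of-the-line rule above can be sketched in a few lines of R. Note that the slope, intercept, and which city sits on which side of the line are purely illustrative assumptions here, not values taken from the actual plot:

```r
# A minimal sketch of the side-of-the-line decision rule, assuming a
# hypothetical demarcating line: price = slope * size + intercept.
classify_house <- function(size, price, slope = -0.05, intercept = 450) {
  # Assumption: points above the line are "Arroyo Grande" (red),
  # points below it "Lompoc" (blue). Both the labels' sides and the
  # default coefficients are illustrative, not fitted values.
  if (price > slope * size + intercept) "Arroyo Grande" else "Lompoc"
}

classify_house(size = 2000, price = 400)
```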

Solution using SVM: Two lines are identified, one passing through the nearest red points and one through the nearest blue points. These are the supporting hyperplanes (marked in dotted green), and the points they pass through are the support vectors. The main objective is to maximize the distance (margin) between these two hyperplanes, subject to the constraint that no points lie between them. Once this optimization problem is solved, the line halfway between the two supporting hyperplanes is identified; this is called the optimal hyperplane. Simple enough, isn’t it? All we did was plot the line we had previously drawn in our heads. This is what our final output looks like:
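The optimization described above can be written down precisely. With class labels $y_i \in \{-1, +1\}$ and a hyperplane $w \cdot x + b = 0$, the supporting hyperplanes are $w \cdot x + b = \pm 1$ and the margin between them is $2 / \lVert w \rVert$, so maximizing the margin is equivalent to the standard hard-margin problem:

```latex
\min_{w,\, b} \; \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 \quad \text{for all } i
```

The constraint is exactly the "no points between the supporting hyperplanes" condition, and the points for which it holds with equality are the support vectors.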

Please find the R code below for performing SVM on the real estate dataset :
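Since the original post's dataset and code aren't reproduced here, the following is a hedged sketch using the widely used e1071 package, with a small synthetic stand-in dataset; the column names (size, price, city) and the generated values are assumptions, not the post's real data:

```r
# Sketch of a linear two-class SVM in R using the e1071 package
# (install.packages("e1071") if it is not already installed).
library(e1071)

# Synthetic stand-in for the real estate data; values are illustrative.
set.seed(42)
houses <- data.frame(
  size  = c(rnorm(20, mean = 2200, sd = 200),   # "Arroyo Grande" houses
            rnorm(20, mean = 1500, sd = 200)),  # "Lompoc" houses
  price = c(rnorm(20, mean = 450, sd = 30),
            rnorm(20, mean = 300, sd = 30)),
  city  = factor(rep(c("Arroyo Grande", "Lompoc"), each = 20))
)

# Fit a linear SVM; a large cost approximates the hard-margin case.
model <- svm(city ~ size + price, data = houses,
             kernel = "linear", cost = 100, scale = TRUE)

# The support vectors: the points the supporting hyperplanes pass through.
model$SV

# Predict the city for a new house.
predict(model, data.frame(size = 2000, price = 420))

# Visualize the decision boundary (the optimal hyperplane).
plot(model, houses)
```

With real data, the synthetic `data.frame` block would simply be replaced by reading the actual dataset; the `svm()` call itself stays the same.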