Explain like I am five

Logistic regression is a sibling of linear regression, but despite its name, logistic regression is a classification algorithm.

Let’s first brush up on linear regression:

formula: y = mx + c

where,

y = value that has to be predicted

m = slope of the line

x = input data

c = y-intercept

With these values, we can predict y for any given x.

Here the blue points are the input data.

Using the input data, we can calculate the slope and the y-intercept such that our predicted line (the red line) passes as close as possible to the points.

Using this line, we can predict the value of y for any given x.
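The fitting step can be sketched in a few lines of Python. This is only an illustration: the data points are made up, and NumPy's polyfit (a least-squares fit) is one common way to find the slope and intercept, not necessarily the method behind the plot above.

```python
import numpy as np

# Made-up input data (x) and target values (y) for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

# Least-squares fit of a straight line; for degree 1,
# np.polyfit returns the coefficients as [slope, intercept].
m, c = np.polyfit(x, y, 1)

def predict(x_new):
    """Predict y for a new x using the fitted line y = m*x + c."""
    return m * x_new + c

print(f"m = {m:.2f}, c = {c:.2f}")
print(f"predicted y at x = 6: {predict(6.0):.2f}")
```

Once m and c are known, predicting is just plugging a new x into y = mx + c.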

One thing to note about linear regression is that it works with continuous data. If we want to use linear regression for classification, we need to tweak the algorithm a bit.

First, we need to define a threshold such that if the predicted value is lower than the threshold, it belongs to class 1; otherwise it belongs to class 2.

Now, if you are thinking, “Oh, that’s easy: we just combine linear regression with a threshold and voilà, it becomes a classification algorithm,” there is a catch. We have to choose the threshold value ourselves, and for large datasets it becomes impractical to work out a good threshold by hand. Moreover, once defined, the threshold stays the same even if our predicted values change.
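To see why a fixed threshold is fragile, here is a small sketch. Everything in it is an assumption made for illustration: the classes are encoded as 0 and 1, the threshold is hand-picked as 0.5, and the data is invented. The same line-fitting approach is run twice, once on the original points and once after adding a single far-away point.

```python
import numpy as np

THRESHOLD = 0.5  # hand-picked cutoff (an assumption for this sketch)

def fit_and_classify(x, labels):
    """Fit a line to 0/1 labels, then apply the fixed threshold."""
    m, c = np.polyfit(x, labels, 1)
    predicted = m * x + c
    # Below the threshold -> class 0, otherwise -> class 1.
    return (predicted >= THRESHOLD).astype(int)

# Invented data: small x values are class 0, larger ones are class 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
labels = np.array([0, 0, 0, 1, 1, 1])
before = fit_and_classify(x, labels)

# Add one extreme point that clearly belongs to class 1.
x_out = np.append(x, 30.0)
labels_out = np.append(labels, 1)
after = fit_and_classify(x_out, labels_out)

print("without the extra point:", before)
print("with the extra point:   ", after[:6])
```

The extra point flattens the fitted line, so the predicted values change, but the threshold does not move with them. As a result, the point at x = 4 ends up on the wrong side of the cutoff.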

For more reference go here.

On the other hand, logistic regression produces a logistic curve, which is limited to values between 0 and 1.

Logistic regression is similar to linear regression, but the curve is constructed using the natural logarithm of the “odds” of the target variable, rather than the probability itself. Moreover, the predictors do not have to be normally distributed or have equal variance in each group.
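The link between the straight line and the logistic curve can be sketched as follows. The values of m and c are hypothetical, chosen only for illustration. The idea is that the line mx + c is treated as the log of the odds, and the logistic (sigmoid) function turns it back into a probability between 0 and 1.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Natural logarithm of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

# Hypothetical slope and intercept, not fitted to any real data.
m, c = 1.5, -4.0

for x in (0.0, 2.0, 4.0, 6.0):
    z = m * x + c   # the linear part, read as log-odds
    p = sigmoid(z)  # the probability of the positive class
    print(f"x = {x}: log-odds = {z:+.2f}, probability = {p:.3f}")
```

Applying log_odds to a probability produced by sigmoid gives back the original mx + c, which is what "the curve is constructed using the log of the odds" means in practice.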

If you still don’t understand it, I recommend watching the following video, which explains logistic regression in a simple way.