The bias-variance tradeoff is a central concept in machine learning, closely tied to how a model overfits or underfits the data. While variance is a familiar concept, what exactly is bias?

Recall that an estimator is unbiased if its expectation equals the true value of the parameter. Technically, for an estimator $\hat{\theta}$ of a parameter $\theta$, bias is defined as:

$$\operatorname{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta$$

A commonly minimized loss function is the mean squared error (MSE). With some basic manipulation, we can establish a relation between this error, the variance, and the bias of an estimator:

$$\operatorname{MSE}(\hat{\theta}) = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2$$
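This decomposition can be checked numerically. Below is a minimal sketch (assuming NumPy and a particular choice of estimator: the biased sample variance, which divides by $n$ rather than $n-1$) verifying that the measured MSE matches variance plus squared bias:

```python
import numpy as np

rng = np.random.default_rng(0)
true_sigma2 = 4.0          # true population variance (the parameter theta)
n = 10                     # sample size per trial
trials = 200_000

# Biased variance estimator: divides by n instead of n - 1 (ddof=0).
samples = rng.normal(0.0, np.sqrt(true_sigma2), size=(trials, n))
est = samples.var(axis=1)

mse = np.mean((est - true_sigma2) ** 2)
bias = np.mean(est) - true_sigma2   # E[est] - theta, roughly -sigma^2 / n
variance = np.var(est)

# The identity MSE = Var + Bias^2 holds up to floating-point error.
print(mse, variance + bias ** 2)
```

Note that the identity is exact (not just asymptotic): it follows from expanding $\mathbb{E}[(\hat{\theta}-\theta)^2]$ and adding and subtracting $\mathbb{E}[\hat{\theta}]$.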

If the model is able to fit the training dataset very well, it has low bias. But that is not necessarily a good thing: it could have very high variance if it is a very high-dimensional model or has tons of parameters. Essentially, the model is just "memorizing" the data with its parameters instead of generalizing from it. On the other hand, a less powerful model might not do so well on the training data, but it generalizes better. Such a model has higher bias and lower variance.
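To make this concrete, here is a small sketch using NumPy on synthetic data (the sine target and polynomial degrees are arbitrary choices for illustration): a degree-1 polynomial underfits a noisy sine wave, while a degree-9 polynomial drives the training error much lower, at the risk of memorizing the noise.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.shape)  # noisy sine

def train_mse(degree):
    """Training-set MSE of a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

print(train_mse(1))  # underfit: high bias, misses the curvature
print(train_mse(9))  # flexible fit: much lower training error
```

The higher-degree fit is guaranteed a lower (or equal) training error, since the simpler model is nested inside it; the point of the tradeoff is that this says nothing about error on fresh data.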

Precision and Recall

Now moving on to precision and recall, which are related to minimizing false positives and false negatives respectively in classification problems:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$

where TP = #true positives, FP = #false positives and FN = #false negatives. In the extreme case, you could have a classifier which simply remembers the training set; it would have a recall close to or even equal to 1 and a precision close to 0. A high-recall, low-precision model corresponds to the case of high variance and low bias. Similarly, you could have a model which allows some false negatives but makes fewer false positives, i.e., it is high precision – low recall; this corresponds to the high bias – low variance case. In both cases, you need to strike the right balance. The goal is to reduce the total error in the regression case, and to simultaneously increase both precision and recall in the classification scenario (summarized by the F-score, the harmonic mean of the two). It is a mistake to optimize accuracy instead of the F-score, as classification accuracy is a misleading measure when the class distribution is skewed.
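These quantities are easy to compute directly from confusion counts. A minimal sketch follows; the counts are made up to illustrate the "memorizing" classifier above, which flags everything it has seen as positive:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: no missed positives (fn=0), but many false alarms.
p, r, f = precision_recall_f1(tp=10, fp=90, fn=0)
print(p, r, f)  # precision 0.1, recall 1.0
```

Note how the F1 score, as a harmonic mean, stays close to the smaller of the two: a perfect recall cannot compensate for a poor precision.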