A Built-In Image Classification Request

Creating a diverse image classification model requires a huge number of images and hours of training time. Luckily, Apple now provides a built-in multi-label classification model wrapped inside VNClassifyImageRequest. The classifier covers around 1,000 classes. To inspect the classifier's taxonomy, simply invoke the following function to get a list of all the known classes.

let taxonomy = try VNClassifyImageRequest.knownClassifications(forRevision: VNClassifyImageRequestRevision1)
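Before any filtering, the request has to be performed on an image. The sketch below shows one way to do that, assuming a UIImage input; the 0.3 confidence threshold is an arbitrary placeholder, not a recommended value:

```swift
import Vision
import UIKit

// A minimal sketch of running the built-in classifier on a UIImage.
func classify(_ image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNClassifyImageRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])

    do {
        try handler.perform([request])
        let observations = request.results as? [VNClassificationObservation] ?? []
        // Each observation carries an identifier (the class label) and a confidence.
        for observation in observations where observation.confidence > 0.3 {
            print("\(observation.identifier): \(observation.confidence)")
        }
    } catch {
        print("Classification failed: \(error)")
    }
}
```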

Evaluating a multi-label classifier is less straightforward than evaluating a binary classifier, which returns a single label as its prediction. Using accuracy as the metric doesn't always represent how good or bad the model is. For a multi-label classifier, a false prediction of a target class is not a hard-lined right or wrong; instead, more emphasis is placed on the group of classes predicted for the image.

Recall — A metric used to evaluate the overall relevancy of the model. When you aren't too concerned with false positives, you'd prefer a model with a higher recall value.

Precision — A metric used to measure the quality of the model. In cases where a false positive is costly, a model with high precision is preferred.

The formulas for precision and recall are:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)
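A tiny worked example of the two formulas, with made-up counts: suppose that out of 10 "bike" predictions, 8 images really contained a bike (TP = 8, FP = 2), and 4 bikes in the test set were missed (FN = 4).

```swift
// Made-up counts for illustration.
let tp = 8.0, fp = 2.0, fn = 4.0

let precision = tp / (tp + fp)   // 8 / 10 = 0.8
let recall = tp / (tp + fn)      // 8 / 12 ≈ 0.667

print(precision, recall)
```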

The terms TP and TN are straightforward, but it can be tricky to get the hang of their counterparts (FP and FN). We hope the following list clears up the terminology once and for all.

TP and TN — A true positive occurs when a model predicts a class label for an image and the image really contains that class. A true negative is just the opposite of a TP: it occurs when a class label is predicted as not being part of the image, and it in fact doesn't exist in the image.

FP — A false positive occurs when a model predicts a class label for an image, but the image does not contain that class. For example, labeling an image as containing a bike when there isn't one falls under the FP category.

FN — A false negative occurs when the model fails to predict a class label that is actually present in the image. For example, flagging a message as not spam when it actually is spam falls under the FN category.
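The four terms can also be counted mechanically in a multi-label setting by comparing the predicted label set against the ground-truth set. The label sets below are made-up illustrations:

```swift
// Counting TP, FP, and FN for one image from predicted vs. actual labels.
let predicted: Set<String> = ["dog", "bike", "tree"]
let actual: Set<String> = ["dog", "tree", "car"]

let tp = predicted.intersection(actual).count   // correctly predicted → 2
let fp = predicted.subtracting(actual).count    // predicted but absent → 1 ("bike")
let fn = actual.subtracting(predicted).count    // present but missed → 1 ("car")

print(tp, fp, fn)
```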

The new VNClassifyImageRequest's results, delivered as VNClassificationObservation instances, provide an API to filter the results by precision or recall, whichever suits your use case.

The following code showcases a way to filter by setting a specific recall value on the precision-recall curve.

let recallFilter = classifications.filter { $0.hasMinimumPrecision(0.0, forRecall: 0.8) }

The equivalent call, which filters by a minimum recall at a given precision value, is:

classifications.filter { $0.hasMinimumRecall(0.0, forPrecision: 0.8) }

When the minimum precision (or recall, depending on which filter you use) that we specify is greater than zero, an observation passes the filter only if its precision-recall curve achieves at least that value at the specified operating point.
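Putting the pieces together, here is an end-to-end sketch, assuming a UIImage input: perform the built-in classification request, then keep only observations that achieve at least 0.5 precision when recall is fixed at 0.8. Both threshold values are placeholders to tune for your use case:

```swift
import Vision
import UIKit

// Returns the class labels that survive a precision-recall filter.
func highConfidenceLabels(in image: UIImage) -> [String] {
    guard let cgImage = image.cgImage else { return [] }

    let request = VNClassifyImageRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])

    guard (try? handler.perform([request])) != nil,
          let classifications = request.results as? [VNClassificationObservation] else {
        return []
    }

    return classifications
        .filter { $0.hasMinimumPrecision(0.5, forRecall: 0.8) }
        .map { $0.identifier }
}
```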