Working with less data - Image classification with very few images

Every ML practitioner has attempted Kaggle's Dogs vs. Cats competition, which is by now essentially a solved problem thanks to transfer learning on top of architectures like ResNet. But the dataset is still useful for demonstrating some powerful techniques under self-imposed constraints, such as using only a fraction of the data to obtain the same results!

This post illustrates how, with modern techniques, we can obtain near state-of-the-art results on image classification tasks. I'm going to use Kaggle's Dogs vs. Cats dataset to make my points, but the results extend to any similar task, including microscopy images, satellite images, and the like.

The two techniques I'm going to experiment with are:

- Data augmentation
- Pseudo-labeling (a variant of semi-supervised learning)
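To make the first technique concrete before diving in: data augmentation expands a small training set by applying random, label-preserving transformations to each image. Keras provides utilities for this, but here is a minimal, library-free sketch of two common transforms (random horizontal flips and small translations); the function name and the 10% shift range are my own illustrative choices, not a fixed recipe.

```python
import numpy as np

def augment(image, rng):
    """Randomly flip and translate one (H, W, C) image array.

    Returns a new array of the same shape; vacated pixels after
    the shift are filled with zeros.
    """
    out = image
    # Random horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Random shift of up to ~10% of each spatial dimension.
    h, w, _ = out.shape
    dy = int(rng.integers(-h // 10, h // 10 + 1))
    dx = int(rng.integers(-w // 10, w // 10 + 1))
    shifted = np.zeros_like(out)
    # Copy the overlapping region between source and destination.
    shifted[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)] = \
        out[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return shifted
```

Calling `augment` on the same image repeatedly yields slightly different training examples each epoch, which is what lets a small dataset behave like a larger one.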

I'm using Keras with a TensorFlow backend, but the same results can be obtained with any modern deep learning library, since solid implementations of the techniques above are readily available.
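Pseudo-labeling, by contrast, has no single canonical library API, so it helps to see the core selection step in isolation. The sketch below assumes a model trained on the small labeled set has already produced class probabilities for the unlabeled images; the function name and the 0.95 confidence threshold are my own illustrative choices.

```python
import numpy as np

def pseudo_label(probs, threshold=0.95):
    """Pick confident predictions on unlabeled data to use as pseudo-labels.

    `probs` is an (N, num_classes) array of predicted class probabilities
    for the unlabeled set. Returns (indices, labels) for the examples whose
    top predicted probability meets `threshold`; these pairs can then be
    appended to the labeled set for another round of training.
    """
    confidence = probs.max(axis=1)          # top probability per example
    keep = np.where(confidence >= threshold)[0]
    labels = probs[keep].argmax(axis=1)     # predicted class for kept rows
    return keep, labels
```

For example, with predictions `[[0.98, 0.02], [0.6, 0.4], [0.01, 0.99]]`, only the first and third examples clear the threshold, so they are added back with labels 0 and 1 while the uncertain middle example is left out.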