Making your first Convolutional Neural Network — Part 4: Improving the network through image augmentation
qwertpi · Feb 23, 2019 · 5 min read

Introduction

What we are doing

Last time we looked at how to make predictions, and you likely found that your model was not very accurate due to the lack of training data. In this post, I will show you how to use image augmentation to create more training data and therefore hopefully improve model performance. You can find the code we will be writing in this post here. In the next post, I will show you a nice trick of mine for training on the new, larger dataset without loading all the data at once, thereby preventing your computer from crashing.

What actually is image augmentation?

The problem of not having enough data is a common one in machine learning, and in image processing we are quite lucky in that we can use image augmentation to generate new data from our existing data. Although in our case we could simply download more data, it is faster to perform image augmentation than to download more images. Image augmentation can also be implemented in ways that use less storage space than downloading new images: augmentation can be performed on images at train time and the augmented images deleted once they have been used, i.e. generating images "on the fly".

Image augmentation is when operations such as cropping, blurring, brightness adjustment and the addition of noise are applied to images in order to produce new images that are similar to the originals but not identical.

As well as providing us with more data "for free", if used correctly, image augmentation can actually make our model work better on real-world data. This is because real-world images sometimes have objects partially out of frame (which can be replicated in augmentation by cropping), poor lighting (which can be replicated in augmentation by brightness adjustment) and noise.

Image augmentation can also force the classifier to become generally more resilient. This is because by removing certain data (cropping) and replacing some pixels with random values (noise) we force our model to expand its definition of cat and dog, as not every feature will be visible — e.g. an eye may be obscured by noise, forcing it to learn about one-eyed animals. Finally, image augmentation can be seen as acting in a similar way to the dropout layer, which randomly disables neurons during each training pass in order to force the other neurons to learn as well instead of the network becoming overly reliant on only a few neurons. Image augmentation does a similar thing by forcing the model to not become reliant on certain features, which is even more useful for features that may not always be in images (if, for example, the animal is facing away from the camera).

The code!

Ensure you have the library imgaug installed, then create a folder called augmented with subfolders called cats and dogs to store the augmented images in, and then we can get to writing code.

Imports

1–5 We import the libraries we will be using

6 We initialize our total variable to 0. I'm not sure why I put this here, but I've archived the GitHub repo now and can't be bothered to unarchive it just to move a line of code.

Image loading subroutine

8 We create a subroutine called load_images with the argument directories, which defaults to a blank list if not passed

9 We create a blank list called imgs to store our images in

10 We loop over each directory in the directories list

11 We loop over each file in a list of all the files in the directory directory

12 We load the image in the directory with the filename file

13 We turn the image into a numpy array

14 We append the image to our imgs list

15 We return the imgs list
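Putting lines 8–15 together, the subroutine might look like the sketch below (I'm assuming PIL is used for loading, which the later save step suggests; the blank-list default is kept as described even though a None default is more usual in Python):

```python
import os

import numpy as np
from PIL import Image


def load_images(directories=[]):
    """Load every image in each of the given directories as a numpy array."""
    imgs = []  # blank list to store our images in
    for directory in directories:  # each directory in the list
        for file in os.listdir(directory):  # each file in that directory
            img = Image.open(os.path.join(directory, file))  # load the image
            imgs.append(np.array(img))  # turn it into a numpy array and store it
    return imgs
```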

Defining our augmentations

17 We create an imgaug sequential augmenter which will contain a list of augmentations

18 We multiply every pixel in every image by a random number between 0.5 and 1.5 (a new random number is picked for every image) — this is brightness adjustment

19 We add noise (I've no idea what most of the numbers do, other than that changing 0.075 changes the severity of the noise and that per_channel means noise is added individually to each colour channel, resulting in RGB noise rather than black-and-white noise)

20 We crop a random amount between 0 and 30% off each side of the image (a new random amount is generated for each side)

21 We flip half of the images from left to right

22 We apply all of the above in a random order (i.e. a different order is used for every image)

Augmenting the cat images

24–25 We load our images

27 We are going to create 5 sets of augmented images

28 We augment all our cat images

29 We loop over every image in our augmented images

30 We convert the numpy array to an image

31 We save our image to the folder augmented/cats with the file name aug_ followed by the value of total zero-padded to 8 digits, e.g. the first file is aug_00000000.png

32 We increment the total by 1

Augmenting the dog images

33 We reset total to 0 as we are now dealing with the dog images so want our images to start at aug_00000000.png again

34–39 The same as the cat images

Conclusion

You now have a lot more images to train with. Next time I’ll show you how to load large quantities of images without crashing your computer so that you can perform training with these augmented images.

Please share this post on social media if you enjoyed it or found it useful. If there are any inaccuracies in this article, please let me know. Please feel free to leave feedback in the comments so I know how to improve moving forward. If you have any questions or problems with the code, tell me in the comments.