For newbies, two quick definitions:

Object Detection: finding an object in an image. For example, your camera app can detect faces; here the face is the object.

Annotating/Labeling images: suppose you want to train a machine learning model that detects whether people in an image are wearing helmets. First you have to show the algorithm what a helmet looks like; only when it has seen a large number of such examples can it find a helmet in a new image. Annotation is the process of marking the part of the image where the object (the helmet) is present.

Preparing a dataset has always been hard, and for object detection, labeling/annotating the objects in each image is harder still: a person has to go image by image and manually annotate every object one by one, which takes a lot of time and resources and is error prone.

In some situations the dataset is simply very small. For example, if you want to train a detector for the Chrome dinosaur game actors (like the T-Rex, cactus, dragon, etc.), there are very few distinct images (fewer than 10), so if you train a model on those 10 images alone, it will definitely not perform well.

So there are two problems:

1. How to increase the number of images from a small set of images
2. How to easily label/annotate objects in a large number of images

Solution

From a single object we can generate multiple training examples just by:

- Randomly skewing the object (distorting its width/height ratio)
- Resizing the object
- Randomly moving the object within the image

By applying these transformations in random combinations we can create thousands of images, and if we include multiple objects in the same image the randomness increases further and we can create millions of images. Everything can be done programmatically.
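The three transformations above can be sketched as a single placement function. This is a minimal illustration, not Robo Annotator's actual code: the function and parameter names here are made up, and it only computes where an object would land and what its bounding box would be, leaving the pixel drawing aside.

```javascript
// Illustrative sketch: pick a random resize, optional skew, and random
// position for an object of base size (baseW x baseH) inside an image
// of size (imgW x imgH). Names are hypothetical, not from the repo.
function randomPlacement(imgW, imgH, baseW, baseH, opts) {
  const { minW, maxW, skew } = opts;
  // Resize: choose a random target width, keep the aspect ratio
  const w = minW + Math.random() * (maxW - minW);
  let h = (w / baseW) * baseH;
  // Skew: optionally distort the width/height ratio by up to +/-20%
  if (skew) h *= 0.8 + Math.random() * 0.4;
  // Move: choose a random top-left corner that keeps the object in frame
  const x = Math.random() * (imgW - w);
  const y = Math.random() * (imgH - h);
  // The resulting bounding box doubles as the annotation for this object
  return {
    xmin: Math.round(x), ymin: Math.round(y),
    xmax: Math.round(x + w), ymax: Math.round(y + h),
  };
}
```

Because the generator itself chose the position and size, the bounding box is known exactly, with no manual labeling step.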

Annotating/labeling the images is nothing but recording the locations of the objects. Since we are placing every object programmatically, we already know the location of each object in each image, so we can also generate a file containing all those locations.
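To make that concrete, here is a minimal sketch of turning known bounding boxes into a Pascal VOC annotation file (the format Robo Annotator emits). The function name and the simplified XML layout are illustrative; a full VOC file has a few more fields.

```javascript
// Build a (simplified) Pascal VOC annotation string for one generated
// image, given the bounding boxes the generator already knows.
function toPascalVoc(filename, width, height, objects) {
  const objXml = objects.map(o => `
  <object>
    <name>${o.name}</name>
    <bndbox>
      <xmin>${o.xmin}</xmin>
      <ymin>${o.ymin}</ymin>
      <xmax>${o.xmax}</xmax>
      <ymax>${o.ymax}</ymax>
    </bndbox>
  </object>`).join('');
  return `<annotation>
  <filename>${filename}</filename>
  <size><width>${width}</width><height>${height}</height><depth>3</depth></size>${objXml}
</annotation>`;
}
```

Writing one such file per generated image gives a dataset that detection training pipelines can consume directly.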

Robo Annotator

A simple application, written in HTML, CSS, and JavaScript, which takes a list of objects as input and outputs a number of images with those objects randomly positioned and skewed, plus an XML file (in Pascal VOC format) for each image containing the locations of the objects. These files can be used directly to train any model.

Robo Annotator generating images

How to use Robo Annotator

1. Install Node.js.
2. Check out or download this repository to your local machine.
3. Run npm install to install all the dependencies locally.
4. To start, run npm start. You will be able to generate the T-Rex dataset; play around with this.
5. To add your own objects: crop your objects and remove the unnecessary parts. If your objects don't have a white (#fff) background, clip the background out of the object; you can use pixlr.com, Photoshop, or any other tool.
6. Once you have a list of objects prepared, open config.js. Here you can set all the parameters, such as: a) the dimensions of the images, b) the number of images to be generated, c) the maximum and minimum width of the objects, d) whether or not the objects should be skewed (distorting the object width/height ratio), etc.
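As a rough illustration of the kinds of settings listed in the last step, a config could look like the sketch below. These property names are hypothetical and NOT taken from the repository's actual config.js; check that file for the real parameter names.

```javascript
// Hypothetical config shape illustrating the four settings above;
// the real names live in the repo's config.js and may differ.
module.exports = {
  imageWidth: 600,      // a) dimensions of the generated images
  imageHeight: 150,
  imageCount: 1000,     // b) number of images to generate
  objectMinWidth: 20,   // c) minimum width of each placed object
  objectMaxWidth: 60,   //    maximum width of each placed object
  skewObjects: true,    // d) randomly distort the width/height ratio or not
};
```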

Robo Annotator's image location files (XML) are in Pascal VOC format, which is fully compatible with labelImg (a popular image annotation tool).

To get more updates on this, follow me on twitter @PrnySarkar

Thanks for reading.