Phase 2 — Training a Model on a GPU Machine

Step 3. Finding a Pretrained Model for Transfer Learning:

You can read more about this at medium.com/nanonets/nanonets-how-to-use-deep-learning-when-you-have-limited-data-f68c0b512cab. You need a pretrained model so you can reduce the amount of data required for training. Without one, you might need a few hundred thousand images to train the model.

You can find a bunch of pretrained models here

Step 4. Training on a GPU (cloud service like AWS/GCP etc or your own GPU Machine):

Docker Image

The process of training a model is unnecessarily difficult. To simplify it, we created a Docker image that makes training easy.

To start training the model you can run:

sudo nvidia-docker run -p 8000:8000 -v `pwd`:/data docker.nanonets.com/pi_training -m train -a ssd_mobilenet_v1_coco -e ssd_mobilenet_v1_coco_0 -p '{"batch_size":8,"learning_rate":0.003}'

The docker image has a run.sh script that can be called with the following parameters

run.sh [-m mode] [-a architecture] [-h] [-e experiment_id] [-c checkpoint] [-p hyperparameters]

-h: display this help and exit

-m mode: should be either `train` or `export`

-a architecture: the model architecture to train, e.g. `ssd_mobilenet_v1_coco`

-p hyperparameters: key-value pairs of hyperparameters as a JSON string

-e experiment_id: used as a path inside the data folder to run the current experiment

-c checkpoint: applicable when mode is `export`; specifies the checkpoint to use for export
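The -p flag expects the hyperparameters as a JSON string (as in the docker run example above). A small Python sketch of composing that string with json.dumps, rather than hand-quoting it:

```python
import json

# Build the hyperparameter string for the -p flag. Using json.dumps
# avoids quoting mistakes; the values match the example command above.
params = {"batch_size": 8, "learning_rate": 0.003}
flag = json.dumps(params)
print(flag)  # {"batch_size": 8, "learning_rate": 0.003}
```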

You can find more details at:

To train a model you need to select the right hyperparameters.

Finding the right parameters

The art of “Deep Learning” involves a little trial and error to figure out which parameters give the highest accuracy for your model. There is some level of black magic associated with this, along with a little bit of theory. This is a great resource for finding the right parameters.
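That trial-and-error loop can be sketched as a simple grid search. The grids and the scoring function below are made-up placeholders for illustration, not recommendations; in practice the score would come from a real training run:

```python
import itertools

# Candidate values to try (illustrative only).
batch_sizes = [4, 8, 16]
learning_rates = [0.0003, 0.003, 0.03]

def evaluate(batch_size, learning_rate):
    # Stand-in for a real training run that would return validation
    # accuracy; this dummy score just peaks at (8, 0.003).
    return 1.0 - abs(learning_rate - 0.003) - abs(batch_size - 8) / 100.0

# Try every combination and keep the best-scoring one.
best_bs, best_lr = max(itertools.product(batch_sizes, learning_rates),
                       key=lambda p: evaluate(*p))
print(best_bs, best_lr)
```

Grid search is exhaustive and expensive; with many hyperparameters, random search over the same ranges is usually a better use of the same budget.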

Quantize Model (make it smaller to fit on a small device like the Raspberry Pi or Mobile)

Small devices like mobile phones and the Raspberry Pi have very little memory and computation power.

Training neural networks is done by applying many tiny nudges to the weights, and these small increments typically need floating point precision to work (though there are research efforts to use quantized representations here too).

Taking a pre-trained model and running inference is very different. One of the magical qualities of Deep Neural Networks is that they tend to cope very well with high levels of noise in their inputs.

Why Quantize?

Neural network models can take up a lot of space on disk, with the original AlexNet being over 200 MB in float format for example. Almost all of that size is taken up with the weights for the neural connections, since there are often many millions of these in a single model.

The nodes and weights of a neural network are originally stored as 32-bit floating-point numbers. The simplest motivation for quantization is to shrink file sizes by storing the min and max for each layer, and then compressing each float value to an eight-bit integer. This reduces file size by about 75%.
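The scheme can be illustrated in a few lines of plain Python. This is a sketch of the idea only, not TensorFlow's actual implementation:

```python
# Linear 8-bit quantization of a layer's float weights using the
# layer's min and max, plus the matching dequantization.
def quantize(weights):
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # guard against a constant layer
    codes = [round((w - lo) / scale) for w in weights]  # ints in [0, 255]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return [lo + c * scale for c in codes]

weights = [-0.5, 0.0, 0.5, 1.0]
q, lo, scale = quantize(weights)
approx = dequantize(q, lo, scale)
print(q)       # [0, 85, 170, 255]
print(approx)  # close to the original weights
# Each weight now needs 8 bits instead of 32: a 75% size reduction.
```

The reconstruction error per weight is at most half the scale step, which is the "noise" that, as noted above, networks tolerate well at inference time.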

Code for Quantization:

curl -L "https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz" | tar -C tensorflow/examples/label_image/data -xz

bazel build tensorflow/tools/graph_transforms:transform_graph

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
  --in_graph=tensorflow/examples/label_image/data/inception_v3_2016_08_28_frozen.pb \
  --out_graph=/tmp/quantized_graph.pb \
  --inputs=input \
  --outputs=InceptionV3/Predictions/Reshape_1 \
  --transforms='add_default_attributes strip_unused_nodes(type=float, shape="1,299,299,3")
    remove_nodes(op=Identity, op=CheckNumerics) fold_constants(ignore_errors=true)
    fold_batch_norms fold_old_batch_norms quantize_weights quantize_nodes
    strip_unused_nodes sort_by_execution_order'