The history of deep learning is full of bold, simple ideas. Some become genre-defining successes, like GANs; others fade out of fashion, like deep belief networks.

I tried out a counter-intuitive but simple idea of my own, inspired by how children play imagination games. I got some interesting though inconclusive results, which I want to share.

Experiment details:

DL tool: TensorFlow 1.11 with the Keras API

Environment: Google Colaboratory instance with a GPU accelerator

Dataset: CIFAR-10 (32×32×3 images; 10 classes; 50,000 training samples; 10,000 validation samples)

Training batch size: 100; Learning rate: 0.001; Epochs: 112

Loss function: Categorical cross-entropy

Optimizer: Adam

Models: MobileNet V1 (alpha = 0.75; 1.84M parameters) and MobileNet V2 (alpha = 0.75; 1.39M parameters)

Note that neither model uses pre-trained weights.
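For reference, the setup above can be sketched roughly as follows. This is a hedged sketch, not the original training script: it uses the modern `tf.keras` API (in TF 1.11 the calls are essentially the same, though the Adam learning-rate keyword was `lr`), and the `fit` call is left commented out since 112 epochs on CIFAR-10 is a long run.

```python
import tensorflow as tf
from tensorflow.keras.applications import MobileNet

# MobileNet V1 with width multiplier alpha=0.75, built for CIFAR-10
# inputs (32x32x3, 10 classes) and trained from scratch (weights=None,
# i.e. no pre-trained weights, as noted above).
model = MobileNet(input_shape=(32, 32, 3), alpha=0.75,
                  weights=None, classes=10)

# Adam optimizer with learning rate 0.001, categorical cross-entropy loss.
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Data loading and training as described above (batch size 100, 112 epochs);
# cifar10.load_data() is the standard Keras CIFAR-10 loader:
# (x_train, y_train), (x_val, y_val) = tf.keras.datasets.cifar10.load_data()
# model.fit(x_train / 255.0, tf.keras.utils.to_categorical(y_train, 10),
#           batch_size=100, epochs=112,
#           validation_data=(x_val / 255.0,
#                            tf.keras.utils.to_categorical(y_val, 10)))
```

Swapping `MobileNet` for `tf.keras.applications.MobileNetV2` with the same arguments gives the V2 configuration.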

Training and validation profiles

Here, ‘normal’ refers to the standard training schedule, while ‘imag’ refers to the training schedule I am proposing.

MobileNet V1