The following problems appeared in the assignments in the Udacity course Deep Learning (by Google). The descriptions of the problems are taken from the assignments (continued from the last post).

Classifying the letters of the notMNIST dataset with a deep network

Here is what some sample images from the dataset look like:

Let’s try to get the best performance using a multi-layer model! (The best reported test accuracy using a deep network is 97.1%).

One avenue you can explore is to add multiple layers.

Another one is to use learning rate decay.
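
One common form of learning rate decay is exponential decay, where the rate is multiplied by a fixed factor every so many steps. The sketch below is a minimal plain-Python illustration. The function name `exponential_decay` and the values used (initial rate 0.5, factor 0.9 every 1000 steps) are assumptions for illustration, not the settings used in the experiments in this post.

```python
def exponential_decay(initial_rate, step, decay_steps, decay_rate, staircase=False):
    """Return the learning rate to use at a given SGD step."""
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")
    # With staircase=True the rate drops in discrete jumps instead of smoothly.
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_rate * decay_rate ** exponent


# Hypothetical schedule: start at 0.5 and multiply by 0.9 every 1000 steps.
for step in (0, 1000, 2000, 3000):
    print(step, round(exponential_decay(0.5, step, 1000, 0.9), 4))
```

A smaller rate late in training lets SGD take finer steps once the loss has flattened out.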

Learning L2-Regularized Deep Neural Network with SGD

The following figure recapitulates the neural network with 3 hidden layers: the first with 2048 nodes, the second with 512 nodes and the third with 128 nodes, each followed by a ReLU activation. L2 regularization is applied to the loss function for the weights learnt at the input and the hidden layers, with coefficients λ1, λ2, λ3 and λ4, respectively.
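
To make the regularized loss concrete, here is a minimal sketch in plain Python of how one L2 penalty per weight matrix can be added to the data loss. It assumes the penalty is half the sum of squared weights (the convention used by TensorFlow's `tf.nn.l2_loss`); the helper names and the tiny weight matrices are hypothetical stand-ins for the four real layers.

```python
def l2_penalty(weights):
    """Half the sum of squared entries of a weight matrix given as a list of rows."""
    return 0.5 * sum(w * w for row in weights for w in row)


def regularized_loss(data_loss, weight_matrices, lambdas):
    """Add one weighted L2 penalty per weight matrix to the data loss."""
    if len(weight_matrices) != len(lambdas):
        raise ValueError("need exactly one lambda per weight matrix")
    penalty = sum(lam * l2_penalty(w) for w, lam in zip(weight_matrices, lambdas))
    return data_loss + penalty


# Tiny hypothetical weight matrices standing in for the four layers.
w1 = [[1.0, -2.0], [0.5, 0.0]]
w2 = [[3.0]]
w3 = [[0.0, 1.0]]
w4 = [[2.0], [2.0]]

# Data loss of 1.0 and lambda = 0.01 for every layer, both chosen for illustration.
total = regularized_loss(1.0, [w1, w2, w3, w4], [0.01] * 4)
print(total)
```

Larger λ values push the weights toward zero more strongly, trading training fit for smoother learnt features.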

The next 3 animations visualize the weights learnt for 400 randomly selected nodes from hidden layer 1 (out of 2048 nodes), then for another 400 randomly selected nodes from hidden layer 2 (out of 512 nodes), and finally for all 128 nodes from hidden layer 3, at different steps of SGD with the L2-regularized loss function (with λ1 = λ2 = λ3 = λ4 = 0.01). As can be seen below, the weights learnt gradually capture (as the SGD step increases) the different features of the letters at the corresponding output neurons.





Results with SGD

Initialized

Minibatch loss at step 0: 4.638808

Minibatch accuracy: 7.8%

Validation accuracy: 27.6%

Minibatch loss at step 500: 1.906724

Minibatch accuracy: 86.7%

Validation accuracy: 86.3%

Minibatch loss at step 1000: 1.333355

Minibatch accuracy: 87.5%

Validation accuracy: 86.9%

Minibatch loss at step 1500: 1.056811

Minibatch accuracy: 84.4%

Validation accuracy: 87.3%

Minibatch loss at step 2000: 0.633034

Minibatch accuracy: 93.8%

Validation accuracy: 87.5%

Minibatch loss at step 2500: 0.696114

Minibatch accuracy: 85.2%

Validation accuracy: 87.5%

Minibatch loss at step 3000: 0.737464

Minibatch accuracy: 86.7%

Validation accuracy: 88.3%

Test accuracy: 93.6%

The experiments above used a batch size of 128, 3001 iterations, a dropout rate of 0.8 on the training dataset, and learning rate decay. We can tune these hyper-parameters to try to get a better test accuracy.
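
For readers unfamiliar with dropout, the following is a minimal sketch of the "inverted" variant in plain Python. It assumes that 0.8 is the probability of keeping a unit, which the settings above do not state explicitly; the function name `dropout` and the all-ones activations are hypothetical.

```python
import random


def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob
    and rescale the survivors so the expected activation is unchanged."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)            # fixed seed so the run is repeatable
hidden = [1.0] * 10000            # hypothetical hidden-layer activations
dropped = dropout(hidden, 0.8, rng)

kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
mean_activation = sum(dropped) / len(dropped)
print(kept_fraction, mean_activation)
```

Dropout is applied only during training; at validation and test time all units are kept, which the rescaling above makes consistent.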

Convolution Neural Network

Previously we trained fully connected networks to classify notMNIST characters. The goal of this assignment is to make the neural network convolutional.

Let’s build a small network with two convolutional layers, followed by one fully connected layer. Convolutional networks are more expensive computationally, so we’ll limit the depth and the number of fully connected nodes. The figure below shows the simplified architecture of the convolutional neural net.

As shown above, the ConvNet uses:

2 convolution layers, each with a 5×5 kernel, 16 filters, 2×2 strides and SAME padding

1 fully connected hidden layer with 64 nodes

a batch size of 16

1K iterations
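
The effect of the 2×2 strides on the feature map size can be checked with a short calculation. The sketch below assumes the TensorFlow convention for SAME padding (output size = ceil(input size / stride)) and 28×28 input images, which is the notMNIST image size but is not stated in the list above; the helper names are hypothetical.

```python
import math


def same_output_size(size, stride):
    """Spatial output size of a convolution with SAME padding."""
    return math.ceil(size / stride)


def conv_stack_shape(height, width, layers):
    """Trace (height, width, depth) through a list of (num_filters, stride) layers."""
    depth = None
    for num_filters, stride in layers:
        height = same_output_size(height, stride)
        width = same_output_size(width, stride)
        depth = num_filters
    return height, width, depth


# Two convolution layers with 16 filters and stride 2 each, on an assumed 28x28 input.
shape = conv_stack_shape(28, 28, [(16, 2), (16, 2)])
flat = shape[0] * shape[1] * shape[2]
print(shape, flat)
```

Under these assumptions each strided convolution halves the height and width, so the fully connected layer sees a 7×7×16 volume flattened to 784 values.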



Results

Initialized

Minibatch loss at step 0: 3.548937

Minibatch accuracy: 18.8%

Validation accuracy: 10.0%

Minibatch loss at step 50: 1.781176

Minibatch accuracy: 43.8%

Validation accuracy: 64.7%

Minibatch loss at step 100: 0.882739

Minibatch accuracy: 75.0%

Validation accuracy: 69.5%

Minibatch loss at step 150: 0.980598

Minibatch accuracy: 62.5%

Validation accuracy: 74.5%

Minibatch loss at step 200: 0.794144

Minibatch accuracy: 81.2%

Validation accuracy: 77.6%

Minibatch loss at step 250: 1.191971

Minibatch accuracy: 62.5%

Validation accuracy: 79.1%

Minibatch loss at step 300: 0.441911

Minibatch accuracy: 87.5%

Validation accuracy: 80.5%

Minibatch loss at step 350: 0.605005

Minibatch accuracy: 81.2%

Validation accuracy: 79.3%

Minibatch loss at step 400: 1.032123

Minibatch accuracy: 68.8%

Validation accuracy: 81.5%

Minibatch loss at step 450: 0.869944

Minibatch accuracy: 75.0%

Validation accuracy: 82.1%

Minibatch loss at step 500: 0.530418

Minibatch accuracy: 81.2%

Validation accuracy: 81.2%

Minibatch loss at step 550: 0.227771

Minibatch accuracy: 93.8%

Validation accuracy: 81.8%

Minibatch loss at step 600: 0.697444

Minibatch accuracy: 75.0%

Validation accuracy: 82.5%

Minibatch loss at step 650: 0.862341

Minibatch accuracy: 68.8%

Validation accuracy: 83.0%

Minibatch loss at step 700: 0.336292

Minibatch accuracy: 87.5%

Validation accuracy: 81.8%

Minibatch loss at step 750: 0.213392

Minibatch accuracy: 93.8%

Validation accuracy: 82.6%

Minibatch loss at step 800: 0.553639

Minibatch accuracy: 75.0%

Validation accuracy: 83.3%

Minibatch loss at step 850: 0.533049

Minibatch accuracy: 87.5%

Validation accuracy: 81.7%

Minibatch loss at step 900: 0.415935

Minibatch accuracy: 87.5%

Validation accuracy: 83.9%

Minibatch loss at step 950: 0.290436

Minibatch accuracy: 93.8%

Validation accuracy: 84.0%

Minibatch loss at step 1000: 0.400648

Minibatch accuracy: 87.5%

Validation accuracy: 84.0%

Test accuracy: 90.3%

The following figures visualize the feature representations at different layers for the first 16 images for the last batch with SGD during training:

The next animation shows how the features learnt at convolution layer 1 change with iterations.



Convolution Neural Network with Max Pooling

The convolutional model above uses convolutions with stride 2 to reduce the dimensionality. Replace the strides by a max pooling operation of stride 2 and kernel size 2. The figure below shows the simplified architecture of the convolutional neural net with max pooling layers.

As shown above, the ConvNet uses:

2 convolution layers, each with a 5×5 kernel, 16 filters, 1×1 stride, 2×2 max pooling and SAME padding

1 fully connected hidden layer with 64 nodes

a batch size of 16

1K iterations
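
As a small illustration of what max pooling with kernel size 2 and stride 2 does to a single feature map, here is a plain-Python sketch. The function name `max_pool_2x2` and the 4×4 input are hypothetical, and the sketch handles only even-sized inputs, so no padding is needed.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch only handles even height and width")
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# Hypothetical 4x4 feature map; each non-overlapping 2x2 block is reduced to its maximum.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Like a stride-2 convolution, this halves the height and width, but the convolution itself is now computed at every position before the strongest response in each block is kept.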



Results

Initialized

Minibatch loss at step 0: 4.934033

Minibatch accuracy: 6.2%

Validation accuracy: 8.9%

Minibatch loss at step 50: 2.305100

Minibatch accuracy: 6.2%

Validation accuracy: 11.7%

Minibatch loss at step 100: 2.319777

Minibatch accuracy: 0.0%

Validation accuracy: 14.8%

Minibatch loss at step 150: 2.285996

Minibatch accuracy: 18.8%

Validation accuracy: 11.5%

Minibatch loss at step 200: 1.988467

Minibatch accuracy: 25.0%

Validation accuracy: 22.9%

Minibatch loss at step 250: 2.196230

Minibatch accuracy: 12.5%

Validation accuracy: 27.8%

Minibatch loss at step 300: 0.902828

Minibatch accuracy: 68.8%

Validation accuracy: 55.4%

Minibatch loss at step 350: 1.078835

Minibatch accuracy: 62.5%

Validation accuracy: 70.1%

Minibatch loss at step 400: 1.749521

Minibatch accuracy: 62.5%

Validation accuracy: 70.3%

Minibatch loss at step 450: 0.896893

Minibatch accuracy: 75.0%

Validation accuracy: 79.5%

Minibatch loss at step 500: 0.610678

Minibatch accuracy: 81.2%

Validation accuracy: 79.5%

Minibatch loss at step 550: 0.212040

Minibatch accuracy: 93.8%

Validation accuracy: 81.0%

Minibatch loss at step 600: 0.785649

Minibatch accuracy: 75.0%

Validation accuracy: 81.8%

Minibatch loss at step 650: 0.775520

Minibatch accuracy: 68.8%

Validation accuracy: 82.2%

Minibatch loss at step 700: 0.322183

Minibatch accuracy: 93.8%

Validation accuracy: 81.8%

Minibatch loss at step 750: 0.213779

Minibatch accuracy: 100.0%

Validation accuracy: 82.9%

Minibatch loss at step 800: 0.795744

Minibatch accuracy: 62.5%

Validation accuracy: 83.7%

Minibatch loss at step 850: 0.767435

Minibatch accuracy: 87.5%

Validation accuracy: 81.7%

Minibatch loss at step 900: 0.354712

Minibatch accuracy: 87.5%

Validation accuracy: 83.8%

Minibatch loss at step 950: 0.293992

Minibatch accuracy: 93.8%

Validation accuracy: 84.3%

Minibatch loss at step 1000: 0.384624

Minibatch accuracy: 87.5%

Validation accuracy: 84.2%

Test accuracy: 90.5%

As can be seen from the results above, max pooling increased the test accuracy only slightly, from 90.3% to 90.5%.



The following figures visualize the feature representations at different layers for the first 16 images during training with Max Pooling:

So far the convnets we have tried are small, and they did not reach a high enough accuracy on the test dataset. Next we shall make the convnet deeper to increase the test accuracy.

Deep Convolution Neural Network with Max Pooling

Let’s try to get the best performance we can using a convolutional net. Look, for example, at the classic LeNet5 architecture, and consider adding dropout and/or learning rate decay.

Let’s try with a few convnets:

1. The following ConvNet uses:

2 convolution layers (with ReLU), each using a 3×3 kernel, 16 filters, 1×1 stride, 2×2 max pooling and SAME padding

all weights initialized with a truncated normal distribution with sd 0.01

a single fully connected hidden layer with 1024 hidden nodes

a batch size of 128

3K iterations

0.01 (=λ1=λ2) for regularization

no dropout

no learning rate decay
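
The truncated normal initialization mentioned above can be sketched in plain Python as follows. This assumes the usual convention that samples farther than 2 standard deviations from the mean are discarded and redrawn (as TensorFlow's `truncated_normal` does). The function name and the weight count (3×3 kernel × 1 input channel × 16 filters, assuming grayscale input) are illustrative assumptions.

```python
import random


def truncated_normal(n, stddev, rng, mean=0.0):
    """Draw n samples from N(mean, stddev^2), redrawing any sample that
    falls more than 2 standard deviations away from the mean."""
    samples = []
    while len(samples) < n:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2 * stddev:
            samples.append(x)
    return samples


rng = random.Random(42)  # fixed seed so the run is repeatable

# Weights of one hypothetical 3x3 convolution layer: 1 input channel, 16 filters.
weights = truncated_normal(3 * 3 * 1 * 16, 0.01, rng)
print(len(weights), max(abs(w) for w in weights) <= 0.02)
```

Truncation avoids the occasional very large initial weight that a plain normal distribution would produce.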

Results

Minibatch loss at step 0: 2.662903

Minibatch accuracy: 7.8%

Validation accuracy: 10.0%

Minibatch loss at step 500: 2.493813

Minibatch accuracy: 11.7%

Validation accuracy: 10.0%

Minibatch loss at step 1000: 0.848911

Minibatch accuracy: 82.8%

Validation accuracy: 79.6%

Minibatch loss at step 1500: 0.806191

Minibatch accuracy: 79.7%

Validation accuracy: 81.8%

Minibatch loss at step 2000: 0.617905

Minibatch accuracy: 85.9%

Validation accuracy: 84.5%

Minibatch loss at step 2500: 0.594710

Minibatch accuracy: 83.6%

Validation accuracy: 85.7%

Minibatch loss at step 3000: 0.435352

Minibatch accuracy: 91.4%

Validation accuracy: 87.2%

Test accuracy: 93.4%



As we can see, by introducing a couple of convolution layers, the test accuracy increased from 90% (refer to the earlier post) to 93.4% under the same settings.

Here is how the hidden layer weights (400 out of 1024, chosen randomly) change during training. The features no longer clearly resemble the letters, which is expected, since this layer now receives the outputs of the convolution layers rather than the raw pixels.

2. The following ConvNet uses:

2 convolution layers (with ReLU), each using a 3×3 kernel, 32 filters, 1×1 stride, 2×2 max pooling and SAME padding

all weights initialized with a truncated normal distribution with sd 0.1

2 fully connected hidden layers, both with 256 hidden nodes

a batch size of 128

6K iterations

0.7 dropout

learning rate decay starting with 0.1

Results

Minibatch loss at step 0: 9.452210

Minibatch accuracy: 10.2%

Validation accuracy: 9.7%

Minibatch loss at step 500: 0.611396

Minibatch accuracy: 81.2%

Validation accuracy: 81.2%

Minibatch loss at step 1000: 0.442578

Minibatch accuracy: 85.9%

Validation accuracy: 83.3%

Minibatch loss at step 1500: 0.523506

Minibatch accuracy: 83.6%

Validation accuracy: 84.8%

Minibatch loss at step 2000: 0.411259

Minibatch accuracy: 89.8%

Validation accuracy: 85.8%

Minibatch loss at step 2500: 0.507267

Minibatch accuracy: 82.8%

Validation accuracy: 85.9%

Minibatch loss at step 3000: 0.414740

Minibatch accuracy: 89.1%

Validation accuracy: 86.6%

Minibatch loss at step 3500: 0.432177

Minibatch accuracy: 85.2%

Validation accuracy: 87.0%

Minibatch loss at step 4000: 0.501300

Minibatch accuracy: 85.2%

Validation accuracy: 87.1%

Minibatch loss at step 4500: 0.391587

Minibatch accuracy: 89.8%

Validation accuracy: 87.7%

Minibatch loss at step 5000: 0.347674

Minibatch accuracy: 90.6%

Validation accuracy: 88.1%

Minibatch loss at step 5500: 0.259942

Minibatch accuracy: 91.4%

Validation accuracy: 87.8%

Minibatch loss at step 6000: 0.392562

Minibatch accuracy: 85.9%

Validation accuracy: 88.4%

Test accuracy: 94.6%

3. The following ConvNet uses:

3 convolution layers (with ReLU), each using a 5×5 kernel, with 16, 32 and 64 filters, respectively, 1×1 stride, 2×2 max pooling and SAME padding

all weights initialized with a truncated normal distribution with sd 0.1

3 fully connected hidden layers with 256, 128 and 64 hidden nodes, respectively

a batch size of 128

10K iterations

0.7 dropout

learning rate decay starting with 0.1
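
To get a feel for the size of this deeper network, the sketch below counts its trainable parameters in plain Python. It rests on several assumptions that the list above does not state: 28×28 grayscale inputs, 10 output classes, a 2×2 max pooling (stride 2, SAME padding) after every convolution layer, and one bias per filter or node. The function name `count_parameters` is hypothetical.

```python
import math


def count_parameters(input_size, channels, conv_filters, kernel, fc_nodes, classes):
    """Count weights and biases of a conv/pool stack followed by dense layers."""
    total = 0
    size, depth = input_size, channels
    for filters in conv_filters:
        total += kernel * kernel * depth * filters + filters  # conv weights + biases
        size = math.ceil(size / 2)  # 2x2 max pooling with stride 2 and SAME padding
        depth = filters
    flat = size * size * depth      # values fed to the first fully connected layer
    width = flat
    for nodes in list(fc_nodes) + [classes]:
        total += width * nodes + nodes  # dense weights + biases
        width = nodes
    return size, flat, total


size, flat, total = count_parameters(28, 1, [16, 32, 64], 5, [256, 128, 64], 10)
print(size, flat, total)
```

Under these assumptions the spatial size shrinks 28 → 14 → 7 → 4, and most of the parameters sit in the first fully connected layer rather than in the convolutions.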

Results

Minibatch loss at step 0: 6.788681

Minibatch accuracy: 12.5%

Validation accuracy: 9.8%

Minibatch loss at step 500: 0.804718

Minibatch accuracy: 75.8%

Validation accuracy: 74.9%

Minibatch loss at step 1000: 0.464696

Minibatch accuracy: 86.7%

Validation accuracy: 82.8%

Minibatch loss at step 1500: 0.684611

Minibatch accuracy: 80.5%

Validation accuracy: 85.2%

Minibatch loss at step 2000: 0.352865

Minibatch accuracy: 91.4%

Validation accuracy: 85.9%

Minibatch loss at step 2500: 0.505062

Minibatch accuracy: 84.4%

Validation accuracy: 87.3%

Minibatch loss at step 3000: 0.352783

Minibatch accuracy: 87.5%

Validation accuracy: 87.0%

Minibatch loss at step 3500: 0.411505

Minibatch accuracy: 88.3%

Validation accuracy: 87.9%

Minibatch loss at step 4000: 0.457463

Minibatch accuracy: 84.4%

Validation accuracy: 88.1%

Minibatch loss at step 4500: 0.369346

Minibatch accuracy: 89.8%

Validation accuracy: 88.7%

Minibatch loss at step 5000: 0.323142

Minibatch accuracy: 89.8%

Validation accuracy: 88.5%

Minibatch loss at step 5500: 0.245018

Minibatch accuracy: 93.8%

Validation accuracy: 89.0%

Minibatch loss at step 6000: 0.480509

Minibatch accuracy: 85.9%

Validation accuracy: 89.2%

Minibatch loss at step 6500: 0.297886

Minibatch accuracy: 92.2%

Validation accuracy: 89.3%

Minibatch loss at step 7000: 0.309768

Minibatch accuracy: 90.6%

Validation accuracy: 89.3%

Minibatch loss at step 7500: 0.280219

Minibatch accuracy: 92.2%

Validation accuracy: 89.5%

Minibatch loss at step 8000: 0.260540

Minibatch accuracy: 93.8%

Validation accuracy: 89.7%

Minibatch loss at step 8500: 0.345161

Minibatch accuracy: 88.3%

Validation accuracy: 89.6%

Minibatch loss at step 9000: 0.343074

Minibatch accuracy: 87.5%

Validation accuracy: 89.8%

Minibatch loss at step 9500: 0.324757

Minibatch accuracy: 92.2%

Validation accuracy: 89.9%

Minibatch loss at step 10000: 0.513597

Minibatch accuracy: 83.6%

Validation accuracy: 90.0%

Test accuracy: 95.5%

To be continued…