A neural network takes inputs and processes them in hidden layers using weights. These weights are adjusted during training, and the network then predicts the output. The weights are adjusted so the network finds patterns and makes better predictions; the user does not need to specify any pattern, the network learns it on its own.

PyTorch is a framework for building and training neural networks, implemented in Python. In this tutorial I am using the Fashion-MNIST dataset, which consists of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28×28 grayscale image, associated with a label from one of 10 classes.

Here’s an example of how the data looks (each class takes three rows):

First, let’s load the dataset. It is provided through the torchvision package. The code below downloads the Fashion-MNIST dataset, then creates the training and test datasets for us.

import torch
from torchvision import datasets, transforms

# Fashion-MNIST images are single-channel, so Normalize takes one mean and one std
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

trainset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

testset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)

trainloader will load the images from the training set in batches of 64 images each; similarly, testloader will load the images from the test set in batches of 64.
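To see the batch shapes a DataLoader produces, we can pull a single batch. The sketch below uses a small synthetic stand-in dataset (random tensors with the same shapes as Fashion-MNIST: 1×28×28 images, labels 0–9) so it runs without downloading anything; with the real trainloader above the shapes are the same.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical stand-in for the Fashion-MNIST training set (same shapes)
fake_images = torch.randn(256, 1, 28, 28)
fake_labels = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=64, shuffle=True)

# One batch: 64 images of shape 1x28x28 and 64 labels
images, labels = next(iter(loader))
print(images.shape)   # torch.Size([64, 1, 28, 28])
print(labels.shape)   # torch.Size([64])
```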

Building the network:

In the dataset each image is 28×28, a total of 784 pixels, and there are 10 classes. Let’s include two hidden layers with ReLU activation functions, and return the log-softmax output from the forward pass.

from torch import nn

# Network architecture: 784 -> 128 -> 64 -> 10
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))

Train the Network:

In order to train the neural network you need to define a loss function and an optimizer.

The loss function compares the predicted values with the actual values.
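Concretely, the NLLLoss we use later is just the mean of the negative log-probability that the network assigned to each true class. A minimal sketch with made-up log probabilities (the random inputs here are illustrative, not part of the tutorial's model):

```python
import torch
from torch import nn

# Fake network output: log probabilities for a batch of 4 examples, 10 classes
log_probs = torch.log_softmax(torch.randn(4, 10), dim=1)
targets = torch.tensor([3, 0, 7, 1])  # hypothetical true classes

loss = nn.NLLLoss()(log_probs, targets)

# Equivalent manual computation: pick out the log probability of each
# true class, negate it, and average over the batch
manual = -log_probs[range(4), targets].mean()
```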

The optimizer minimizes the loss using a learning rate. The learning rate determines how fast the optimal weights for the model are found. A smaller learning rate may lead to more accurate weights (up to a certain point), but computing them takes longer.
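Under the hood, every optimizer step nudges each weight against its gradient, scaled by the learning rate. A minimal sketch of one plain gradient-descent step on a toy one-parameter loss (the optimizers PyTorch ships, like Adam, add more machinery on top of this idea):

```python
import torch

w = torch.tensor([2.0], requires_grad=True)
loss = (w - 1.0) ** 2          # toy loss with its minimum at w = 1
loss.backward()                # dloss/dw = 2*(w - 1) = 2.0 at w = 2

lr = 0.1
with torch.no_grad():
    w -= lr * w.grad           # one step: 2.0 - 0.1 * 2.0 = 1.8, closer to 1
```

A larger lr would jump toward the minimum faster but can overshoot it; a smaller lr takes more steps.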

In this tutorial I am using the NLLLoss function and the Adam optimizer.

from torch import optim

# Defining the criterion and optimizer
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.005)

Training the Model:

During training we make a forward pass through the network and calculate the log probabilities [remember, the output of the network is log probabilities because we use LogSoftmax as the activation function in the output layer].
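To see what "log probabilities" means in practice: exponentiating the LogSoftmax output recovers ordinary probabilities, and each row sums to 1. A small sketch with random scores standing in for the network's last linear layer:

```python
import torch
from torch import nn

logits = torch.randn(2, 10)                # hypothetical raw scores for 2 examples
log_ps = nn.LogSoftmax(dim=1)(logits)      # what our model's output layer emits
ps = torch.exp(log_ps)                     # back to ordinary probabilities
print(ps.sum(dim=1))                       # each row sums to 1
```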

Use these log probabilities to calculate the loss. Perform a backward pass through the network to calculate the gradients. Take a step with the optimizer to update the weights.

The number of epochs is the number of times the model cycles through the data. The more epochs we run, the more the model improves, up to a certain point. After that point, the model stops improving with each epoch. In addition, the more epochs, the longer the model takes to run.

The following code trains the neural network:

# Training the network
epochs = 5
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        images = images.view(images.shape[0], -1)  # flatten to (batch, 784)
        optimizer.zero_grad()
        log_ps = model(images)            # forward pass
        loss = criterion(log_ps, labels)
        loss.backward()                   # backward pass: compute gradients
        optimizer.step()                  # update the weights
        running_loss += loss.item()
    print(f'Training loss: {running_loss/len(trainloader)}')

Testing the Model:

To test the model, we iterate through the test set, calculate the predicted class probabilities, compare the most likely class with the actual label, and compute the accuracy.

test_loss = 0
accuracy = 0
with torch.no_grad():                     # gradients aren't needed for evaluation
    for images, labels in testloader:
        images = images.view(images.shape[0], -1)
        log_ps = model(images)
        test_loss += criterion(log_ps, labels).item()
        ps = torch.exp(log_ps)            # log probabilities -> probabilities
        top_p, top_class = ps.topk(1, dim=1)
        equals = top_class == labels.view(*top_class.shape)
        accuracy += torch.mean(equals.type(torch.FloatTensor))

This evaluation should be done after each epoch. The code below runs training and testing at every epoch.

epochs = 30
train_losses, test_losses = [], []
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        optimizer.zero_grad()
        images = images.view(images.shape[0], -1)
        log_ps = model(images)
        loss = criterion(log_ps, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    test_loss = 0
    accuracy = 0
    # Turn off gradients for validation, saves memory and computation
    with torch.no_grad():
        for images, labels in testloader:
            images = images.view(images.shape[0], -1)
            log_ps = model(images)
            test_loss += criterion(log_ps, labels).item()
            ps = torch.exp(log_ps)
            top_p, top_class = ps.topk(1, dim=1)
            equals = top_class == labels.view(*top_class.shape)
            accuracy += torch.mean(equals.type(torch.FloatTensor))

    train_losses.append(running_loss/len(trainloader))
    test_losses.append(test_loss/len(testloader))
    print("Epoch: {}/{}.. ".format(e+1, epochs),
          "Training Loss: {:.3f}.. ".format(running_loss/len(trainloader)),
          "Test Loss: {:.3f}.. ".format(test_loss/len(testloader)),
          "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))

Output for the above code:

Epoch: 1/30.. Training Loss: 0.396.. Test Loss: 0.429.. Test Accuracy: 0.845
Epoch: 2/30.. Training Loss: 0.354.. Test Loss: 0.407.. Test Accuracy: 0.850
Epoch: 3/30.. Training Loss: 0.333.. Test Loss: 0.373.. Test Accuracy: 0.864
Epoch: 4/30.. Training Loss: 0.318.. Test Loss: 0.390.. Test Accuracy: 0.864
Epoch: 5/30.. Training Loss: 0.305.. Test Loss: 0.373.. Test Accuracy: 0.873
Epoch: 6/30.. Training Loss: 0.294.. Test Loss: 0.369.. Test Accuracy: 0.872
Epoch: 7/30.. Training Loss: 0.284.. Test Loss: 0.379.. Test Accuracy: 0.869
Epoch: 8/30.. Training Loss: 0.282.. Test Loss: 0.356.. Test Accuracy: 0.878
Epoch: 9/30.. Training Loss: 0.270.. Test Loss: 0.384.. Test Accuracy: 0.866
Epoch: 10/30.. Training Loss: 0.265.. Test Loss: 0.372.. Test Accuracy: 0.878
Epoch: 11/30.. Training Loss: 0.261.. Test Loss: 0.389.. Test Accuracy: 0.868
Epoch: 12/30.. Training Loss: 0.253.. Test Loss: 0.366.. Test Accuracy: 0.877
Epoch: 13/30.. Training Loss: 0.248.. Test Loss: 0.352.. Test Accuracy: 0.879
Epoch: 14/30.. Training Loss: 0.243.. Test Loss: 0.366.. Test Accuracy: 0.875
Epoch: 15/30.. Training Loss: 0.236.. Test Loss: 0.363.. Test Accuracy: 0.877
Epoch: 16/30.. Training Loss: 0.235.. Test Loss: 0.382.. Test Accuracy: 0.881
Epoch: 17/30.. Training Loss: 0.231.. Test Loss: 0.385.. Test Accuracy: 0.874
Epoch: 18/30.. Training Loss: 0.229.. Test Loss: 0.380.. Test Accuracy: 0.878
Epoch: 19/30.. Training Loss: 0.221.. Test Loss: 0.396.. Test Accuracy: 0.879
Epoch: 20/30.. Training Loss: 0.220.. Test Loss: 0.384.. Test Accuracy: 0.881
Epoch: 21/30.. Training Loss: 0.214.. Test Loss: 0.388.. Test Accuracy: 0.877
Epoch: 22/30.. Training Loss: 0.211.. Test Loss: 0.385.. Test Accuracy: 0.878
Epoch: 23/30.. Training Loss: 0.209.. Test Loss: 0.413.. Test Accuracy: 0.878
Epoch: 24/30.. Training Loss: 0.208.. Test Loss: 0.387.. Test Accuracy: 0.881
Epoch: 25/30.. Training Loss: 0.205.. Test Loss: 0.410.. Test Accuracy: 0.878
Epoch: 26/30.. Training Loss: 0.200.. Test Loss: 0.415.. Test Accuracy: 0.882
Epoch: 27/30.. Training Loss: 0.200.. Test Loss: 0.404.. Test Accuracy: 0.872
Epoch: 28/30.. Training Loss: 0.199.. Test Loss: 0.402.. Test Accuracy: 0.881
Epoch: 29/30.. Training Loss: 0.194.. Test Loss: 0.427.. Test Accuracy: 0.877
Epoch: 30/30.. Training Loss: 0.193.. Test Loss: 0.426.. Test Accuracy: 0.878

Congrats! We have built a neural network using PyTorch! Though we achieved a moderate test accuracy of 87.8% [not so great], the model still suffers from over-fitting: notice that the test loss starts rising while the training loss keeps falling. I will explain how to deal with over-fitting in my next article.

You can download the code from my GitHub repo.