It’s a CSV file with 785 columns:

The first column contains the label. It indicates which one of the 10 possible digits is visible in the image.

The next 784 columns are the pixel intensity values (0..1) for each pixel in the image, counting from left to right and top to bottom.

Let’s get started. Here’s how to set up a new console project in .NET Core:

$ dotnet new console -o Mnist

$ cd Mnist

Next, I need to install required packages:

$ dotnet add package Microsoft.ML

$ dotnet add package CNTK.GPU

$ dotnet add package XPlot.Plotly

$ dotnet add package FSharp.Core

Microsoft.ML is Microsoft’s machine learning package. We will use it to load and process the data from the dataset.

The CNTK.GPU library is Microsoft’s Cognitive Toolkit that can train and run deep neural networks.

And XPlot.Plotly is an awesome plotting library based on Plotly. The library is designed for F#, so we also need to pull in the FSharp.Core library.

The CNTK.GPU package will train and run deep neural networks using your GPU. You’ll need an NVIDIA GPU and CUDA graphics drivers for this to work.

If you don’t have an NVIDIA GPU or suitable drivers, the library will fall back to the CPU instead. This will still work, but training neural networks will take significantly longer.

CNTK is a low-level tensor library for building, training, and running deep neural networks. The code to build a deep neural network can get a bit verbose, so I’ve developed a little wrapper called CNTKUtil that will help you write code faster.

You can download the CNTKUtil files and save them in a new CNTKUtil folder at the same level as your project folder.

Then make sure you’re in the console project folder and create a project reference like this:

$ dotnet add reference ..\CNTKUtil\CNTKUtil.csproj

Now you are ready to start writing code. Edit the Program.cs file with Visual Studio Code and add the following code:

The Digit class holds one single MNIST digit image. Note how the PixelValues field is tagged with a VectorType attribute. This tells ML.NET to combine the 784 individual pixel columns into a single vector value. Also note the LoadColumn attribute that tells ML.NET to load the first CSV column into the Number field and all subsequent columns into the PixelValues field.

We also have a GetFeatures method that returns the pixel values as a float array, and a GetLabel method that returns a one-hot encoded float array of the digit value. For each digit image only a single element in the float array will contain a 1 value to indicate the numerical value of that digit.

The features are the pixels in the image that we will use to train the neural network on, and the label is the digit value that we’re trying to predict. So here we’re training on all 784 digit pixels in the dataset to predict the value of the digit.
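Here’s a sketch of what the Digit class could look like, based on the description above (the one-hot encoding detail in GetLabel is my reading of the text, not confirmed source code):

```csharp
using Microsoft.ML.Data;

/// <summary>One MNIST digit image loaded from the CSV file.</summary>
class Digit
{
    // CSV column 0 holds the label: the digit value 0..9
    [LoadColumn(0)]
    public float Number;

    // CSV columns 1..784 hold the pixel intensities; the VectorType
    // attribute tells ML.NET to combine them into a single vector
    [VectorType(784)]
    [LoadColumn(1, 784)]
    public float[] PixelValues;

    // the features are simply the 784 pixel values
    public float[] GetFeatures() => PixelValues;

    // the label is the digit value, one-hot encoded into 10 elements:
    // exactly one element is 1, at the index of the digit value
    public float[] GetLabel()
    {
        var label = new float[10];
        label[(int)Number] = 1f;
        return label;
    }
}
```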

Now it’s time to start writing the main program method:

This code uses the LoadFromTextFile method to load the CSV data directly into memory. Note the columnDef variable that instructs ML.NET to load CSV columns 1..784 into the PixelValues column, and CSV column 0 into the Number column.

Finally we call CreateEnumerable to convert the training and test data to an enumeration of Digit instances. So now we have the training data in training and the testing data in testing. Both are enumerations of Digit instances.
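The loading step might look like this (the CSV file names are placeholders, not taken from the source):

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// create an ML.NET context
var context = new MLContext();

// columnDef maps CSV column 0 to Number and columns 1..784 to PixelValues
var columnDef = new TextLoader.Column[]
{
    new TextLoader.Column(nameof(Digit.Number), DataKind.Single, 0),
    new TextLoader.Column(nameof(Digit.PixelValues), DataKind.Single, 1, 784)
};

// load the training and testing CSV files directly into memory
var trainDataView = context.Data.LoadFromTextFile(
    path: "mnist_train.csv", columns: columnDef, hasHeader: true, separatorChar: ',');
var testDataView = context.Data.LoadFromTextFile(
    path: "mnist_test.csv", columns: columnDef, hasHeader: true, separatorChar: ',');

// convert both data views to enumerations of Digit instances
var training = context.Data.CreateEnumerable<Digit>(trainDataView, reuseRowObject: false).ToArray();
var testing = context.Data.CreateEnumerable<Digit>(testDataView, reuseRowObject: false).ToArray();
```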

But CNTK can’t train on an enumeration. It requires float[][] values for both features and labels.

So we need to set up four float arrays:

These LINQ expressions set up four arrays containing the feature and label data for the training and testing partitions:
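A sketch of those four LINQ expressions (the variable names are my assumptions):

```csharp
using System.Linq;

// convert the Digit enumerations into the float[][] arrays CNTK needs
var training_data   = training.Select(d => d.GetFeatures()).ToArray();
var training_labels = training.Select(d => d.GetLabel()).ToArray();
var testing_data    = testing.Select(d => d.GetFeatures()).ToArray();
var testing_labels  = testing.Select(d => d.GetLabel()).ToArray();
```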

Now we need to tell CNTK what shape the input data has that we’ll train the neural network on, and what shape the output data of the neural network will have:

Note the first Var method which tells CNTK that our neural network will use a 2-dimensional tensor of 28 by 28 floating point pixel values as input. This shape matches the 784 values returned by the Digit.GetFeatures method.

Note that this shape refers to a 2-dimensional tensor of 28x28 values while the GetFeatures method returns a 1-dimensional array of 784 values. This is not a problem. The CNTK library will automatically reshape the 1-dimensional array to a 2-dimensional tensor for us.

The second Var method tells CNTK that we want our neural network to output a 1-dimensional tensor of 10 float values. This shape matches the 10 values returned by the Digit.GetLabel method.
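Assuming CNTKUtil exposes the Var helper mentioned here (the NetUtil class name is an assumption), the two declarations could look like this:

```csharp
using CNTK;

// input: a 2-dimensional 28x28 tensor of pixel values
var features = NetUtil.Var(new int[] { 28, 28 }, DataType.Float);

// output: a 1-dimensional tensor of 10 values, one per digit class
var labels = NetUtil.Var(new int[] { 10 }, DataType.Float);
```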

Our next step is to design the neural network.

We will use a deep neural network with a 512-node hidden layer and a 10-node output layer. We’ll use the ReLU activation function for the hidden layer and Softmax activation for the output layer.

The softmax function creates a mutually exclusive list of output classes where only a single class can be the correct answer. If we had used sigmoid, the neural network might predict more than one digit value simultaneously. We don’t want that here.

Here’s how to build the neural network:

Each Dense call adds a new dense feedforward layer to the network. We’re stacking one layer with ReLU activation and one layer with Softmax activation.

Then we use the ToSummary method to output a description of the architecture of the neural network to the console.
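Using the fluent Dense and ToSummary helpers described above (the ToNetwork call is an assumption about how CNTKUtil finalizes the stack), the network might be built like this:

```csharp
using CNTK;

// stack a 512-node ReLU layer and a 10-node Softmax output layer
var network = features
    .Dense(512, CNTKLib.ReLU)
    .Dense(10, CNTKLib.Softmax)
    .ToNetwork();

// print a description of the network architecture to the console
Console.WriteLine("Model architecture:");
Console.WriteLine(network.ToSummary());
```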

Now we need to decide which loss function to use to train the neural network, and how we are going to track the prediction error of the network during each training epoch.

For this assignment we’ll use CrossEntropyWithSoftmax as the loss function because it’s the standard metric for measuring multiclass classification loss with softmax.

We’ll track the error with the ClassificationError metric. This is the number of times (expressed as a percentage) that the model predictions are wrong. An error of 0 means the predictions are correct all the time, and an error of 1 means the predictions are wrong all the time.
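CrossEntropyWithSoftmax and ClassificationError are standard functions in the CNTK C# API, so the loss and error setup could look like this:

```csharp
using CNTK;

// the loss function measures how far the predictions are from the labels
var lossFunc = CNTKLib.CrossEntropyWithSoftmax(network.Output, labels);

// the error function reports the fraction of wrong predictions (0..1)
var errorFunc = CNTKLib.ClassificationError(network.Output, labels);
```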

Next we need to decide which algorithm to use to train the neural network. There are many possible algorithms derived from Gradient Descent that we can use here.

For this assignment we’re going to use the RMSPropLearner. You can learn more about the RMS algorithm here: https://towardsdatascience.com/understanding...

These configuration values are a good starting point, but you can tweak them if you like to try and improve the quality of your predictions.
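A sketch of the learner setup, assuming CNTKUtil wraps the CNTK RMSProp learner in a GetRMSPropLearner helper (the method name and these hyperparameter values are assumptions, offered only as a plausible starting point):

```csharp
// set up an RMSProp learner; tweak these values to try and
// improve the quality of your predictions
var learner = network.GetRMSPropLearner(
    learningRateSchedule: 0.99,
    gamma: 0.95,
    inc: 2.0,
    dec: 0.5,
    max: 2.0,
    min: 0.5);
```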

We’re almost ready to train. Our final step is to set up a trainer and an evaluator for calculating the loss and the error during each training epoch:

The GetTrainer method sets up a trainer which will track the loss and the error for the training partition. And GetEvaluator will set up an evaluator that tracks the error in the test partition.
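With the GetTrainer and GetEvaluator helpers named above, the setup might be:

```csharp
// the trainer tracks the loss and the error on the training partition
var trainer = network.GetTrainer(learner, lossFunc, errorFunc);

// the evaluator tracks the error on the test partition
var evaluator = network.GetEvaluator(errorFunc);
```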

Now we’re finally ready to start training the neural network!

Add the following code:

We’re training the network for 50 epochs using a batch size of 128. During training we’ll track the loss and errors in the loss, trainingError and testingError arrays.

Once training is done, we show the final testing error on the console. This is the percentage of mistakes the network makes when predicting digits.

Note that the error and the accuracy are related: accuracy = 1 - error. So we also report the final accuracy of the neural network.
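The outer training loop could be sketched like this (the array and variable names follow the text; the console formatting is my own):

```csharp
const int maxEpochs = 50;
const int batchSize = 128;

// track the loss and errors for every epoch
var loss = new double[maxEpochs];
var trainingError = new double[maxEpochs];
var testingError = new double[maxEpochs];

Console.WriteLine("Epoch\tTrain loss\tTrain error\tTest error");
for (int epoch = 0; epoch < maxEpochs; epoch++)
{
    // the training and testing code described next goes inside this loop
}

// report the final error and accuracy on the test partition
var finalError = testingError[maxEpochs - 1];
Console.WriteLine($"Final test error: {finalError:0.00}");
Console.WriteLine($"Final test accuracy: {1 - finalError:0.00}");
```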

Here’s the code to train the neural network. Put this inside the for loop:

The Index().Shuffle().Batch() sequence randomizes the data and splits it up into a collection of 128-record batches. The second argument to Batch() is a function that will be called for every batch.

Inside the batch function we call GetBatch twice to get a feature batch and a corresponding label batch. Then we call TrainBatch to train the neural network on these two batches of training data.

The TrainBatch method returns the loss and error, but only for training on the 128-record batch. So we simply add up all these values and divide them by the number of batches in the dataset. That gives us the average loss and error for the predictions on the training partition during the current epoch, and we report this to the console.
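Based on the Index().Shuffle().Batch(), GetBatch, and TrainBatch helpers described above (their exact signatures are assumptions about the CNTKUtil wrapper), the per-epoch training code might look like this:

```csharp
loss[epoch] = 0.0;
trainingError[epoch] = 0.0;
var batchCount = 0;

// shuffle the training data and process it in 128-record batches
training_data.Index().Shuffle().Batch(batchSize, (indices, begin, end) =>
{
    // get a feature batch and the corresponding label batch
    var featureBatch = features.GetBatch(training_data, indices, begin, end);
    var labelBatch = labels.GetBatch(training_labels, indices, begin, end);

    // train the network on this batch and accumulate loss and error
    var result = trainer.TrainBatch(
        new[] { (features, featureBatch), (labels, labelBatch) },
        false);
    loss[epoch] += result.Loss;
    trainingError[epoch] += result.Evaluation;
    batchCount++;
});

// average the loss and error over all batches in this epoch
loss[epoch] /= batchCount;
trainingError[epoch] /= batchCount;
Console.Write($"{epoch}\t{loss[epoch]:F3}\t{trainingError[epoch]:F3}\t");
```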

So now we know the training loss and error for one single training epoch. The next step is to test the network by making predictions about the data in the testing partition and calculate the testing error.

Put this code inside the epoch loop and right below the training code:

We don’t need to shuffle the data for testing, so now we can call Batch directly. Again we’re calling GetBatch to get feature and label batches, but note that we’re now providing the testing_data and testing_labels arrays.

We call TestBatch to test the neural network on the 128-record test batch. The method returns the error for the batch, and we again add up the errors for each batch and divide by the number of batches.

That gives us the average error in the neural network predictions on the test partition for this epoch.
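Again assuming the Batch, GetBatch, and TestBatch helper signatures, the testing code inside the epoch loop might look like this:

```csharp
testingError[epoch] = 0.0;
batchCount = 0;

// no shuffling needed for testing: process the test set batch by batch
testing_data.Batch(batchSize, (data, begin, end) =>
{
    // get feature and label batches from the testing arrays
    var featureBatch = features.GetBatch(testing_data, begin, end);
    var labelBatch = labels.GetBatch(testing_labels, begin, end);

    // test the network on this batch and accumulate the error
    testingError[epoch] += evaluator.TestBatch(
        new[] { (features, featureBatch), (labels, labelBatch) });
    batchCount++;
});

// average the error over all test batches for this epoch
testingError[epoch] /= batchCount;
Console.WriteLine($"{testingError[epoch]:F3}");
```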

After training completes, the training and testing errors for each epoch will be available in the trainingError and testingError arrays. Let’s use XPlot to create a nice plot of the two error curves so we can check for overfitting:

This code creates a Plot with two Scatter graphs. The first one plots the trainingError values and the second one plots the testingError values.

Finally we use File.WriteAllText to write the plot to disk as an HTML file.
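Using the XPlot.Plotly API, the plotting code could be sketched as follows (the chart titles and output file name are my assumptions):

```csharp
using System.IO;
using System.Linq;
using XPlot.Plotly;

// plot the training and testing error curves, one Scatter per curve
var chart = Chart.Plot(
    new[]
    {
        new Graph.Scatter
        {
            x = Enumerable.Range(0, maxEpochs).ToArray(),
            y = trainingError,
            name = "training error"
        },
        new Graph.Scatter
        {
            x = Enumerable.Range(0, maxEpochs).ToArray(),
            y = testingError,
            name = "testing error"
        }
    });
chart.WithXTitle("Epoch");
chart.WithYTitle("Classification error");
chart.WithTitle("Digit training");

// write the plot to disk as an HTML file
File.WriteAllText("chart.html", chart.GetHtml());
```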

We’re now ready to build the app, so this is a good moment to save your work ;)

Go to the CNTKUtil folder and type the following:

$ dotnet build -o bin/Debug/netcoreapp3.0 -p:Platform=x64

This will build the CNTKUtil project. Note how we’re specifying the x64 platform because the CNTK library requires a 64-bit build.

Now go to the Mnist folder and type:

$ dotnet build -o bin/Debug/netcoreapp3.0 -p:Platform=x64

This will build your app. Note how we’re again specifying the x64 platform.

Now run the app:

$ dotnet run

The app will create the neural network, load the dataset, train the network on the data, and create a plot of the training and testing errors for each epoch.

Here’s the neural network being trained on my laptop:

And here are the results:

The plot looks great. The curves both have the expected ‘hockey stick’ shape and both the training and testing curve stay close together. There is no sign of overfitting.

The final error is 0.126 on training and 0.134 on testing. The final accuracy is 0.87 which means the neural network correctly identifies 87 digits out of every 100.

That’s a great result!

But do keep in mind that a human would correctly identify 97.5 digits out of every 100. We still have a long way to go to beat human performance.

So what do you think?

Are you ready to start writing C# machine learning apps with CNTK?