A framework for trusted pretrained neural networks

How a repository of trained data will help advance humanity towards the Singularity

For humanity to achieve its unstated goal of building intelligence far beyond our own, I think it may be necessary to have a central repository of pretrained neural networks. Instead of reinventing the wheel for each task, we could stand on the shoulders of giants. It’s not as straightforward as it seems, however. There is considerable precedent in the open source world of building freely on other people’s work, and there are solid mechanisms for including and distributing open source code. There are already many repositories that host weights for neural networks, so that is definitely a clear starting point.

Transfer Learning

Figure 1: A basic fully connected neural network

Readers of Shallow Thoughts about Deep Learning are probably pretty knowledgeable about neural network basics. The diagram in Figure 1 shows a basic neural network, with the boldness of each edge representing its weight. The basic idea of transfer learning is to take a neural network that’s trained to do one task and use it to do another. The example Andrew Ng uses is actually a pretty great one, so let’s use that.

Cats, cats and… my cat

Figure 2: Adorable cat

If you have a dataset of millions of cats (and non-cats), you can use it to train a neural network to detect whether a picture contains a cat. Pretty awesome, right? But if you wanted to extend the network to identify pictures of YOUR cat, then with the magic of transfer learning, you don’t need to create a whole new neural network. You probably don’t have millions of images of your cat, so it would be close to impossible to train a deep network from scratch with only a handful of pictures. Instead, you can reuse the existing network and its weights: by adding a few layers at the end of the “is it a cat” network and training only those layers on pictures of YOUR cat, you can almost immediately have a classifier that identifies pictures of your cat with a high degree of accuracy.
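In practice you would do this with a framework like Keras or PyTorch; here is a minimal pure-Python sketch of the idea under toy assumptions. The `frozen_features` function stands in for the pretrained “is it a cat” network with its output layer removed (it is never updated), and only the small new head is trained; all names and data are hypothetical.

```python
import math

# Toy stand-in for the frozen, pretrained network minus its output layer:
# it maps an "image" to a feature vector and is never updated.
def frozen_features(image):
    return [sum(image) / len(image), max(image), min(image)]

def train_head(examples, labels, lr=0.5, epochs=200):
    """Train ONLY the new output layer (a logistic unit on the frozen
    features) -- the 'small tweak' that transfer learning adds on top."""
    w = [0.0, 0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            f = frozen_features(x)
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - y                        # gradient of the log-loss
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(w, b, image):
    f = frozen_features(image)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z > 0 else 0

# A handful of "pictures of YOUR cat" (label 1) vs. other cats (label 0);
# tiny pixel lists stand in for real images here.
X = [[0.9, 0.8, 0.9], [0.8, 0.9, 1.0], [1.0, 0.9, 0.8],
     [0.1, 0.2, 0.1], [0.2, 0.1, 0.0], [0.0, 0.1, 0.2]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_head(X, y)
```

The key point is in `train_head`: the gradient only ever touches the head’s weights, never the “pretrained” feature extractor, which is why a handful of examples is enough.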

Using existing networks for new tasks

Since neural networks, especially deep ones, take a very long time to train with lots of data, it’s often impractical to start from scratch for every task.

Figure 3: MNIST Database of handwritten digits

Figure 4: Eastern Arabic numerals

The MNIST database, shown in Figure 3, is another example of a large dataset (over 60,000 images of handwritten digits) on which classifiers have reached an error rate as low as 0.21%. If you wanted to create a classifier that identifies Eastern Arabic numerals (٠ through ٩, shown in Figure 4), you wouldn’t have to start from scratch. The neural networks that exist for MNIST already do a great job of extracting the strokes and shapes that make up a digit. By adding a layer or two and changing the output layer, you could easily have a highly accurate classifier for Eastern Arabic numerals.

Dependency Management

For all of you software developers out there, I’m sure you’ve used some sort of dependency manager: npm or Bower for the JavaScript folks, RubyGems for the Ruby crowd, or pip for the Pythonistas of the world. The purpose of these dependency/package managers is to let you quickly use code in your applications that other people wrote. In the Node.js world, if you want to create a web server, rather than doing all of the work from scratch, you can use a package called express, which has a ton of functionality built in (caching, templating, etc.). Over 200 contributors have committed code to making it a more complete and robust package; along the way they have fixed bugs, added functionality to keep it modern, and created more and more test suites to prove it does what it says it does. The same is true of most other packages out there. The thing about the express package is that it depends on other packages, and on the cycle goes.

But dependency management systems (or package management systems) are pretty mature now. They’re great at getting you the exact version of the code you need, reliably and complete with all of its dependencies, every time. In fact, most package managers are so good that developers deliberately do not commit dependency source code to their repositories.

Version Management

Software gets better over time and I think that holds even more true for machine learning algorithms. As our compute power grows, we’re able to train larger and deeper networks with greater accuracy.

Benchmarks

It’s hard to say exactly how accurate a neural network is. Typically, the figure given is “accuracy” or some measure of error like mean squared error. But this can be misleading. I have trained networks that achieved 97% accuracy on the training set, but a paltry 70% on the test set. If you were given my benchmark of 97%, you might be tempted to use my neural network as a basis for something you were working on; told that it only performed at 70% on test data, you might think twice. But let’s say you’re happy with 70% accuracy and dig deeper: how many items were in the training set and in the test set? Even with those numbers (14,000 and 500 respectively), there’s still one big question that needs to be answered. How (dis)similar is the training data distribution from the test set distribution?
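To make the train/test gap concrete, here is a small self-contained sketch. It uses toy data and a deliberately memorizing 1-nearest-neighbour “model” (all numbers are illustrative, not from a real benchmark): the model scores perfectly on data it has memorized and much worse on fresh data drawn from the same distribution.

```python
import random

random.seed(0)

# Labels depend only weakly on the input, plus a lot of noise: a model
# that memorizes its training data will look perfect on that data.
def true_label(x):
    return 1 if x > 0.5 else 0

def noisy_sample(n, noise=0.3):
    xs = [random.random() for _ in range(n)]
    ys = [true_label(x) if random.random() > noise else 1 - true_label(x)
          for x in xs]
    return xs, ys

train_x, train_y = noisy_sample(200)
test_x, test_y = noisy_sample(200)

# "Model": 1-nearest-neighbour, i.e. pure memorization of the training set.
def predict(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def accuracy(xs, ys):
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)

train_acc = accuracy(train_x, train_y)  # each point is its own neighbour
test_acc = accuracy(test_x, test_y)     # noticeably lower on unseen data
```

Reporting only `train_acc` here would be exactly the misleading 97%-style benchmark described above.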

Benchmarking neural networks is NOT as straightforward as one would hope.

Okay, you know the training accuracy, the test accuracy, the number of elements in each set, and the similarity of the distributions. Now you need to figure out how similar your task is to the pretrained one. If we go back to the cat example and your actual task is to identify Honda Accords, the cat network may not be a great starting point. So let’s assume your task is a natural extension of the pretrained data: identifying dogs.

Now you have to ask yourself how long the training actually took. If it’s a relatively simple network that takes 12 minutes to train, maybe you don’t want to bother with a pretrained network, since you can just train it yourself on the initial dataset (plus whatever other dog data you may have). However, wall-clock time is also misleading. If you’re training on a Raspberry Pi and I’m training on a massively parallel cluster with 402 GPUs, then what takes me 12 minutes might take you… well, a bit longer. A better metric would be the total number of floating point operations (FLOPs) the training required, or perhaps the number of multiply-add operations per instance.
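As a rough illustration of the hardware-independent metric, multiply-adds for one forward pass of a fully connected network can be counted directly from the layer sizes. The sizes below are hypothetical (an MNIST-style 784-input network), not a reference architecture.

```python
# Each fully connected layer with n_in inputs and n_out outputs costs
# n_in * n_out multiply-adds per forward pass (ignoring biases and
# activation functions, which are comparatively cheap).
def forward_madds(layer_sizes):
    return sum(n_in * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical network: 784 input pixels -> 128 -> 64 -> 10 output classes.
cost = forward_madds([784, 128, 64, 10])  # 100352 + 8192 + 640 = 109184
```

The same count is identical whether the network runs on a Raspberry Pi or a GPU cluster, which is exactly why it makes a fairer benchmark than minutes of training time.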

Whatever repository is to be created would need to have a comprehensive benchmarking suite so that potential users of the pretrained neural network know what they’re getting.

Trust but verify

Figure 5: Ronald Reagan

How would we trust that a pretrained network does what it says it does? In a package repository like npm, the author of the package says what it does, with whatever level of documentation and testing they think is appropriate. There are no enforced standards, and it’s entirely possible that a package is filled with bugs or straight up lies about its capabilities. I think some sort of committee or reputation system might be important, which would set this apart from most existing package management systems. Let’s walk through how it could work…

You pretrain a neural network to identify pictures of cats and submit it to the repository. A committee of people who are (assumed to be) unknown to you evaluates your network, not for its code or structure, but to establish some benchmarks. They would also create or find additional test data that you haven’t seen before and run it through your network. The beauty of neural networks is that once one is trained, running values through it is not that computationally expensive.
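A minimal sketch of what that third-party check might look like, assuming the committee holds a hidden test set and the submitter’s claimed accuracy figure. The function name, the tolerance, and the toy threshold “model” are all hypothetical.

```python
def verify_claim(model, hidden_x, hidden_y, claimed_accuracy, tolerance=0.05):
    """Third-party check: run a test set the submitter has never seen
    through the network and compare against the claimed benchmark."""
    correct = sum(model(x) == y for x, y in zip(hidden_x, hidden_y))
    measured = correct / len(hidden_x)
    return measured >= claimed_accuracy - tolerance, measured

# Toy "pretrained network": a threshold classifier standing in for a real
# model; inference is just a function call, so verification is cheap.
submitted_model = lambda x: 1 if x > 0.5 else 0

# Hidden test data the submitter has never seen.
hidden_x = [0.1, 0.2, 0.6, 0.9, 0.4, 0.8]
hidden_y = [0, 0, 1, 1, 0, 1]

ok, measured = verify_claim(submitted_model, hidden_x, hidden_y,
                            claimed_accuracy=0.95)
```

Because the test set stays with the verifiers, a submitter cannot tune their network against it, which is the whole point of the blind check.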

This would slow down the process of contribution a bit compared to other open source ecosystems, but the third party test set would be pretty important to build trust for potential future users.

Reputation

Like many things in the open source world, contributors of pretrained neural networks would build reputation over time. Reputation doesn’t guarantee anything on its own; it would be up to potential future users to determine how much weight to give to the reputation of a network’s creator.

Malicious Actors

I can see a world in which malicious actors submit pretrained neural networks that supposedly solve a task but are really not very good at it. It could be a network that overfits a particular dataset, so it APPEARS to do well (see my example above in Benchmarks) but performs poorly on data it hasn’t seen before. Third party test set validation would help here, but only to a point: if the test set became known to the malicious parties, they could train the network to solve for the test set as well. Given that this is all (potentially) open-source, unpaid contribution, it’s easy to imagine the verifiers not being as meticulous as one would hope. It would be a difficult problem to solve, but we would evolve toward an answer. A bounty system could potentially be used to prove or disprove a pretrained neural network’s claims.

Existing efforts

There are already efforts in this direction. The Caffe Model Zoo, for example, hosts many pretrained models for a variety of tasks, as does the TensorFlow Model Zoo. Where they fall short is blind third party verification: we simply trust that the reported benchmarks (accuracy, speed, etc.) are correct.

The Singularity

For machines to become ever more “intelligent,” we need a compounding rate of growth. If we don’t achieve it, we’ll continue to create great problem solvers that only give the illusion of intelligence. Fans of the Singularity concept can see that by building on larger and larger pretrained neural networks, we can create more complex systems that start to mimic real intelligence. From that point on, it’s only a matter of time (and computational power) until machines surpass human intelligence. What we choose to do from there is anyone’s guess.

To read more about deep learning, please visit my publication, Shamoon Siddiqui’s Shallow Thoughts About Deep Learning.