Deep Learning goes commodity

It’s not just the algorithms that have advanced; the availability of those algorithms to engineers and businesses has exploded, and that has changed the way we work with data. I think Yann LeCun, Facebook’s head of AI, captured this succinctly when he rechristened deep learning as “differentiable programming”.

Implicitly, both LeCun and Karpathy are echoing the same sentiment: that deep learning / differentiable programming is becoming an activity for software engineers as much as it is an activity for data scientists.

Why?

The “hard part” of deep learning was getting the math right and implementing it:

There is an intellectual overhead in implementing efficient matrix multiplication, in recalling the definition of cross entropy and implementing it in a way that is numerically stable. It takes some sophistication to recall that a convolution can be computed via a Fourier transform or expressed as a Toeplitz matrix multiplication, and to have both the mathematical and engineering pedigree to know when to prefer which. And when you want to scale, you also need an intimate understanding of what a GPU is and how to use it well.
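As a toy illustration of the kind of detail that used to sit on the practitioner’s shoulders, here is a small sketch in plain NumPy (the logits are made up for the example) of why a naive cross entropy blows up numerically, and how the standard log-sum-exp trick fixes it:

```python
import numpy as np

def cross_entropy_naive(logits, label):
    """Naive softmax cross entropy: exp() overflows for large logits."""
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[label])

def cross_entropy_stable(logits, label):
    """Numerically stable version using the log-sum-exp trick."""
    shifted = logits - logits.max()  # subtracting the max changes nothing mathematically
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

logits = np.array([1000.0, 2.0, -5.0])   # large logits break the naive version
print(cross_entropy_naive(logits, 0))    # nan (overflow)
print(cross_entropy_stable(logits, 0))   # ~0.0
```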

Today, to do a deep learning project, we don’t need to know any of that.

Open source frameworks like TensorFlow and PyTorch have abstracted much of that math and engineering away, to the point where we can execute a very successful deep learning project without knowing what a Toeplitz matrix is.

To drive that point home, take a look at the following equations which define an LSTM, one of the building blocks of neural networks for sequences.

Equations for an LSTM
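In the standard formulation (notation varies slightly between papers), the cell looks like this:

```latex
% Standard LSTM cell equations
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```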

And compare that with the following code, which is what we write today when we want to use an LSTM.

Code for an LSTM
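A minimal sketch of what that looks like in practice, here with PyTorch and layer sizes chosen arbitrarily for illustration:

```python
import torch
import torch.nn as nn

# One LSTM layer: 128-dimensional inputs, 256-dimensional hidden state.
lstm = nn.LSTM(input_size=128, hidden_size=256, batch_first=True)

# A batch of 32 sequences, each 50 steps long.
x = torch.randn(32, 50, 128)

# All of the gate equations above happen inside this one call.
output, (h_n, c_n) = lstm(x)
```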

Deep Learning for everyone

That simplicity and abstraction mean that virtually anyone with programming skills can take an online course and do something useful in a very short time. This isn’t just a change for the individual, it’s also a huge shift for businesses. The cost of trying deep learning has gone down dramatically.

To go back to that XKCD comic, three years ago we needed a research team and five years to find a bird in a photo. Today we need an engineer and two weeks.

For a business, dabbling in machine learning is risky. There is an investment of time and resources which might not yield anything. In the “old days”, the upfront costs of taking such risks were large. You needed a team of scarce and expensive professionals, and you had to give them the time to flesh out their infrastructure before they could start being productive. Those costs add up fast.

For many businesses, the old way made doing an ML project prohibitively risky. Today, because of the amazing tooling that is available to anyone, it is riskier not to try.

But deep learning is not a panacea. I’d say it has shifted the burden of labor, as well as the costs, from the algorithms to the data.