It is getting easier and easier to do deep learning (DL). There exist papers, blogs, frameworks, books, courses, newsletters, conferences, and many more resources. If you don’t want to implement it yourself, there are machine learning (ML) API services on AWS, GCE, Azure and companies like Clarifai and Bonsai, to name a few. Thinking back to a few years ago, neural nets were sometimes regarded as “just a fad.” I can remember skipping this topic in my grad school ML class because the professor didn’t like them and thought they were hyped. So let’s talk about some of the things that have happened to the field of DL in the last few years that have shifted this view.

While initially most DL papers came from a few academics, most of the new papers, results, tools, and datasets now come from big companies, and much of the current state of the field can be attributed to them. So the natural question is “What’s in it for them?” Why would Google open source something like TensorFlow, or why would Microsoft publish a paper detailing how to make state-of-the-art vision algorithms? On the surface it seems like developing this technology provides a competitive advantage, so it isn’t obvious why a company would make it free or widely available. As it turns out, this actually represents some pretty nifty business strategy.

First, let’s profile a big company to see what components it has, and to see things from its perspective. Let’s take Amazon as the example and look at what they do and where they make money.

Let’s start with what they do:

- They make a website that lets you buy a lot of stuff from many people
- They have physical infrastructure in place to sell you things
- Consumer electronics [we are going to ignore this… ]
- AWS

In the process of doing business, they are generating one more critical resource: data. The data is super important because not only does it help make Amazon better, but it is also proprietary, so other web sellers can’t use it to get better themselves. Amazon can analyze the data to figure out better recommendations, predict who will want what and preship items, figure out consumption patterns, A/B test, and the list goes on and on. Then, when Amazon is doing a better job as a marketplace, more people will buy and sell, and Amazon will collect more data, creating a positive feedback loop.

This isn’t unique to Amazon. The more Amazon sells, the better it gets at selling. The more searches Google provides, the better searches it can provide. The more movies Netflix recommends, the better recommendations it can make. So effectively by collecting data, you're raising the barrier to entry for competition. This is what makes data super duper valuable.

You can think of Microsoft, Google, and Facebook as all working in roughly the same way. They all have some version of AWS and some software cash-cow that makes money with the help of data. Their business models are thus roughly the same.

Now, the bulk of Amazon’s profit comes from AWS, and less from their retail division. For Google, Facebook, and Microsoft, it is the complete opposite. Most of Google and Facebook’s money comes from ads, and Microsoft’s from selling software.

The two largest money makers are thus some sort of software and the computing infrastructure. In contrast, a small portion of money comes from companies selling data indirectly, via machine learning APIs.

Now that we have an idea of the rough profile of one of these companies, let’s talk about deep learning, but as a collection of products.

The argument I am about to present takes its roots from this highly recommended blog post by Joel Spolsky. The relevant takeaways are:

- Think of products as either complements or substitutes.
- Complement products go with other products (e.g., when you buy peanut butter, you often buy jelly).
- Substitute products act as replacements (e.g., I might buy a PB&J instead of a turkey sandwich for lunch because it is cheaper).
- When the cost of a complementary product decreases (all else equal), the price of the other product can increase (e.g., if peanut butter were free, you could charge more for jelly, and the cost of a PB&J would either stay the same or decrease).
- “Smart companies try to commoditize their products’ complements.”
- If you can commoditize your products’ complements, then you can charge more and make more money.
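To make the peanut butter and jelly arithmetic concrete, here is a minimal sketch. It assumes (and this is my simplification, not something from Spolsky’s post) that a buyer is willing to pay a fixed total for the whole sandwich, so whatever peanut butter doesn’t cost is room left over for the jelly price. The function name and the dollar amounts are made up for illustration.

```python
def max_price_for_product(bundle_willingness_to_pay, complement_price):
    """Highest price a seller could charge for a product when the buyer
    values the bundle (product + complement) at a fixed total.

    Assumption for illustration only: the buyer walks away if the
    combined price exceeds their willingness to pay for the bundle.
    """
    return max(0.0, bundle_willingness_to_pay - complement_price)


# Hypothetical numbers: the buyer will pay at most $5.00 for a PB&J.
bundle_value = 5.00

# As peanut butter (the complement) gets cheaper, the ceiling on the
# jelly price rises, while the total cost of the sandwich never goes up.
for peanut_butter_price in (3.00, 1.50, 0.00):
    jelly_ceiling = max_price_for_product(bundle_value, peanut_butter_price)
    print(f"peanut butter at ${peanut_butter_price:.2f}: "
          f"jelly can cost up to ${jelly_ceiling:.2f}")
```

Real demand is of course messier than a single fixed budget, but the direction of the effect is the point: the cheaper the complement, the more of the buyer’s budget is left for your product.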

Let’s now think about DL as related to five different products: data, compute power, research, DL tools, and software that could benefit from DL, such as Netflix recommendations. Let’s analyze these in terms of their relationship with each other.

Let’s talk about this chart a bit. As you can see, most of these products complement each other pretty well: cheaper GPUs let you do more research, deploy more DL-assisted software, crunch more data, etc. Basically, more of any one of these components makes the other ones better. If you talk to people about why “AI has taken off in the last five years,” they will invariably mention it is because of Moore’s Law. I argue that having more computation and cheaper computation is not in itself the propelling force; rather, it provides the necessary momentum to accelerate data, research, tools, and, most importantly, profitability.

It makes sense that the diagonal in the above chart would be mostly substitutes. If AWS is cheaper than GCE, then I will use AWS, since these are commodities. The one exception to this is research. Research complementing other research makes sense (the more research is widely disseminated, the more focused it can be). Furthermore, releasing research is beneficial even between competitors, since data (the main fuel for ML) is proprietary.

With this in mind, think of where big companies make money from:

- Data in an indirect way (e.g., making ads better): basically every tech company
- Data in a direct way (e.g., selling you ML APIs): Google, Microsoft, Amazon
- Compute power: Amazon, Google, NVIDIA, Intel
- Software: Facebook, Google, Amazon, Netflix

Think of what the big companies give out for free:

- Research
- Tools: TensorFlow, Paddle, Torch
- Data: limited-scope data (e.g., MS COCO, Webscope), sponsoring data (e.g., ImageNet)

By giving these away for free, companies commoditize tools and research (aka the difficult parts), and monetize the resources needed for ML (which require a large capital investment that creates a high barrier to entry).

Given where companies make money, I’d predict that DL research is going to continue to look at more resource-intensive models, so that really only big companies can take advantage of them.

Finally, I want to thank my friends Eric Bakan, Stedman Hood, and Corina Grigore for helping me flesh out this idea.