Deep Learning Framework Power Scores 2018

Who’s on top in usage, interest, and popularity?

Deep learning continues to be the hottest thing in data science. Deep learning frameworks are changing rapidly. Just five years ago, none of the leaders other than Theano were even around.

I wanted to find evidence for which frameworks merit attention, so I developed this power ranking. I used 11 data sources across 7 distinct categories to gauge framework usage, interest, and popularity. Then I weighted and combined the data in this Kaggle Kernel.

UPDATE SEPT 20, 2018: Due to popular demand, I expanded the frameworks evaluated to include Caffe, Deeplearning4J, Caffe2, and Chainer. Now all deep learning frameworks with more than 1% reported usage in the KDnuggets usage survey are included.

UPDATE SEPT 21, 2018: I made a number of methodological improvements in several of the metrics.

Without further ado, here are the Deep Learning Framework Power Scores:

While TensorFlow is the clear winner, there were some surprising findings. Let’s dive in!

The Contenders

All of these frameworks are open source. All except one work with Python, and some can work with R or other languages.

TensorFlow is the undisputed heavyweight champion. It has the most GitHub activity, Google searches, Medium articles, Amazon books, and ArXiv articles. It also has the most developers using it and is listed in the most online job descriptions. TensorFlow is backed by Google.

Keras has an “API designed for human beings, not machines.” It is the second most popular framework in nearly all evaluation areas. Keras sits on top of TensorFlow, Theano, or CNTK. Start with Keras if you are new to deep learning.

PyTorch is the third most popular overall framework and the second most popular stand-alone framework. It is younger than TensorFlow and has grown rapidly in popularity. It allows customization that TensorFlow does not. It has the backing of Facebook.

Caffe is the fourth most popular framework. It has been around for nearly five years. It is in relatively high demand from employers and often mentioned in scholarly articles, but has little reported recent usage.

Theano was developed at the University of Montreal in 2007 and is the oldest significant Python deep learning framework. It has lost much of its popularity, and its leader stated that major releases are no longer on the roadmap. However, updates continue to be made. Theano is still the fifth-highest-scoring framework.

MXNET is incubated by Apache and used by Amazon. It is the sixth most popular deep learning library.

CNTK is the Microsoft Cognitive Toolkit. It reminds me of many other Microsoft products in the sense that it is trying to compete with Google and Facebook offerings and is not winning significant adoption.

Deeplearning4J, also called DL4J, is used with the Java language. It’s the only semi-popular framework not available in Python. However, you can import models written with Keras to DL4J. This was the only framework where two different search terms occasionally had different results. I used the higher number for each metric. As the framework scored quite low, this made no material difference.

Caffe2 is another Facebook open source product. It builds on Caffe and is now being housed in the PyTorch GitHub repository. Because it no longer has its own repository I used the GitHub data from its old repository.

Chainer is a framework developed by the Japanese company Preferred Networks. It has a small following.

FastAI is built on PyTorch. Its API was inspired by Keras and requires even less code for strong results. FastAI is bleeding edge as of mid-Sept 2018. It’s undergoing a rewrite for version 1.0, slated for October 2018 release. Jeremy Howard, the force behind FastAI, has been a top Kaggler and President of Kaggle. He discusses why FastAI switched from Keras to make their own framework here.

FastAI is not yet in demand for careers nor is it being used widely. However, it has a large built-in pipeline of users through its popular free online courses. It is also both powerful and easy to use. Its adoption could grow significantly.

Criteria

I chose the following categories to provide a well-rounded view of popularity and interest in deep learning frameworks.

The evaluation categories are:

Online Job Listings

KDnuggets Usage Survey

Google Search Volume

Medium Articles

Amazon Books

ArXiv Articles

GitHub Activity

Searches were performed Sept. 16 to Sept. 21, 2018. Source data is in this Google sheet.

I used the plotly data visualization library and Python’s pandas library to explore popularity. For the interactive plotly charts, see my Kaggle Kernel here.
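As a minimal sketch of the kind of pandas exploration described above — the frameworks are real, but the scores here are illustrative placeholders, not the article's actual results:

```python
import pandas as pd

# Hypothetical power scores for illustration only (not the article's data).
scores = pd.DataFrame(
    {
        "framework": ["Keras", "TensorFlow", "Caffe", "PyTorch"],
        "power_score": [52.0, 97.0, 18.0, 25.0],
    }
)

# Sort from highest to lowest score, as in the final ranking chart.
ranked = scores.sort_values("power_score", ascending=False).reset_index(drop=True)
print(ranked)
```

The same `ranked` frame can be passed straight to a plotly bar chart for the interactive version.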

Online Job Listings

What deep learning libraries are in demand in today’s job market? I searched job listings on LinkedIn, Indeed, Simply Hired, Monster, and Angel List.

TensorFlow is the clear winner when it comes to frameworks mentioned in job listings. Learn it if you want a job doing deep learning.

I searched using the term “machine learning” followed by the library name, so TensorFlow was evaluated with the query “machine learning TensorFlow”. I tested several search methods, and this one gave the most relevant results.

An additional keyword was necessary to differentiate the frameworks from unrelated terms because Caffe can have multiple meanings.

Usage

KDnuggets, a popular data science website, polled data scientists around the world on the software that they used. They asked:

What Analytics, Big Data, Data Science, Machine Learning software you used in the past 12 months for a real project?

Here are the results for the frameworks in this category.

Keras showed a surprising amount of use — nearly as much as TensorFlow. It’s interesting that US employers are overwhelmingly looking for TensorFlow skills, when — at least internationally — Keras is used almost as frequently.

This category is the only one that includes international data because it would have been cumbersome to include international data for the other categories.

KDnuggets reported several years of data. While I used 2018 data only in this analysis, I should note that Caffe, Theano, MXNET, and CNTK saw usage fall since 2017.

Google Search Activity

Web searches on the largest search engine are a good gauge of popularity. I looked at search history in Google Trends over the past year. Google doesn’t provide absolute search numbers, but it does provide relative figures.

I updated this article Sept. 21, 2018 so these scores would include worldwide searches in the Machine Learning and Artificial Intelligence category for the week ended Sept. 15, 2018. Thanks to François Chollet for his suggestion to improve this search metric.

Keras was not far from TensorFlow. PyTorch was in third and other frameworks had relative search volume scores at or below four. These scores were used for the power score calculations.

Let’s look briefly at how search volume has changed over time to provide more historical context. The chart from Google directly below shows searches over the past two years.

TensorFlow = red, Keras = yellow, PyTorch = blue, Caffe = green

Searches for TensorFlow haven’t really been growing for the past year, but Keras and PyTorch have seen growth. Google Trends allows only five terms to be compared simultaneously, so the other libraries were compared on separate charts. None of the other libraries showed more than minimal search interest relative to TensorFlow.

Publications

I included several publication types in the power score. Let’s look at Medium articles first.

Medium Articles

Medium is the place for popular data science articles and guides. And you’re here now — fantastic!

Finally, a new winner. In terms of mentions in Medium articles, Keras broke the tape ahead of TensorFlow. FastAI also outperformed its usual showing.

I hypothesize that these results might have occurred because Keras and FastAI are beginner friendly. They have quite a bit of interest from new deep learning practitioners, and Medium is often a forum for tutorials.

I used a Google site search of Medium.com over the past 12 months with the framework name and “learning” as the keyword. This method was necessary to prevent incorrect results for the term “caffe”. Of the several search options I tried, it excluded the fewest articles.

Now let’s see which frameworks have books about them available on Amazon.

Amazon Books

I searched for each deep learning framework on Amazon.com under Books->Computers & Technology.

TensorFlow for the win again. MXNET had more books than expected and Theano had fewer. PyTorch had relatively few books, but that may be because of the framework’s youth. This measure is biased in favor of older libraries because of the time it takes to publish a book.

ArXiv Articles

ArXiv is the online repository where most scholarly machine learning articles are published. I searched for each framework on arXiv using Google site search results over the past 12 months.

More of the same from TensorFlow for scholarly articles. Notice how much more popular Keras was on Medium and Amazon than in scholarly articles. PyTorch was second in this category, showing its flexibility for implementing new ideas. Caffe also performed relatively well.

GitHub Activity

Activity on GitHub is another indicator of framework popularity. I broke out stars, forks, watchers, and contributors in the charts below because they make more sense separately than combined.

TensorFlow is clearly the most popular framework on GitHub, with a whole lot of engaged users. FastAI has a decent following considering it isn’t even a year old. It’s interesting to see that contributor levels are closer for all of the frameworks than the other three metrics.

After gathering and analyzing the data it was time to consolidate it into one metric.

Power Scoring Procedure

Here’s how I created the power score:

1. Scaled all features between 0 and 1.

2. Aggregated the Online Job Listings and GitHub Activity subcategories.

3. Weighted the categories according to the weights below.

As shown above, Online Job Listings and KDnuggets Usage Survey make up half of the total score, while web searches, publications, and GitHub attention make up the other half. This split seemed like the most appropriate balance of the various categories.

4. Multiplied weighted scores by 100 for comprehensibility.

5. Summed category scores for each framework into a single power score.
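The steps above can be sketched in a few lines of Python. The raw counts and the two-category weights here are made-up placeholders (the real analysis used seven categories with the weights shown above), but the mechanics — min-max scaling, weighting, multiplying by 100, and summing — follow the procedure:

```python
def min_max_scale(values):
    """Step 1: scale a list of raw values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

frameworks = ["TensorFlow", "Keras", "PyTorch"]

# Hypothetical raw counts per category (placeholders, not the real data).
raw = {
    "job_listings": [500, 150, 100],
    "usage_survey": [30, 22, 6],
}

# Step 3: hypothetical weights summing to 1 (the real analysis used seven).
weights = {"job_listings": 0.6, "usage_survey": 0.4}

# Steps 1, 3, 4, and 5: scale, weight, multiply by 100, and sum per framework.
scaled = {cat: min_max_scale(vals) for cat, vals in raw.items()}
power = {
    fw: sum(weights[cat] * scaled[cat][i] * 100 for cat in raw)
    for i, fw in enumerate(frameworks)
}
print(power)  # a framework leading every category gets the maximum score, 100.0
```

Because the weights sum to 1 and each scaled feature tops out at 1, a framework that leads every category scores exactly 100 and one that trails every category scores 0.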

Here’s the data: