AutoML Concepts

Two concepts are central to AutoML: neural architecture search and transfer learning.

Neural Architecture Search

Neural architecture search is the process of automating the design of neural networks. The search is usually driven by reinforcement learning or evolutionary algorithms. In the reinforcement learning approach, a controller that proposes candidate architectures is rewarded when a candidate reaches high accuracy and penalized when it reaches low accuracy, so over time the search is steered toward architectures that achieve higher accuracy.
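To make the evolutionary variant concrete, here is a minimal toy sketch using only the Python standard library. Everything in it is an assumption made for illustration: the search space (a list of hidden-layer widths), the names random_architecture, mutate, proxy_score and evolve, and above all the scoring function, which stands in for the expensive step of training each candidate and measuring its validation accuracy. It is not the method of any particular paper.

```python
import random

# Hypothetical search space: an architecture is a list of hidden-layer widths.
WIDTH_CHOICES = [16, 32, 64, 128]
MAX_LAYERS = 4


def random_architecture(rng):
    """Sample an architecture with 1 to MAX_LAYERS layers."""
    depth = rng.randint(1, MAX_LAYERS)
    return [rng.choice(WIDTH_CHOICES) for _ in range(depth)]


def mutate(arch, rng):
    """Return a copy of arch with one random change.

    Falls back to resizing a layer when a layer cannot be added or removed.
    """
    child = list(arch)
    op = rng.choice(["resize", "add", "remove"])
    if op == "add" and len(child) < MAX_LAYERS:
        child.insert(rng.randrange(len(child) + 1), rng.choice(WIDTH_CHOICES))
    elif op == "remove" and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.choice(WIDTH_CHOICES)
    return child


def proxy_score(arch):
    """Stand-in for validation accuracy (an assumption of this sketch).

    A real search would train the candidate network and evaluate it here.
    This toy score simply prefers three layers of width 64.
    """
    depth_penalty = abs(len(arch) - 3) * 0.1
    width_penalty = sum(abs(w - 64) for w in arch) / (64 * len(arch)) * 0.1
    return 1.0 - depth_penalty - width_penalty


def evolve(generations=30, population_size=10, seed=0):
    """Run a simple elitist evolutionary search and return the best architecture."""
    rng = random.Random(seed)
    population = [random_architecture(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the better-scoring half as parents.
        population.sort(key=proxy_score, reverse=True)
        parents = population[: population_size // 2]
        # Variation: refill the population with mutated copies of parents.
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=proxy_score)
```

Calling evolve() returns the best list of layer widths found. Because the parents are carried over unchanged, the best score never gets worse from one generation to the next. In a real system, almost all of the cost sits inside the scoring step, which is why published approaches put so much effort into making candidate evaluation cheaper.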

Several neural architecture search papers have been published, including Learning Transferable Architectures for Scalable Image Recognition, Efficient Neural Architecture Search (ENAS), and Regularized Evolution for Image Classifier Architecture Search.

Transfer Learning

Transfer learning, as the name suggests, is a technique in which a pre-trained model transfers what it has learned to a new but similar dataset. This makes it possible to reach high accuracy with less computation time and power. Neural architecture search suits problems that require discovering new architectures, while transfer learning works best when the new dataset is similar to the one the model was pre-trained on.
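As an illustration, a common way to apply transfer learning to images is to freeze a network pre-trained on ImageNet and train only a small new classification head on top of it. The sketch below assumes TensorFlow with its bundled Keras API is installed. The choice of MobileNetV2, the 160x160 input size, and the function name build_transfer_model are assumptions made for this example, not something prescribed by any of the AutoML tools discussed here.

```python
def build_transfer_model(num_classes, input_shape=(160, 160, 3)):
    """Attach a new classification head to a frozen, ImageNet-pre-trained base."""
    # Imported here so this sketch can be loaded without TensorFlow installed.
    from tensorflow import keras

    base = keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,   # drop the original ImageNet classifier
        weights="imagenet",  # downloads pre-trained weights on first use
    )
    base.trainable = False  # freeze the pre-trained layers

    inputs = keras.Input(shape=input_shape)
    # Scale pixel values the way MobileNetV2 expects.
    x = keras.applications.mobilenet_v2.preprocess_input(inputs)
    x = base(x, training=False)  # keep batch-norm layers in inference mode
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

Only the final dense layer is trained when fit is called on the returned model, which is where the savings in computation come from. Whether the frozen features are good enough depends on how close the new dataset is to the pre-training data.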

AutoML Solutions

Let’s now look at some of the available automated machine learning solutions.

Auto-Keras

According to the official site:

Auto-Keras is an open source software library for automated machine learning (AutoML). It is developed by DATA Lab at Texas A&M University and community contributors. The ultimate goal of AutoML is to provide easily accessible deep learning tools to domain experts with limited data science or machine learning background. Auto-Keras provides functions to automatically search for architecture and hyperparameters of deep learning models.

It can be installed using a simple pip command:

pip install autokeras

At the time of writing, Auto-Keras is still in pre-release testing, and the official site warns that its authors will not be held liable for any loss incurred as a result of using the library. The package is built on the Keras deep learning library.
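To give a feel for how the library is used, here is a minimal sketch of an image classification search on MNIST. It assumes the AutoKeras 1.x style API (ak.ImageClassifier with max_trials), which may differ from the pre-release version described above, and it assumes TensorFlow is available for loading the dataset. The function name run_autokeras and the tiny budget of one trial and one epoch are choices made for this example only.

```python
def run_autokeras(max_trials=1, epochs=1):
    """Search for an image classifier on MNIST and return its evaluation results."""
    # Imported here so this sketch can be loaded without AutoKeras installed.
    import autokeras as ak
    from tensorflow.keras.datasets import mnist

    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    # max_trials bounds how many candidate models the search will try.
    clf = ak.ImageClassifier(max_trials=max_trials, overwrite=True)
    clf.fit(x_train, y_train, epochs=epochs)

    # Returns the Keras evaluation results (loss and metrics) on the test set.
    return clf.evaluate(x_test, y_test)
```

A realistic run would use a much larger max_trials and more epochs, at a correspondingly higher compute cost.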

Auto-Sklearn

Auto-Sklearn is an automated machine learning package based on scikit-learn. It is designed as a drop-in replacement for a scikit-learn estimator. It is also installed with a simple pip command:

pip install auto-sklearn

On Ubuntu, a C++11 build environment and SWIG are required:

sudo apt-get install build-essential swig

With Anaconda, the compiler toolchain and SWIG can be installed through conda:

conda install gxx_linux-64 gcc_linux-64 swig

Auto-Sklearn does not run natively on Windows. Possible workarounds include running it from a Docker image or inside a virtual machine.
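Because Auto-Sklearn follows the scikit-learn estimator interface, using it looks much like using any other classifier. The sketch below assumes auto-sklearn and scikit-learn are installed on a supported platform. The digits dataset, the time limits, and the function name run_auto_sklearn are choices made for this example.

```python
def run_auto_sklearn(time_budget_seconds=120):
    """Fit Auto-Sklearn on the digits dataset and return test accuracy."""
    # Imported here so this sketch can be loaded without auto-sklearn installed.
    import autosklearn.classification
    from sklearn.datasets import load_digits
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=time_budget_seconds,  # total search budget
        per_run_time_limit=30,                        # cap for a single model fit
    )
    automl.fit(X_train, y_train)

    return accuracy_score(y_test, automl.predict(X_test))
```

The two time limits are the main knobs: a larger total budget lets the search evaluate more pipelines.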

The Tree-Based Pipeline Optimization Tool (TPOT)

According to its official site:

The goal of TPOT is to automate the building of ML pipelines by combining a flexible expression tree representation of pipelines with stochastic search algorithms such as genetic programming. TPOT makes use of the Python-based scikit-learn library as its ML menu.

This software is open source and is available on GitHub.
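A short sketch of a typical TPOT run follows. It assumes the classic TPOT API (TPOTClassifier with generations, population_size and verbosity, plus an export method), which newer releases may have changed. The digits dataset, the small search budget, the output filename, and the function name run_tpot are choices made for this example.

```python
def run_tpot(export_path="tpot_pipeline.py"):
    """Evolve a scikit-learn pipeline on the digits dataset and export it."""
    # Imported here so this sketch can be loaded without TPOT installed.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    tpot = TPOTClassifier(
        generations=5,       # number of iterations of the genetic search
        population_size=20,  # pipelines kept in each generation
        verbosity=2,         # print progress while searching
        random_state=42,
    )
    tpot.fit(X_train, y_train)
    score = tpot.score(X_test, y_test)

    # Write the best pipeline found as a standalone Python script.
    tpot.export(export_path)
    return score
```

The exported file contains ordinary scikit-learn code, so the resulting pipeline can be inspected and reused without TPOT itself.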

Google’s AutoML

Its official site states that:

Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs, by leveraging Google’s state-of-the-art transfer learning, and Neural Architecture Search technology.

Google’s AutoML solution is not open source. Its pricing is listed on the official site.

H2O

H2O is an open source, distributed, in-memory machine learning platform. It is available in both R and Python, and it provides support for statistical and machine learning algorithms.