May 14, 2019 by Piotr Płoński Automl

Automated Machine Learning is the end-to-end process of applying machine learning in an automatic way.

The full autoML pipeline usually consists of:

data pre-processing,

feature engineering,

feature extraction,

feature selection,

model training,

algorithm selection,

hyperparameter optimization

The steps outlined above can be very time-consuming. There are many ML algorithms that can be applied at each step of the analysis. The difficulty in constructing an ML pipeline manually lies in the differences between the data formats, interfaces, and computational intensity of ML algorithms. Automated Machine Learning solutions aim to solve this problem by automatically checking different combinations of ML algorithms. The search itself is controlled by a statistical or machine learning algorithm.
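To make the idea of automatically checking combinations concrete, here is a minimal sketch in plain Python. It is not how any of the tools listed below is implemented. The toy data, the two preprocessing steps, and the three models are all assumptions made up for illustration. The sketch tries every preprocessing/model pair and keeps the one with the lowest leave-one-out error.

```python
import itertools
import statistics

# Toy 1-D regression data (made up for illustration): y is roughly 2*x + 1.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 17.2]

# Candidate preprocessing steps: each is fitted on training inputs
# and returns a function that transforms a single value.
def identity(train_x):
    return lambda v: v

def standardize(train_x):
    mean = statistics.mean(train_x)
    std = statistics.pstdev(train_x) or 1.0
    return lambda v: (v - mean) / std

# Candidate models: each is fitted on (xs, ys) and returns a predict function.
def mean_model(xs, ys):
    average = statistics.mean(ys)
    return lambda v: average

def linear_model(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    return lambda v: my + slope * (v - mx)

def nearest_neighbour_model(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda v: min(pairs, key=lambda p: abs(p[0] - v))[1]

PREPROCESSORS = [identity, standardize]
MODELS = [mean_model, linear_model, nearest_neighbour_model]

def loo_mse(preprocess, model):
    """Leave-one-out mean squared error of one preprocessing/model pair."""
    errors = []
    for i in range(len(X)):
        train_x = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        transform = preprocess(train_x)
        predict = model([transform(v) for v in train_x], train_y)
        errors.append((predict(transform(X[i])) - y[i]) ** 2)
    return statistics.mean(errors)

# Check every combination and keep the best-scoring one.
results = {
    (p.__name__, m.__name__): loo_mse(p, m)
    for p, m in itertools.product(PREPROCESSORS, MODELS)
}
best = min(results, key=results.get)
print("best pipeline:", best, "LOO MSE:", round(results[best], 4))
```

Real AutoML tools replace this exhaustive loop with smarter search strategies, because the number of combinations grows quickly once hyperparameters are included.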

Open source AutoML (in alphabetical order)

Auto-Keras

Auto-Keras provides automated architecture and hyperparameters search for deep learning models.

Main authors: developed by DATA Lab at Texas A&M University

ML task: image classification

Usage: Python package

Language: Python

Code on github: https://github.com/keras-team/autokeras

auto-sklearn

Auto-sklearn is an automated drop-in replacement for a scikit-learn estimator. It uses Bayesian optimization, meta-learning and ensemble construction. The package was presented at NIPS 2015.

Main authors: M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum, F. Hutter

ML tasks: binary classification, multiclass classification, regression

Usage: Python Package

Language: Python

Code on github: https://github.com/automl/auto-sklearn

automl-gs

Automl-gs generates raw Python code using Jinja templates and trains a model using the generated code. It provides data preprocessing and hyperparameter tuning. It uses neural networks (TensorFlow) and XGBoost.

Main author: Max Woolf

ML tasks: binary classification, multiclass classification, regression

Usage: Python package, command line

Language: Python

Code on github: https://github.com/minimaxir/automl-gs

Auto-Weka

Automated Machine Learning with WEKA

Main authors: Lars Kotthoff, Chris Thornton, Frank Hutter, Holger Hoos, and Kevin Leyton-Brown.

ML tasks: binary classification, multiclass classification, regression

Usage: User Interface

Language: Java

Code on github: https://github.com/automl/autoweka

Featuretools

Featuretools uses Deep Feature Synthesis to perform automated feature engineering on relational and temporal data.

Main authors: James Max Kanter and Kalyan Veeramachaneni

ML task: feature engineering

Usage: Python package

Language: Python

Code on github: https://github.com/Featuretools/featuretools
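The core idea behind feature engineering on relational data can be sketched in a few lines of plain Python: values from a child table are aggregated onto each row of a parent table. This is only an illustration of the general idea, not the Featuretools API or its Deep Feature Synthesis algorithm. The tables, column names, and the `aggregate_features` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent and child tables, linked by customer_id.
customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 3}]
transactions = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]

# Aggregation primitives applied to each group of child values.
AGGREGATIONS = {"SUM": sum, "MEAN": mean, "MAX": max, "COUNT": len}

def aggregate_features(parents, children, key, column, child_name):
    """Build one new feature per aggregation for every parent row."""
    grouped = defaultdict(list)
    for child in children:
        grouped[child[key]].append(child[column])

    feature_rows = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        row = dict(parent)
        for name, func in AGGREGATIONS.items():
            # Parents without child rows get a count of 0 and None elsewhere.
            if values or name == "COUNT":
                row[f"{name}({child_name}.{column})"] = func(values)
            else:
                row[f"{name}({child_name}.{column})"] = None
        feature_rows.append(row)
    return feature_rows

features = aggregate_features(
    customers, transactions, key="customer_id",
    column="amount", child_name="transactions",
)
for row in features:
    print(row)
```

Each customer row now carries summary features of its transactions, which a model can use directly as input columns.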

h2o automl

H2O AutoML provides automated feature preprocessing, machine learning model tuning and training.

Main authors: H2O.ai

ML tasks: binary classification, multiclass classification, regression

Usage: Python or R package

Language: Java

Code on github: https://github.com/h2oai/h2o-3

Ludwig

Ludwig is a toolbox built on top of TensorFlow that lets you train and test deep learning models without writing code.

Main authors: Uber

ML tasks: all (tabular data classification, regression, image recognition, NLP, time series)

Usage: Python package, command line scripts

Language: Python

Code on github: https://github.com/uber/ludwig

mljar-supervised

mljar-supervised provides automated feature preprocessing, machine learning model tuning and training.

Main authors: Piotr Płoński (MLJAR, Inc.)

ML tasks: binary classification (multiclass classification, regression, anomaly detection, and time series are work in progress)

Usage: Python package

Language: Python

Code on github: https://github.com/mljar/mljar-supervised

Neural Network Intelligence (NNI)

AutoML toolkit for neural architecture search and hyperparameter tuning. It helps you train neural networks locally or remotely.

Main authors: Microsoft Research (MSR)

ML tasks: all (you need to define the NN architecture and NNI will help you to tune it and train locally or in the cloud)

Usage: Python package, command line

Language: Python

Code on github: https://github.com/microsoft/nni

tpot

AutoML tool that optimizes machine learning pipelines using genetic programming

Main authors: Randal S. Olson, Ryan J. Urbanowicz, Peter C. Andrews, Nicole A. Lavender, La Creis Kidd, and Jason H. Moore

ML tasks: binary classification, multiclass classification and regression

Usage: Python package

Language: Python

Code on github: https://github.com/EpistasisLab/tpot

TransmogrifAI

AutoML library for building machine learning workflows on Apache Spark

Main authors: Salesforce

ML tasks: binary classification, multiclass classification, and regression

Usage: Scala, Java packages

Language: Scala

Code on github: https://github.com/salesforce/TransmogrifAI

Auto_ml (unmaintained)

It is worth mentioning the auto_ml Python package created by Preston Parry (https://github.com/ClimbsRocks/auto_ml), which is no longer maintained.

Proprietary AutoML available in the cloud or on-premise (alphabetical order)

Below is the list of AutoML services available in the cloud or on-premise. Services listed here offer very similar functionality:

the user provides the input data set, usually as a flat file,

the user selects the target column to be predicted and the input features,

the user selects a time limit for AutoML training,

AutoML checks many possible data pipelines, training and tuning each of them,

in the end, AutoML selects the best performing algorithm (according to selected metric and validation),

the best model can be deployed in the cloud and accessed with REST API or can be used for batch predictions in the service.
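The workflow above can be sketched as a small, time-limited search in plain Python. This is an illustration under stated assumptions, not the implementation of any particular service. The in-memory CSV data, the `run_automl` function, its parameters, and the choice of a k-nearest-neighbours model with random search are all hypothetical. Deployment behind a REST API is left out.

```python
import csv
import io
import random
import time

# A tiny in-memory "flat file" (made-up data): label is 1 when x1 + x2 > 10.
CSV_TEXT = """x1,x2,label
1,2,0
2,1,0
3,3,0
4,2,0
2,5,0
5,4,0
6,6,1
7,5,1
8,4,1
5,8,1
9,3,1
7,7,1
"""

def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k training rows closest to point."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], point)),
    )
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0

def loo_accuracy(X, y, k):
    """Validation metric: leave-one-out accuracy."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)

def run_automl(csv_text, target, features, time_limit_s, seed=0, max_trials=200):
    """Randomly try configurations until the time limit or trial cap is hit."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    y = [int(r[target]) for r in rows]
    rng = random.Random(seed)
    deadline = time.monotonic() + time_limit_s
    best_config, best_score, trials = None, -1.0, 0
    # Always run at least one trial, even with a very small time limit.
    while trials < max_trials and (trials == 0 or time.monotonic() < deadline):
        k = rng.choice([1, 3, 5])
        used = sorted(rng.sample(features, rng.randint(1, len(features))))
        X = [[float(r[f]) for f in used] for r in rows]
        score = loo_accuracy(X, y, k)
        trials += 1
        if score > best_score:
            best_config, best_score = {"k": k, "features": used}, score
    return best_config, best_score, trials

best_config, best_score, trials = run_automl(
    CSV_TEXT, target="label", features=["x1", "x2"], time_limit_s=0.2
)
print("best config:", best_config, "accuracy:", best_score, "trials:", trials)
```

The time limit bounds how long the search runs, and the metric and validation scheme decide which candidate counts as the best.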

Proprietary AutoML providers:

In most cloud AutoML services, the user is tied to the provider: there is no option to download the model and use it locally. In the MLJAR service you can download models and use them locally. (If you are aware of other providers that let users download a model and use it as they wish, please let me know in the comments and I will update the post.)

If you find that some software or service is missing from the list, please let me know in the comments!
