Introduction

A few days ago Microsoft announced their new Microsoft R Server 9.0 version. Among a lot of new things, it includes some new and improved machine learning algorithms in their MicrosoftML package.

Fast linear learner, with support for L1 and L2 regularization. Fast boosted decision tree. Fast random forest. Logistic regression, with support for L1 and L2 regularization.

GPU-accelerated Deep Neural Networks (DNNs) with convolutions. Binary classification using a One-Class Support Vector Machine.

And the nice thing is, the MicrosoftML package is now also available in the Microsoft R client version, which you can download and use for free.

Don’t give up on single trees yet….

Despite all the more modern machine learning algorithms, a good old single decision tree can still be useful. Moreover, in a business analytics context they can still keep up in predictive power. In the last few months I have created different predictive response and churn models. I usually just try different learners, logistic regression models, single trees, boosted trees, several neural nets, random forests. In my experience a single decision tree is usually ‘not bad’, often only slightly less predictive power than the more fancy algorithms.

An important thing in analytics is that you can ‘sell‘ your predictive model to the business. A single decision tree is a good way to to do just that, and with an interactive decision tree (created by Microsoft R) this becomes even more easy.

Here is an example: a decision tree to predict the survival of Titanic passengers.

The interactive version of the decision tree can be found on my GitHub.

Cheers, Longhow