Announcing ML.NET 0.7 (Machine Learning .NET)

Cesar

November 8th, 2018

We’re excited to announce the release of ML.NET 0.7, the latest version of the cross-platform, open source machine learning framework for .NET developers (ML.NET 0.1 was released at //Build 2018). This release focuses on better support for recommendation-based ML tasks, enabling anomaly detection, enhancing the customizability of machine learning pipelines, enabling the use of ML.NET in x86 apps, and more.

This blog post provides details about the following topics in the ML.NET 0.7 release, each covered in its own section below.

Enhanced support for recommendation tasks with Matrix Factorization

Recommender systems produce a list of recommendations for products in a catalog, songs, movies, and more. We have improved support for creating recommender systems in ML.NET by adding Matrix Factorization (MF), a common approach to recommendations when you have data on how users rated items in your catalog. For example, you might know how users rated some movies and want to recommend which other movies they are likely to watch next.

We added MF to ML.NET because it is often significantly faster than Field-Aware Factorization Machines (FFM, which we added in ML.NET 0.3) and it supports continuous numeric ratings (e.g. 1-5 stars) instead of only boolean values (“liked” or “didn’t like”). Even though we just added MF, you might still want to use FFM if you want to take advantage of other information beyond the rating a user assigns to an item (e.g. movie genre, movie release date, user profile). A more in-depth discussion of the differences can be found here.

A sample app using Matrix Factorization based on users’ ratings can be found here (Movie Recommendation Engine). This sample shows how to use ML.NET to build a movie recommendation engine when you have data such as UserId, ProductId and Ratings describing what users bought and rated.

ML.NET’s MF uses LIBMF.
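To make the idea behind matrix factorization concrete, here is a minimal sketch in plain C#. It is not the ML.NET API and not LIBMF's solver: it is a toy stochastic gradient descent loop, and every name in it (TinyMatrixFactorization, Train, Dot, the parameters and their defaults) is hypothetical and chosen only for illustration. It learns one small vector per user and one per item so that their dot product approximates the known ratings.

```csharp
using System;

// Illustrative sketch only: NOT ML.NET's API and NOT LIBMF's solver.
// All type, method and parameter names here are hypothetical.
public static class TinyMatrixFactorization
{
    // Learns latent vectors from (user, item, rating) triples so that
    // Dot(users[u], items[i]) approximates the rating user u gave item i.
    public static (double[][] Users, double[][] Items) Train(
        (int User, int Item, double Rating)[] ratings,
        int userCount, int itemCount, int rank = 2, int epochs = 2000,
        double learningRate = 0.01, double regularization = 0.02)
    {
        var rng = new Random(42); // fixed seed so runs are repeatable
        double[][] users = RandomFactors(userCount, rank, rng);
        double[][] items = RandomFactors(itemCount, rank, rng);

        for (int epoch = 0; epoch < epochs; epoch++)
        {
            foreach (var (u, i, r) in ratings)
            {
                // Prediction error for this single known rating.
                double error = r - Dot(users[u], items[i]);
                for (int k = 0; k < rank; k++)
                {
                    double pu = users[u][k];
                    double qi = items[i][k];
                    // Gradient step on squared error with L2 regularization.
                    users[u][k] += learningRate * (error * qi - regularization * pu);
                    items[i][k] += learningRate * (error * pu - regularization * qi);
                }
            }
        }
        return (users, items);
    }

    // Predicted rating is the dot product of a user vector and an item vector.
    public static double Dot(double[] a, double[] b)
    {
        double sum = 0;
        for (int k = 0; k < a.Length; k++) sum += a[k] * b[k];
        return sum;
    }

    private static double[][] RandomFactors(int count, int rank, Random rng)
    {
        var factors = new double[count][];
        for (int n = 0; n < count; n++)
        {
            factors[n] = new double[rank];
            for (int k = 0; k < rank; k++) factors[n][k] = 0.1 + 0.5 * rng.NextDouble();
        }
        return factors;
    }
}
```

After training, Dot(users[u], items[i]) gives a predicted rating for any user and item pair, including pairs that were never rated, which is what lets a recommender rank unseen items. A production solver such as LIBMF is far more elaborate than this loop.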

Enabled anomaly detection scenarios – detecting unusual events

Anomaly detection enables identifying unusual values or events. It is used in scenarios such as fraud detection (identifying suspicious credit card transactions) and server monitoring (identifying unusual activity).

ML.NET 0.7 enables detecting two types of anomalous behavior:

Spike detection: spikes are attributed to sudden yet temporary bursts in values of the input data. These could be outliers due to outages, cyber-attacks, viral web content, etc.

Change point detection: change points mark the beginning of more persistent deviations in the behavior of the data. For example, if a product’s sales are relatively consistent and then the product becomes more popular (monthly sales double), there is a change point when the trend changes.

These anomalies can be detected on two types of data using different ML.NET components:

IidSpikeDetector and IidChangePointDetector are used on data assumed to be from one stationary distribution (each data point is independent of previous data, such as the number of retweets of each tweet).

SsaSpikeDetector and SsaChangePointDetector are used on data that has seasonality or trend components (perhaps ordered by time, such as product sales).

Sample code using anomaly detection with ML.NET can be found here.
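As a rough illustration of what "spike" means for independent data, the sketch below flags a point when it lies many standard deviations away from the points immediately before it. This is a simple z-score rule, not the algorithm behind IidSpikeDetector, and the names SimpleSpikeDetector, FindSpikes, window and threshold are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: a plain z-score rule, NOT the ML.NET detectors.
// All names and default values here are hypothetical.
public static class SimpleSpikeDetector
{
    // Returns the indices of values that deviate from the mean of the
    // preceding `window` values by more than `threshold` standard deviations.
    public static List<int> FindSpikes(
        IReadOnlyList<double> values, int window = 5, double threshold = 3.0)
    {
        var spikes = new List<int>();
        for (int i = window; i < values.Count; i++)
        {
            // Mean of the trailing window (excluding the current point).
            double mean = 0;
            for (int j = i - window; j < i; j++) mean += values[j];
            mean /= window;

            // Population variance of the same window.
            double variance = 0;
            for (int j = i - window; j < i; j++)
                variance += (values[j] - mean) * (values[j] - mean);
            variance /= window;

            // Floor the spread so a perfectly flat window does not divide by zero.
            double spread = Math.Max(Math.Sqrt(variance), 1e-9);
            double zScore = Math.Abs(values[i] - mean) / spread;
            if (zScore > threshold) spikes.Add(i);
        }
        return spikes;
    }
}
```

Calling FindSpikes on a mostly flat series with one sudden burst returns the index of the burst. A persistent shift in level (a change point) would only be flagged at its first point by this rule, which is one reason change point detection is treated as a separate task.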

Improved customizability of ML.NET pipelines

ML.NET offers a variety of data transformations (e.g. processing text, images, categorical features, etc.). However, some use cases require application-specific transformations, such as calculating cosine similarity between two text columns. We have now added support for custom transforms so you can easily include custom business logic.

The CustomMappingEstimator allows you to write your own methods to process data and bring them into the ML.NET pipeline. Here is what it would look like in the pipeline:

var estimator = mlContext.Transforms.CustomMapping<MyInput, MyOutput>(MyLambda.MyAction, "MyLambda")
    .Append(...)
    .Append(...)

Below is the definition of what this custom mapping will do. In this example, we convert the text label (“spam” or “ham”) to a boolean label (true or false).

// Input row: the label as text ("spam" or "ham").
public class MyInput
{
    public string Label { get; set; }
}

// Output row: the label as a boolean.
public class MyOutput
{
    public bool Label { get; set; }
}

public class MyLambda
{
    // Exported under the same contract name that is passed to CustomMapping.
    [Export("MyLambda")]
    public ITransformer MyTransformer => ML.Transforms.CustomMappingTransformer<MyInput, MyOutput>(MyAction, "MyLambda");

    [Import]
    public MLContext ML { get; set; }

    // The custom logic: "spam" maps to true, anything else to false.
    public static void MyAction(MyInput input, MyOutput output)
    {
        output.Label = input.Label == "spam";
    }
}

A more complete example of the CustomMappingEstimator can be found here.

x86 support in addition to x64

With this release of ML.NET you can now train and use machine learning models on x86 / 32-bit architecture devices (Windows only, for now). Previously, ML.NET was limited to x64 devices (Windows, Linux and Mac). Note that some components that are based on external dependencies (e.g. TensorFlow) are not available in x86-Windows.

NimbusML – experimental Python bindings for ML.NET

NimbusML provides experimental Python bindings for ML.NET. We have seen feedback from the external community and internal teams regarding the use of multiple programming languages. We wanted to enable as many people as possible to benefit from ML.NET and help teams to work together more easily. NimbusML not only enables data scientists to train and use machine learning models in Python (with components that can also be used in scikit-learn pipelines), but it also enables saving models which can be easily used in .NET applications through ML.NET (see here for more details).

In case you missed it: provide your feedback on the new API

ML.NET 0.6 introduced a new set of APIs for ML.NET that provide enhanced flexibility. These APIs in 0.7 and upcoming versions are still evolving and we would love to get your feedback so you can help shape the long-term API for ML.NET.

Want to get involved? Start by providing feedback through issues at the ML.NET GitHub repo!

Additional resources

The most important ML.NET concepts for understanding the new API are introduced here.

A cookbook (How to guides) that shows how to use these APIs for a variety of existing and new scenarios can be found here.

An ML.NET API Reference with all the documented APIs can be found here.

Get started!

If you haven’t already, get started with ML.NET here. Next, explore some other great resources:

Tutorials and resources at the Microsoft Docs ML.NET Guide

Code samples at the machinelearning-samples GitHub repo

We look forward to your feedback and welcome you to file issues with any suggestions or enhancements in the ML.NET GitHub repo.

This blog was authored by Gal Oshri and Cesar de la Torre

Thanks,

The ML.NET Team