Learning Machine Learning 0.1

Machine Learning as seen by a beginner: the key concepts

First question: what is Machine Learning?

Put simply, I would say that Machine Learning is a mix of math, statistics and a bit of AI that, when faced with a question or a problem, finds patterns and provides a prediction. This prediction can be a single value or a set of values, and it comes from analysing a collection of related data that we must gather beforehand and then feed to the ML process.

How to predict?

As I said before, Machine Learning essentially uses math and statistics to predict values. There are lots of algorithms and techniques that we can explore and use for this. Basically, our job as ML engineers is to find the one, or the ones, that best fit the scope of our project.

The chosen algorithms become part of the model, together with our dataset (the set of samples that give the problem its context). The model also defines the flow our data follows during the learning process, and its output should be the prediction we are looking for.

Classifying, Regressing or Clustering?

There are many ways to learn from our dataset, depending on the kind of data and on the problem we have. If our dataset consists of a set of observations and we have a target to predict, we are facing supervised learning, and we can handle it with classification (the target is a label or a class) or regression (the target is a continuous variable).

Classification: Is the cat sleepy or active?

Regression: How long does the cat stay sleepy?
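
To make the difference between the two concrete, here is a minimal sketch in plain Python. Everything in it is an assumption of mine for illustration: the cat observations are invented, and I picked 1-nearest-neighbour for classification and a least-squares line for regression only because they are short to write, not because they are the right choice for a real project.

```python
# Hypothetical observations about a cat:
# (hours since last meal, minutes played today). All numbers are invented.
observations = [(0.5, 5), (1.0, 10), (4.0, 40), (5.0, 60)]

# Classification target: one label per observation.
labels = ["sleepy", "sleepy", "active", "active"]

# Regression target: one continuous value per observation
# (minutes the cat slept afterwards).
sleep_minutes = [120.0, 100.0, 40.0, 20.0]


def squared_distance(a, b):
    """Squared Euclidean distance between two observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_nearest(sample):
    """1-nearest-neighbour: copy the label of the most similar known observation."""
    nearest = min(
        range(len(observations)),
        key=lambda i: squared_distance(observations[i], sample),
    )
    return labels[nearest]


def fit_line(xs, ys):
    """Least-squares fit for a single attribute: y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# For the regression we only use the first attribute, to keep the math short.
hours = [obs[0] for obs in observations]
slope, intercept = fit_line(hours, sleep_minutes)

new_observation = (0.8, 8)
predicted_label = classify_nearest(new_observation)         # a class
predicted_minutes = slope * new_observation[0] + intercept  # a continuous value

print(predicted_label)
print(predicted_minutes)
```

The data going in is the same in both cases. What changes is the kind of target: a label for classification, a number for regression.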

On the other hand, if we don't have a target, we are doing unsupervised learning: we group, or cluster, the data by similarity.

Clustering: How can we group the cats?
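
As a rough illustration of grouping by similarity, here is a small k-means sketch in plain Python. The cat data is invented, the choice of k-means is my own assumption (it is just one of many clustering algorithms), and starting from the first k points is a simplification that real implementations avoid.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=10):
    """Group points into k clusters; note that there is no target anywhere."""
    # Naive start: use the first k points as the initial centroids.
    centroids = list(points[:k])
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the group of its closest centroid.
        groups = [[] for _ in range(k)]
        for point in points:
            closest = min(
                range(k), key=lambda i: squared_distance(point, centroids[i])
            )
            groups[closest].append(point)
        # Update step: move each centroid to the average of its group
        # (an empty group keeps its previous centroid).
        centroids = [
            mean_point(group) if group else centroids[i]
            for i, group in enumerate(groups)
        ]
    # The groups returned are the ones built in the last assignment step.
    return groups, centroids


# Hypothetical cats: (hours playing per day, hours sleeping per day).
cats = [(1.0, 14.0), (8.0, 6.0), (1.5, 15.0), (9.0, 5.0), (0.5, 16.0), (8.5, 7.0)]

groups, centroids = k_means(cats, k=2)
for group, centroid in zip(groups, centroids):
    print(centroid, group)
```

Nobody told the algorithm which cats are "sleepy" or "active". It only noticed that some points sit close together, and naming the groups is left to us.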

Which data?

For each problem, we need to gather a set of data to learn from. This data should be a set of observations, or samples, each made up of one or more attributes related to our problem. After applying all that math and statistics, we'll need a different set of data to test what has been learned. For that reason, we usually split our dataset into a training set (the big one) and a testing set (the small one).
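
The split described above can be sketched in a few lines of plain Python. The function name `split_dataset`, the 80/20 proportion and the fixed seed are my own assumptions for the example; the right proportion depends on how much data we have.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (training set, testing set)."""
    shuffled = list(samples)  # copy, so the original order is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed: same split on every run
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical samples: (attribute, target) pairs.
samples = [(hours, hours * 2.0) for hours in range(10)]

training_set, testing_set = split_dataset(samples)
print(len(training_set), "samples to learn from")
print(len(testing_set), "samples to test with")
```

Shuffling before splitting matters when the data is stored in some order (by date, by label), because otherwise the testing set could end up containing only one kind of sample.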

How to predict… better?

I believe the secret of Machine Learning is finding the most appropriate algorithms for our problem. Accuracy comes with persistence and experience: as we become familiar with the algorithms and refine our models, the predictions become increasingly credible and closer to reality.
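
One simple way to compare candidate algorithms is to train each one on the training set and measure its error on the testing set. The sketch below does that in plain Python with two candidates I chose only for illustration (a "predict the average" baseline and a least-squares line), invented numbers, and mean absolute error as the measure, which is just one of several possible ones.

```python
def fit_mean(xs, ys):
    """Baseline candidate: always predict the average target seen in training."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Second candidate: least-squares line y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_absolute_error(predict, xs, ys):
    """Average distance between the predictions and the real values."""
    return sum(abs(predict(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, already split into a training and a testing set.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
test_x = [7.0, 8.0]
test_y = [14.1, 15.9]

candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}

# Train each candidate on the training set, score it on the testing set.
scores = {
    name: mean_absolute_error(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)

for name, error in scores.items():
    print(name, error)
print("best candidate:", best)
```

The lower the error on data the model has never seen, the more we can trust its predictions. Refining a model is mostly repeating this loop with new ideas.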