
Hello everybody, in the last post we talked about standardization, a crucial task to keep in mind during the preprocessing stage when it is needed. Now we will talk about normalization, which is another crucial step when we are building artificial intelligence models.

Normalization is a technique commonly applied as part of data preparation for machine learning. The idea is to take the values in the numeric columns of a dataset and transform them to a common scale, without distorting the differences between values. We do not always need data normalization when building an AI model; it is only required when the features have very different ranges.
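
As a minimal sketch of this idea, here is what min-max normalization could look like in plain NumPy (the function name and sample values are just illustrative assumptions):

```python
import numpy as np

def min_max_normalize(values):
    """Rescale a numeric column to the [0, 1] range without
    distorting the relative differences between its values."""
    values = np.asarray(values, dtype=float)
    v_min, v_max = values.min(), values.max()
    if v_max == v_min:          # constant column: nothing to rescale
        return np.zeros_like(values)
    return (values - v_min) / (v_max - v_min)

# Two features with very different ranges end up on the same scale
ages = [22, 35, 58, 41]
incomes = [18_000, 52_000, 120_000, 75_000]
print(min_max_normalize(ages))     # approx. [0.   0.36 1.   0.53]
print(min_max_normalize(incomes))  # approx. [0.   0.33 1.   0.56]
```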

There are many examples where we can take advantage of this technique; let’s talk about some of them.

In regression and multivariate analysis, where the relationships between variables are of interest, we can normalize the data to obtain a more linear, more robust relationship.

Frequently, when the relationship between two variables is non-linear, we transform the data to reach a linear relationship. In this context, normalization doesn't mean rescaling the data; it means normalizing the residuals by transforming the data, so the transformation is what makes the residuals behave well.
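
As a hedged illustration of that idea, using purely synthetic data, a logarithmic transformation can turn an exponential relationship into a linear one, so that an ordinary least-squares fit (and its residuals) makes sense:

```python
import numpy as np

# Synthetic data where y grows roughly exponentially with x
x = np.arange(1, 11, dtype=float)
y = 2.0 * np.exp(0.5 * x)

# y vs. x is strongly curved, but log(y) vs. x is linear,
# so fitting a straight line to the transformed data is appropriate
slope, intercept = np.polyfit(x, np.log(y), deg=1)
print(slope, intercept)   # ~0.5 and ~0.69 (= log 2)
```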

Another case is when we are analyzing the frequency of occurrence of the same phenomenon in two populations of different sizes and we want to compare them. Here we normalize both counts, because otherwise we don't know how significant the phenomenon is relative to the total number of cases in each population.
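
A small sketch with made-up numbers shows why raw counts can be misleading and why we normalize them, for example to a rate per 1,000 inhabitants:

```python
# Hypothetical counts of the same phenomenon in two populations of very different size
cases_a, population_a = 900, 50_000
cases_b, population_b = 4_000, 1_000_000

# Raw counts suggest population B is more affected, but the normalized rates say otherwise
rate_a = cases_a / population_a * 1_000   # 18 cases per 1,000 inhabitants
rate_b = cases_b / population_b * 1_000   # 4 cases per 1,000 inhabitants
print(rate_a, rate_b)
```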

In data mining approaches, we need to normalize the inputs; otherwise, the network will be ill-conditioned. In essence, we do normalization so that all the inputs we feed to our model lie in the same range of values. This technique helps guarantee stable convergence of the weights and biases.
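
A minimal sketch of this step, assuming scikit-learn's MinMaxScaler and an invented feature matrix, could look like this:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical inputs: column 0 lives in [0, 1], column 1 in the thousands
X = np.array([[0.2, 1_500.0],
              [0.8, 3_200.0],
              [0.5, 2_700.0]])

# Fit the scaler on the training data, then reuse the same scaler on new data
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)
print(X_scaled)   # every column now lies in [0, 1]
```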

In distance-based classification, for instance, we need to normalize each value of a feature vector so that the distance computation is not dominated by the features with a broader range of possible values.

If one feature has the range [-1, 1] and another has the range [-100, 100], a small variation in the second feature will probably influence the distance between two feature vectors more than a big variation in the first one.
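
The sketch below (with invented points, and scikit-learn's MinMaxScaler assumed) makes this concrete: before normalization the Euclidean distance is driven almost entirely by the [-100, 100] feature; after normalization both features contribute according to their relative variation.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Feature 1 ranges over [-1, 1], feature 2 over [-100, 100]
a = np.array([ 0.9, 10.0])
b = np.array([-0.9, 30.0])

# Without normalization, feature 2 dominates the distance
print(np.linalg.norm(a - b))          # approx. 20.1

# Scale both features to [0, 1] (the range endpoints are included so the scaler sees them)
X = np.array([[-1.0, -100.0], [1.0, 100.0], a, b])
X_scaled = MinMaxScaler().fit_transform(X)
a_s, b_s = X_scaled[2], X_scaled[3]
print(np.linalg.norm(a_s - b_s))      # approx. 0.91, now reflecting feature 1's large relative change
```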

To wrap up, we saw how significant normalization is when dealing with different kinds of data: in data mining, in classification problems, when analyzing phenomena across populations of different sizes, and so on. Learning how to normalize (this post) and how to standardize (the previous post) is a crucial part of learning how to build our artificial intelligence models. We will continue in the next post, so stay connected.



Thanks for reading