Weight as a Constant

Having established that there is no parameter to learn, we can make a simple modification to the data. The new data is given in the next table. Did you catch the modification? It’s quite simple: each output Y is no longer equal to the input X; it is now double the input, 2X. We can still use the previous function (Y=X) to predict the outputs and calculate the total error.

The details of the error calculations are given in the next table. In this case the total error is not 0, as it was in the previous example, but 14. A nonzero error means that the model function does not map the inputs to the outputs correctly.
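The calculation can be sketched in a few lines of Python. The table itself is not reproduced here, so both the input values and the squared-error measure below are assumptions, chosen only because they reproduce the total of 14 stated above. The names `predict` and `total_error` are hypothetical helpers, not part of any library.

```python
# Assumed inputs; the outputs are double the inputs, as the text describes.
inputs = [0, 1, 2, 3]
targets = [2 * x for x in inputs]


def predict(x):
    """The original model Y = X: the prediction is the input itself."""
    return x


def total_error(xs, ys, model):
    """Sum of squared differences between desired and predicted outputs.

    Squared error is an assumption; the text only gives the total.
    """
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


error = total_error(inputs, targets, predict)
print(error)  # 14
```

With the original data, where each output equals its input, the same call would return 0.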

To reduce the error, we have to modify that function. The question is, what can we change within the function (Y=X) to reduce the prediction error? The function has just 2 variables, X and Y. One represents the input and the other the output, and we cannot change either of them. In conclusion, the function is non-parametric, so there’s nothing in it we can change to reduce the error.

But all is not lost! If the function currently doesn’t have a parameter, why not add one or more parameters? Don’t hesitate to design your machine learning model in ways that reduce the error. If you find that adding something to the function fixes the problem, start adding it at once.

In the new data, the output Y is double the input X. But the function hasn’t been changed to reflect this, and we still use Y=X. We can change the function so that the output Y equals 2X rather than X. The new function is Y=2X. With this function, the total prediction error is calculated according to the next table. The total error is now 0 again. NICE.

After adding 2 to the function, does our model become parametric? NO. A parametric model learns the values of its parameters from the data. Here, the value 2 is a constant we wrote into the function by hand; it is not learned from the data, so the model is still non-parametric.

Let’s change the previous data according to the next table.

Because there is no learning step, we can go straight to the testing step: compute the predicted outputs with the last function (Y=2X) and then calculate the prediction error. The total error is calculated according to the next table. It is no longer 0; it is now 14. Why did that happen?

The model used for solving this problem was created when the output Y was double the input (2X). Now the output Y is no longer equal to 2X but to 3X, so an increase in the error is expected. To eliminate this error, we have to modify the model function to use 3 rather than 2. The new function is Y=3X.

The new function Y=3X makes the total error for the new data 0. But when applied to the previous data, in which Y is just double X, it produces an error. So for the new data we have to multiply X by 3 to get a total error of 0, while for the previous data we had to multiply it by 2.
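To see the mismatch side by side, here is a minimal sketch that evaluates both hard-coded functions on both datasets. The input values and the squared-error measure are assumptions (the tables are not reproduced here), and the helper names are hypothetical.

```python
# Assumed inputs, with outputs built as 2X and 3X as the text describes.
inputs = [0, 1, 2, 3]
datasets = {
    "doubled": [2 * x for x in inputs],
    "tripled": [3 * x for x in inputs],
}


def model_2x(x):
    """Model with the constant 2 written in by hand."""
    return 2 * x


def model_3x(x):
    """Model with the constant 3 written in by hand."""
    return 3 * x


def total_error(xs, ys, model):
    """Sum of squared differences (an assumed error measure)."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


models = {"Y=2X": model_2x, "Y=3X": model_3x}
results = {
    (model_name, data_name): total_error(inputs, ys, model)
    for model_name, model in models.items()
    for data_name, ys in datasets.items()
}

for (model_name, data_name), err in results.items():
    print(f"{model_name} on {data_name} data: total error = {err}")
```

Under these assumptions each function scores 0 only on the dataset it was written for. On the other dataset the prediction is off by exactly X for every sample, which is the same gap as between Y=X and the doubled data earlier.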

It seems that we have to change the model ourselves each time the data changes. That’s tiresome. But there is a solution: we can avoid using constants in the function and replace them with variables. This is algebra, the field of using variables rather than constants.
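As a rough illustration of that idea, the constant can become an argument of the function. The name `w` and the helper names are hypothetical, the data and squared-error measure are assumed as before, and the value of `w` is still supplied by hand here rather than learned.

```python
def predict(x, w):
    """Model Y = w * X, with the multiplier passed in as a variable."""
    return w * x


def total_error(xs, ys, w):
    """Sum of squared differences for a given multiplier w (assumed measure)."""
    return sum((y - predict(x, w)) ** 2 for x, y in zip(xs, ys))


# Assumed inputs; outputs built as 2X and 3X as in the text.
inputs = [0, 1, 2, 3]
doubled = [2 * x for x in inputs]
tripled = [3 * x for x in inputs]

# One function now covers both datasets; only the value of w changes.
error_doubled = total_error(inputs, doubled, w=2)
error_tripled = total_error(inputs, tripled, w=3)
print(error_doubled, error_tripled)  # 0 0
```

The function body no longer needs editing when the data changes; only the value passed for `w` does.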