
I have been reading about regression models for missing-data imputation, and I'm confused about the following: if I can perfectly predict the value of feature f2 from feature f1, why would I keep f2 at all? If both features were real-valued, would this mean that they are highly correlated, even if in a non-linear fashion? As far as I understand, this class of imputation methods predicts one feature from a set of other features.

EDIT 1:

To give some technical/theoretical background, here is Section 3.2.1 of the book "Flexible Imputation of Missing Data":

For univariate $Y$ we write lowercase $y$ for $Y$. Any predictors in the imputation model are collected in $X$. Symbol $X_{obs}$ indicates the subset of $n_1$ rows of $X$ for which $y$ is observed, and $X_{mis}$ is the complementing subset of $n_0$ rows of $X$ for which $y$ is missing. The vector containing the $n_1$ observed data in $y$ is denoted by $y_{obs}$, and the vector of $n_0$ imputed values in $y$ is indicated by $\dot{y}$. This section reviews four different ways of creating imputations under the normal linear model. The four methods are:
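To make the notation concrete, here is a minimal sketch (not from the book) of the simplest of these approaches: fit a least-squares regression on the rows where $y$ is observed ($X_{obs}$, $y_{obs}$), then impute the missing entries with the fitted mean at $X_{mis}$. All variable names and the toy data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: f1 is fully observed; y (think "f2") depends on f1
# but is missing in a few rows.
n = 10
f1 = rng.normal(size=n)
y = 2.0 * f1 + 1.0 + rng.normal(scale=0.1, size=n)
missing = np.zeros(n, dtype=bool)
missing[[2, 5, 7]] = True

# Design matrix with an intercept; split into the book's
# X_obs / y_obs (observed rows) and X_mis (rows to impute).
X = np.column_stack([np.ones(n), f1])
X_obs, y_obs = X[~missing], y[~missing]
X_mis = X[missing]

# Least-squares fit on the observed rows, then impute the
# missing y values with the predicted (conditional) mean.
beta, *_ = np.linalg.lstsq(X_obs, y_obs, rcond=None)
y_imputed = X_mis @ beta

print(y_imputed)  # one imputed value per missing row
```

Note that this "regression prediction" variant imputes deterministic values on the fitted line; the other methods the book goes on to list differ mainly in how they add noise and parameter uncertainty to these predictions.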