In a previous post we discussed Bayesian Parameter Estimation, where we have a latent model parameter we would like to estimate, and we assume a prior distribution over possible model parameters. However, sometimes it is difficult or unnatural to assign a prior distribution to the latent variable of interest. Thus, a lot of theory has been developed for parameter estimation where the latent model parameter is not a random variable, but rather a deterministic, unknown quantity.

Let the hidden parameter of interest be $\theta$, and suppose we have received observations $x$ about $\theta$. We are interested in creating an estimator $\hat{\theta}(x)$ that minimizes the mean squared error:

$$\mathbb{E}\left[\left(\hat{\theta}(x) - \theta\right)^2\right].$$

Since we have absolutely no knowledge about the true value of $\theta$, we would like our estimator not to depend on the parameter being estimated. We call such an estimator valid.

Finding a valid estimator that minimizes mean-square error is not always possible, so we often put restrictions on the class of candidate estimators. We will restrict our attention to estimators that are unbiased (recall this means that $\mathbb{E}[\hat{\theta}(x)] = \theta$ for all $\theta$). It turns out that for unbiased estimators, minimizing mean-square error is equivalent to minimizing the variance of the estimator.

Lemma 1: For an unbiased estimator $\hat{\theta}$, the mean-square error is equal to the variance of $\hat{\theta}$.

Proof of Lemma 1: Consider, for simplicity, the case where $\theta$ is a scalar. The mean-square error is $\mathbb{E}\left[(\hat{\theta}(x) - \theta)^2\right]$. Since the expectation is taken only over $x$ and $\theta$ is deterministic, unbiasedness gives $\mathbb{E}[\hat{\theta}(x)] = \theta$, so the mean-square error equals $\mathbb{E}\left[\left(\hat{\theta}(x) - \mathbb{E}[\hat{\theta}(x)]\right)^2\right] = \mathrm{Var}(\hat{\theta})$.
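
To make Lemma 1 concrete, here is a small Monte Carlo sketch (my own illustration, assuming i.i.d. Gaussian observations and the sample mean as the unbiased estimator): the empirical mean-square error and the empirical variance of the estimator should agree up to simulation noise.

```python
import numpy as np

# Sketch: check that MSE == Var for an unbiased estimator (the sample mean
# of i.i.d. Gaussian observations with true mean theta).
rng = np.random.default_rng(0)
theta, sigma, n, trials = 2.0, 1.5, 20, 200_000

x = rng.normal(loc=theta, scale=sigma, size=(trials, n))
theta_hat = x.mean(axis=1)                  # unbiased estimator of theta

mse = np.mean((theta_hat - theta) ** 2)     # mean-square error
var = np.var(theta_hat)                     # variance of the estimator

print(f"MSE = {mse:.5f}, Var = {var:.5f}")  # both approx sigma^2 / n = 0.1125
```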

Thus, we focus our attention on estimators that are both valid (do not depend on the parameter we are trying to estimate) and unbiased; we call such estimators admissible. Since in this case minimizing the mean-square error is equivalent to minimizing the variance of the estimator, we are now looking for minimum-variance unbiased (MVU) estimators.

The MVU estimator does not always exist, and even when it does, it may be hard to find. There is no general recipe for finding MVU estimators; however, there exist bounds on the performance of MVU estimators. The bound that we will now discuss is known as the Cramér-Rao bound. It gives a lower bound on the variance of any admissible (unbiased and valid) estimator. We present the scalar version of the bound.

Theorem 1 (Cramér-Rao bound): For any admissible estimator $\hat{\theta}(x)$, provided that the likelihood $p(x; \theta)$ satisfies the regularity condition

$$\mathbb{E}\left[\frac{\partial \ln p(x; \theta)}{\partial \theta}\right] = 0 \quad \text{for all } \theta,$$

we have the lower bound

$$\mathrm{Var}(\hat{\theta}) \geq \frac{1}{I(\theta)},$$

where $I(\theta)$ is the Fisher information in $x$ about $\theta$:

$$I(\theta) = \mathbb{E}\left[\left(\frac{\partial \ln p(x; \theta)}{\partial \theta}\right)^2\right] = -\,\mathbb{E}\left[\frac{\partial^2 \ln p(x; \theta)}{\partial \theta^2}\right].$$
Proof of Theorem 1 (sketch): Consider the correlation between the estimation error $\hat{\theta}(x) - \theta$ and the score $\frac{\partial \ln p(x; \theta)}{\partial \theta}$, which we know must have magnitude at most 1. Under the regularity condition, the covariance between the two is 1 (by unbiasedness), while their variances are $\mathrm{Var}(\hat{\theta})$ and $I(\theta)$. We can therefore see that $\frac{1}{\mathrm{Var}(\hat{\theta})\, I(\theta)} \leq 1$, which simplifies to the Cramér-Rao bound.
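
As a concrete example, suppose the observations $x = (x_1, \ldots, x_n)$ are i.i.d. $\mathcal{N}(\theta, \sigma^2)$ with $\sigma^2$ known. Then

$$\ln p(x; \theta) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \theta)^2, \qquad \frac{\partial \ln p(x; \theta)}{\partial \theta} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \theta),$$

so the Fisher information is $I(\theta) = \mathbb{E}\left[\left(\frac{\partial \ln p(x;\theta)}{\partial \theta}\right)^2\right] = \frac{n}{\sigma^2}$, and the Cramér-Rao bound says every admissible estimator has variance at least $\sigma^2 / n$. The sample mean $\bar{x}$ is unbiased with variance exactly $\sigma^2 / n$, so it attains the bound and is the MVU estimator for this problem.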

We note that the Fisher information cannot be computed for all problems, and the derivation also relies on the regularity condition on $p(x; \theta)$. Also, although any estimator that satisfies the Cramér-Rao bound with equality is an MVU estimator, it is possible that no estimator meets the Cramér-Rao bound for all $\theta$, or even for any $\theta$.

We call unbiased estimators that satisfy the Cramér-Rao bound with equality efficient. It can be shown that an estimator is efficient if and only if $\frac{\partial \ln p(x; \theta)}{\partial \theta} = I(\theta)\left(\hat{\theta}(x) - \theta\right)$. Thus, if an efficient estimator exists, it is in fact unique. Furthermore, we can show the following:

Claim 1: If an efficient estimator exists, it is the maximum likelihood estimator: that is, $\hat{\theta}_{\mathrm{eff}}(x) = \hat{\theta}_{\mathrm{ML}}(x) = \arg\max_{\theta} p(x; \theta)$.
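
For instance, in the Gaussian example above the score is $\frac{\partial \ln p(x; \theta)}{\partial \theta} = \frac{n}{\sigma^2}\left(\bar{x} - \theta\right) = I(\theta)\left(\bar{x} - \theta\right)$, which has exactly the form required for efficiency with $\hat{\theta}(x) = \bar{x}$; setting the score to zero gives $\hat{\theta}_{\mathrm{ML}}(x) = \bar{x}$ as well, consistent with Claim 1.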

Thus, if an efficient estimator exists, it is the ML estimator; if an efficient estimator does not exist, however, the ML estimator need not have any special properties.
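
The following sketch checks this numerically under the same Gaussian assumptions as before: the ML estimate found by a grid search over the log-likelihood coincides with the sample mean, and its Monte Carlo variance is close to the Cramér-Rao bound $\sigma^2 / n$.

```python
import numpy as np

# Sketch: for i.i.d. Gaussian observations with known variance, the ML
# estimate (found by grid search over the log-likelihood) matches the sample
# mean, and its variance is close to the Cramer-Rao bound sigma^2 / n.
rng = np.random.default_rng(1)
theta, sigma, n, trials = 2.0, 1.5, 20, 5_000

grid = np.linspace(theta - 3.0, theta + 3.0, 2001)   # candidate values of theta
ml_estimates = np.empty(trials)

for t in range(trials):
    x = rng.normal(loc=theta, scale=sigma, size=n)
    # Log-likelihood over the grid (additive constants dropped).
    loglik = -0.5 / sigma**2 * ((x[:, None] - grid[None, :]) ** 2).sum(axis=0)
    ml_estimates[t] = grid[np.argmax(loglik)]         # approx equal to x.mean()

print(f"Var(ML estimate)  = {ml_estimates.var():.5f}")
print(f"Cramer-Rao bound  = {sigma**2 / n:.5f}")      # = 0.1125
```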