Is it correct to say that the mean is a good estimator whenever the variance is finite?
Well, yes, in the sense that the law of large numbers applies, i.e.
$$\lim_{n\to\infty}\Pr\{|\bar{x}-E[X]|<\varepsilon\}=1\quad\forall\,\varepsilon>0$$
The condition for that to hold is actually weaker than finite variance. If the $x_i$ are not only drawn from the same distribution but are also independent, then the existence of a finite $E[X]$ is necessary and sufficient for the sample mean to converge in probability to $E[X]$ as $n\to\infty$, if I understand the theorem correctly (I can't prove that yet, though; the proof with a finite variance is easy). If the $x_i$ aren't independent, the conditions required are still weaker than finite variance, but they're cumbersome and impractical, so finite variance is fine, I guess.
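To see the finite-mean-but-infinite-variance case in action, here's a minimal sketch (the Pareto distribution is just my illustrative choice, not anything from the question): with shape $\alpha=1.5$ it has mean $\alpha/(\alpha-1)=3$ but infinite variance, and the sample mean still drifts toward 3, just slowly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pareto with shape alpha = 1.5: finite mean alpha/(alpha-1) = 3, infinite variance.
# The LLN still applies, but without a finite variance the convergence is slow.
alpha = 1.5
true_mean = alpha / (alpha - 1)

for n in (10**3, 10**5, 10**7):
    # numpy's pareto() is the Lomax form; adding 1 gives the classical Pareto on [1, inf).
    x = rng.pareto(alpha, size=n) + 1
    print(f"n = {n:>8}: |sample mean - {true_mean}| = {abs(x.mean() - true_mean):.4f}")
```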
But that alone isn't always enough to justify using the sample mean as an estimator in practice. As foodforthought says, for a normal distribution it is simultaneously the lowest-MSE estimator, the maximum likelihood estimator, and an unbiased estimator, but that isn't true for other distributions.
A quick example: suppose we want to determine the parameter $p$ of a Bernoulli random variable, i.e. "a coin". The prior distribution over $p$ is uniform; we flip the coin $n=10$ times and use the sample success rate $k/n$ (where $k$ is the number of successes), i.e. the mean, i.e. the maximum likelihood estimate. In simulation the mean squared error $E[(k/n-p)^2]$ is about 0.0167. However, if we use $(k+1)/(n+2)$ instead, the mean squared error drops to 0.0139 (code).
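The linked code isn't reproduced here, but a minimal simulation of the kind that produces those two numbers could look like this (the seed and number of trials are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 10, 1_000_000

# Draw p from the uniform prior, then flip the coin n times for each p.
p = rng.uniform(size=trials)
k = rng.binomial(n, p)

mse_mle = np.mean((k / n - p) ** 2)                # sample success rate, the MLE
mse_post = np.mean(((k + 1) / (n + 2) - p) ** 2)   # posterior mean under the uniform prior

print(f"MSE of k/n:         {mse_mle:.4f}")   # about 1/60 = 0.0167
print(f"MSE of (k+1)/(n+2): {mse_post:.4f}")  # about 1/72 = 0.0139
```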
Honestly, though, all of this seems like frequentist cockamamie to me. We can't escape prior distributions; we may as well stop pretending that they don't exist. Just calculate a posterior and do whatever you want with it. E.g., how did I come up with the $(k+1)/(n+2)$ example? It's the expected value of the posterior Beta distribution for $p$ under a uniform prior, and the posterior mean is exactly the estimator that minimizes the expected squared error under that prior, which is why it gives a lower MSE.
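For completeness, the one-line derivation behind that estimator: with a uniform (i.e. $\mathrm{Beta}(1,1)$) prior and $k$ successes in $n$ flips, the posterior is

$$p \mid k \sim \mathrm{Beta}(k+1,\,n-k+1), \qquad E[p \mid k] = \frac{k+1}{(k+1)+(n-k+1)} = \frac{k+1}{n+2}.$$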