Hence we need a prior over joint distributions of (X, Y). And yes, I do mean a prior distribution over probability distributions: we are saying that (X, Y) has some unknown joint distribution, which we treat as being drawn at random from a large collection of distributions. This is therefore a non-parametric Bayes approach: the term non-parametric means that the number of parameters in the model is not finite.
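To make "a prior over distributions" concrete, here is a minimal finite sketch in NumPy: if X and Y are discrete with K values each, a joint distribution is just a K-by-K table of probabilities, and a symmetric Dirichlet gives a prior over such tables. The grid size K and concentration alpha below are illustrative choices of mine, not anything from the original discussion, and the full non-parametric case would replace this finite Dirichlet with something like a Dirichlet process.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite sketch (assumed setup): X and Y each take values in {0, ..., K-1}.
# A "prior over joint distributions" is then a prior over the K*K cell
# probabilities of the joint pmf p(x, y).  A symmetric Dirichlet with
# concentration alpha is the standard choice in this finite case.
K = 5        # illustrative grid size
alpha = 1.0  # illustrative concentration

# One draw from the prior: a random joint pmf over the K*K cells.
p_joint = rng.dirichlet(alpha * np.ones(K * K)).reshape(K, K)

# Each draw is a genuine joint distribution: nonnegative, sums to 1.
assert np.all(p_joint >= 0)
assert np.isclose(p_joint.sum(), 1.0)

# Given a draw of the distribution, we can then sample (X, Y) pairs from it.
flat_idx = rng.choice(K * K, size=10, p=p_joint.ravel())
xs, ys = np.unravel_index(flat_idx, (K, K))
```

The non-parametric version is the limit of this picture as the table of parameters is allowed to grow without bound.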
You’re violating Jaynes’s Infinity Commandment:
Never introduce an infinity into a probability problem except as the limit of finite processes!
Non-parametric methods are limits of finite processes. Or, more precisely, they are rules that work for any finite data set you have. Think about using histograms to approximate a density empirically: for any dataset we have a finite number of bins, but the number of bins (and hence the number of parameters) depends on the size of the data. That’s basically what “non-parametric” means.
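The histogram point can be sketched in a few lines of NumPy. The n^(1/3) bin-count rule below is just one common convention I'm assuming for illustration; the point is only that the number of fitted parameters (bin heights) grows with the sample size.

```python
import numpy as np

rng = np.random.default_rng(1)

def histogram_density(data, n_bins=None):
    """Histogram density estimate.  By default the number of bins grows
    with the sample size (here ~ n^(1/3), one common rule of thumb),
    so the number of parameters depends on how much data you have --
    which is exactly the non-parametric behaviour described above."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    if n_bins is None:
        n_bins = max(1, int(round(n ** (1 / 3))))
    # density=True normalises so the bin heights integrate to 1.
    counts, edges = np.histogram(data, bins=n_bins, density=True)
    return counts, edges

# Any particular dataset is finite, so the fit uses finitely many bins,
# but a larger dataset gets more of them:
small = rng.normal(size=50)
large = rng.normal(size=5000)
bins_small = len(histogram_density(small)[0])
bins_large = len(histogram_density(large)[0])
assert bins_small < bins_large
```

So each fit is a finite process; "non-parametric" just means no single finite parameter count works for every dataset.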
Please keep your religious language out of my statistics, thank you.
It is worth noting that the issue of non-consistency is just as troublesome in the finite setting. In fact, in one of Wasserman’s examples he uses a finite (but large) space for X.