Thanks for writing this up.
I think that sampling from the posterior is a much safer bet. For example, suppose that you may answer a query either with the empty string (because you had a heart attack immediately) or with a complex philosophical treatise with many degrees of freedom. Then maximum likelihood will always give you the empty string, because each individual treatise is far less probable than the empty string even though the treatises together carry almost all of the probability mass! (Sorry if I’m misunderstanding this.)
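To make the worry concrete, here is a toy sketch in Python. The numbers are made up for illustration: I assume a 1% chance of the empty reply and a million equally likely treatises sharing the other 99%. The names `P_EMPTY`, `N_TREATISES`, `most_likely_answer` and `sample_answer` are all hypothetical, not part of any proposal discussed here.

```python
import random

# Hypothetical numbers, chosen only to illustrate the argument.
P_EMPTY = 0.01           # chance of an immediate empty reply
N_TREATISES = 10**6      # number of distinct, equally likely treatises


def outcome_probability(answer: str) -> float:
    """Probability of one specific answer under the toy distribution."""
    if answer == "":
        return P_EMPTY
    # The remaining mass is spread evenly over all treatises.
    return (1 - P_EMPTY) / N_TREATISES


def most_likely_answer() -> str:
    """Pick the single most probable answer (the maximum-likelihood choice)."""
    # All treatises are equally likely, so one representative is enough
    # to compare against the empty string.
    candidates = ["", "treatise-0"]
    return max(candidates, key=outcome_probability)


def sample_answer(rng: random.Random) -> str:
    """Draw one answer in proportion to its probability."""
    if rng.random() < P_EMPTY:
        return ""
    return f"treatise-{rng.randrange(N_TREATISES)}"


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_answer(rng) for _ in range(10_000)]
    nonempty = sum(d != "" for d in draws) / len(draws)
    print("most likely answer is empty:", most_likely_answer() == "")
    print("fraction of sampled answers that are treatises:", nonempty)
```

In this toy setup the most probable single answer is the empty string, while a sampled answer is almost always a treatise. The real setting is obviously messier, but I think the same gap between "most probable output" and "typical output" is what matters.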
My original post is here, though I’m afraid it’s somewhat less precise and clear, and it may be too ambitious given what the technique can actually deliver.
Here is a first attempt at scaling down to more realistic predictors.