How difficult would it be to rewrite the Mann-Whitney U test to give a Bayesian likelihood ratio?
There is no direct translation. The U test keeps only the ranks, and throwing out the actual data is not legitimate in a Bayesian analysis. Even if all you were given were the ranks, the likelihood of that ordering under the alternative hypothesis depends on the underlying probability densities, so reducing the data to ranks hasn’t simplified the problem.
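To make the point about rank likelihoods concrete, here is a rough Monte Carlo sketch. Everything in it is an illustrative assumption on my part (three draws per group, normal distributions, the particular shifts and spreads), not the data or model from the original discussion. It estimates the probability of one specific ordering, "every value in the first group is below every value in the second", under the null and under two alternatives that have the same location shift but different spreads.

```python
import random

def prob_all_x_below_y(sample_x, sample_y, n=3, trials=50_000, seed=0):
    """Monte Carlo estimate of P(all n x-draws are smaller than all n y-draws)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [sample_x(rng) for _ in range(n)]
        ys = [sample_y(rng) for _ in range(n)]
        if max(xs) < min(ys):
            hits += 1
    return hits / trials

# Null: both groups come from the same continuous distribution, so this
# ordering has probability 1 / C(6, 3) = 0.05 whatever that distribution is.
p_null = prob_all_x_below_y(lambda r: r.gauss(0, 1), lambda r: r.gauss(0, 1))

# Two hypothetical alternatives with the same shift (1.0) but different spreads.
p_alt_narrow = prob_all_x_below_y(lambda r: r.gauss(0, 1), lambda r: r.gauss(1, 1))
p_alt_wide = prob_all_x_below_y(lambda r: r.gauss(0, 3), lambda r: r.gauss(1, 3))

print(p_null, p_alt_narrow, p_alt_wide)
```

The null probability is distribution-free, but the two alternative probabilities differ from each other, so any likelihood ratio built on the ranks still needs the densities to be specified.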
Also, there’s nothing exclusively Bayesian about likelihood ratios, at least when the null and alternative hypotheses are completely specified.
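As a minimal illustration of a likelihood ratio between two completely specified hypotheses, here is a sketch with made-up numbers: the data values, the two means, and the known standard deviation are all assumptions chosen for the example. A frequentist would use this ratio as a test statistic, and a Bayesian would multiply it by the prior odds, but the ratio itself is the same quantity in both cases.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def log_likelihood_ratio(data, mu0, mu1, sigma):
    """log[ L(data | mu1) / L(data | mu0) ] for i.i.d. normal data with known sigma."""
    return sum(normal_logpdf(x, mu1, sigma) - normal_logpdf(x, mu0, sigma) for x in data)

data = [0.8, 1.3, 0.4]  # hypothetical observations
# H0: mean 0, H1: mean 1, both with sigma = 1 (fully specified, no free parameters).
# With these choices the log ratio simplifies to sum(data) - n/2 = 1.0.
lr = math.exp(log_likelihood_ratio(data, mu0=0.0, mu1=1.0, sigma=1.0))
print(lr)  # about 2.718, i.e. e
```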
Fair enough—in that case, how would you have produced a statistical measure of how well the distributions matched? What kind of assumptions would you employ?
I have three answers for that. The first is that I wouldn’t have bothered: in this field, data collection is not very precise, so three data points per method aren’t going to tell me shit. But this really avoids the question, so...
The second answer is that Bayesian analogues to this approach do exist. But we know a priori that the two methods won’t generate data from the same distribution—there’s really no need to even formulate the null hypothesis. What we really care about is the accuracy and precision of the new method, so...
The third answer is that, given enough data, I would have set up a hierarchical Bayesian model to estimate the accuracy and precision of the new method, where accuracy is defined as “matching the old method as closely as possible”.
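A full hierarchical model would need a proper probabilistic programming tool, so what follows is only a much simpler, non-hierarchical stand-in to show the flavor of the idea. It assumes paired measurements (new method minus old method on the same samples), treats the differences as Normal(bias, sd), puts flat priors on a grid, and reports posterior means. The bias plays the role of accuracy and the sd the role of precision. The data, grid ranges, and function name are all hypothetical.

```python
import math

def grid_posterior_means(diffs, bias_grid, sd_grid):
    """Posterior means of (bias, sd) for diffs ~ Normal(bias, sd), flat priors on the grid."""
    log_w = {}
    for b in bias_grid:
        for s in sd_grid:
            # Normal log-likelihood, dropping the constant term that does not depend on b or s.
            log_w[(b, s)] = sum(-math.log(s) - (d - b) ** 2 / (2 * s * s) for d in diffs)
    top = max(log_w.values())  # subtract the max before exponentiating to avoid underflow
    weights = {k: math.exp(v - top) for k, v in log_w.items()}
    total = sum(weights.values())
    bias_mean = sum(b * w for (b, _), w in weights.items()) / total
    sd_mean = sum(s * w for (_, s), w in weights.items()) / total
    return bias_mean, sd_mean

# Hypothetical paired differences: new method minus old method.
diffs = [0.3, -0.1, 0.4, 0.2, 0.0, 0.5, 0.1, 0.2]
bias_grid = [i / 100 for i in range(-100, 101)]  # -1.00 .. 1.00
sd_grid = [i / 100 for i in range(5, 201)]       # 0.05 .. 2.00

bias_mean, sd_mean = grid_posterior_means(diffs, bias_grid, sd_grid)
print(bias_mean, sd_mean)
```

A hierarchical version would add another level, for example letting the bias vary by sample type or batch with a shared prior, which is where a grid stops being practical.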