The question is which is more likely: that the 21st experiment will validate the best theory constructed from 20 data points and invalidate the best theory constructed from 10 data points (a theory that also fits the other ten), or that scientist B is just being dumb.
The likelihood of the former is very hard to calculate, but it’s definitely less than 1⁄11. In other words, more than 10⁄11 of the time (about 91%) the first theory will still be, if not the best possible theory, good enough to predict the result of one more experiment. The likelihood that a random scientist who has 20 data points and a theory that explains them will come up with a different theory that is total crap is easily more than 1 in 10.
Ergo, we trust theory A.
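To make the comparison concrete, here is a minimal sketch of the arithmetic, taking the two bounds above at face value. The variable names are mine, and both numbers are rough estimates from this comment, not calculated probabilities.

```python
from fractions import Fraction

# Assumed upper bound (from the comment): chance that experiment 21
# vindicates theory B while sinking theory A.
p_b_vindicated_max = Fraction(1, 11)

# Assumed lower bound (from the comment): chance that a random scientist's
# second theory is junk.
p_b_is_junk_min = Fraction(1, 10)

# Complement of the first bound: the minimum share of cases in which
# theory A survives one more experiment.
p_a_survives_min = 1 - p_b_vindicated_max

print(p_a_survives_min)                      # exact fraction
print(round(float(p_a_survives_min), 3))     # as a decimal
print(p_b_is_junk_min > p_b_vindicated_max)  # is "B is wrong" the likelier story?
```

If the two bounds hold, the last line prints `True`, which is all the argument needs: the junk-theory explanation beats the vindication explanation, however loose the estimates are.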
I still think CEV is dangerously vague. I can’t really hold up anything as an alternative, and I agree that all the utility functions that have been offered so far have fatal flaws in them, but pointing at some humans with brains and saying “do what’s in there, kind of! but, you know, extrapolate...” doesn’t give me a lot of confidence.
I’ve asked this before without getting an answer, but can you break down CEV into a process with discrete ordered steps that transforms the contents of my head into the utility function the AI uses? Not just a haphazard pile of modifiers (knew more, thought faster, were more the people we would wish we were if we knew what we would know if we were the people we wanted to be), but an actual flowchart or something.