Rohin Shah comments on An analogy as the midwife of thermodynamics

Rohin Shah 22 Feb 2022 17:58 UTC
Again, I’m not claiming that this is true in general. I think it is plausible to reach, idk, 90%, maybe higher, that a specific idea will revolutionize the world, even before getting any feedback from anyone else or running experiments in the world. (So I feel totally fine with the statement from Keynes that you quoted.)
I would feel very differently about this specific case if there were an actual statement from Sadi of the form “I believe that this particular theorem is going to revolutionize thermodynamics” (and he didn’t make similar statements about other things that were not revolutionary).
“it seems like that 1000 number basically was chosen to roughly match the number of hypotheses you think were plausibly put forth before the correct one showed up. If so, then this is… pretty obviously not proper procedure, in my view.”
I totally agree that’s what I did, but it seems like a perfectly fine procedure. Idk where the disconnect is, but maybe you’re thinking of “1000” as coming from a weirdly opinionated prior, rather than from my posterior.
From my perspective, I start out having basically no idea what the “justifiable prior” on that hypothesis is. (If you want, you could imagine that my prior on the “justifiable prior” was uniform over log-10 odds of −60 to 10; my prior is more opinionated than that but the extra opinions don’t matter much.) Then, I observe that the hypothesis we got seems to be kinda ad hoc with no great story even in hindsight for why it worked while other hypotheses didn’t. My guess is then that it was about as probable (in foresight) as the other hypotheses around at the time, and combined with the number of hypotheses (~1000) and the observation that one of them worked, you get the probability of 1/1000.
(I guess a priori you could have imagined that hypotheses should either have probability approximately 10^-60 or approximately 1, since you already have all the bits you need to deduce the answer, but it seems like in practice even the most competent people frequently try hypotheses that end up being wrong / unimportant, so that can’t be correct.)
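To make the arithmetic behind the 1/1000 figure explicit, here is a minimal sketch. It assumes the hypotheses were exactly equally probable in foresight and that exactly one of them worked, which is a simplification of the "about as probable" claim above. The function name `foresight_probability` is hypothetical and only for illustration.

```python
import math
from fractions import Fraction


def foresight_probability(num_hypotheses: int, num_successes: int = 1) -> Fraction:
    """Probability assigned to any single hypothesis in foresight.

    Assumption (not from the comment): all hypotheses are treated as
    exactly equally probable, and `num_successes` of them turned out to work.
    """
    if num_hypotheses <= 0:
        raise ValueError("num_hypotheses must be positive")
    if not 0 <= num_successes <= num_hypotheses:
        raise ValueError("num_successes must be between 0 and num_hypotheses")
    return Fraction(num_successes, num_hypotheses)


# ~1000 hypotheses around at the time, one of which worked.
p = foresight_probability(1000)

# Express the same number as log-10 odds, the scale used for the prior above.
log10_odds = math.log10(float(p / (1 - p)))

print(p, round(log10_odds, 2))
```

On the log-10 odds scale this lands near −3, well inside the −60 to 10 range of the uniform prior described above.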
As a different example, consider machine learning. Suppose you tell me that <influential researcher> has a new idea for RL sample efficiency they haven’t tested, and you want me to tell you the probability it would lead to a 5x improvement in sample efficiency on Atari. It seems like the obvious approach to estimate this probability is to draw the graph of how much sample efficiency improved from previous ideas from that researcher (and other similar researchers, to increase sample size), and use that to estimate P(effect size > 5x | published), and then apply an ad hoc correction for publication bias. I claim that my reasoning above is basically analogous to this reasoning.
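The estimation procedure in the machine learning example can be sketched as follows. Everything concrete here is an assumption for illustration: the improvement factors are made-up placeholder numbers rather than real results, the function name `prob_large_effect` is hypothetical, and the publication-bias correction (multiply by an assumed fraction of ideas that get published, treating unpublished ideas as never reaching the threshold) is just one possible ad hoc choice.

```python
def prob_large_effect(past_improvements, threshold=5.0, publication_rate=0.5):
    """Estimate P(improvement > threshold) for an untested idea.

    past_improvements: improvement factors from previously *published*
        ideas by this researcher and similar researchers.
    publication_rate: assumed fraction of ideas that end up published.
        Assumption: unpublished ideas never exceed the threshold.
    """
    if not past_improvements:
        raise ValueError("need at least one past result")
    if not 0 < publication_rate <= 1:
        raise ValueError("publication_rate must be in (0, 1]")

    # Empirical estimate of P(effect size > threshold | published).
    exceed = sum(1 for x in past_improvements if x > threshold)
    p_given_published = exceed / len(past_improvements)

    # Ad hoc correction for publication bias.
    return p_given_published * publication_rate


# Hypothetical sample-efficiency improvement factors (placeholder data).
hypothetical_history = [1.2, 1.5, 2.0, 1.1, 6.0, 3.0, 1.3, 8.0, 1.8, 2.5]

estimate = prob_large_effect(hypothetical_history, threshold=5.0, publication_rate=0.5)
print(estimate)
```

The analogy to the thermodynamics case is that the history of comparable hypotheses plays the role of `past_improvements`, and the estimate comes from the track record rather than from evaluating the new idea on its merits.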