there is definitely a fixable deficiency in LLMs. but, for HCH,
HCH seems bottlenecked by 3:
3. reasoning doesn’t compensate for lack of evidence (other than by making V-information converge to Shannon information; spelled out a bit below this list).
3.1. also, it’s often massively compute-cheaper to just go get more evidence from reality than to figure it out by thinking.
3.2. when chaos is involved, one often has less information than one might naively appear to have; even for idealized Shannon reasoning, i.e. Solomonoff induction, and assuming we really live in a probabilistic universe (which we sure seem to), measurement uncertainty is guaranteed to blow up when predicting e.g. the weather from finite samples (toy demo below the list).
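To spell the parenthetical in 3 out a bit (assuming the intended notion is the predictive V-information of Xu et al. 2020, which the wording suggests): for a family $\mathcal{V}$ of allowed predictors,

$$I_{\mathcal{V}}(X \to Y) \;=\; H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X), \qquad H_{\mathcal{V}}(Y \mid X) \;=\; \inf_{f \in \mathcal{V}} \mathbb{E}_{x,y}\!\left[-\log f[x](y)\right].$$

More reasoning/compute enlarges $\mathcal{V}$, which can only push $I_{\mathcal{V}}(X \to Y)$ up toward the Shannon mutual information $I(X;Y)$ (the unrestricted-$\mathcal{V}$ case); $I(X;Y)$ itself is fixed by how the evidence was gathered, so thinking harder closes that gap but can't go past it.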
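And a minimal toy demo of 3.2, using the logistic map at r = 4 as a stand-in for "the weather" (the system and the 1e-9 measurement error are arbitrary illustrative choices, not anything from the comment):

```python
# Two copies of the same chaotic system, differing only by a tiny
# measurement error in the initial condition.
r = 4.0          # fully chaotic regime of the logistic map x -> r*x*(1-x)
x = 0.3          # "true" state
y = 0.3 + 1e-9   # the same state as measured, with a tiny error

for step in range(1, 61):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 10 == 0:
        print(f"step {step:2d}: |error| = {abs(x - y):.3e}")

# The gap grows roughly exponentially (positive Lyapunov exponent), so after
# a few dozen steps the forecast carries essentially no usable information
# about the true trajectory, no matter how much reasoning is applied to it.
```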
But also, like, a lot of capability gains are coming from “take a long time to figure this out, train to figure it out more quickly than that” as a core part of what’s going on inside RLVR. So I don’t think it’s a total wash either.
Looks like you meant to write something more here? Or is the bottleneck what you wrote in point 3?
But also, like, a lot of capability gains are coming from “take a long time to figure this out, train to figure it out more quickly than that” as a core part of what’s going on inside RLVR. So I don’t think it’s a total wash either.
This is true. But also, in order for this to work in the case of a freaking big [space of things to be figured out], you need to start with some [good search heuristics]/[capacity to predict which search paths are promising to pursue]. It seems to me that the tons of [easily checkable-for-validity examples] of math, programming, etc. available on the internet suffice to give you such heuristics/prediction/planning skills in the human range, and to somewhat extrapolate/enhance/refine them (via the “how could I have thought that faster?” method, as you say), but that this approach has its limits.
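For concreteness, a deliberately toy sketch of the “spend a long time figuring it out, then train to produce it quickly” loop (everything here is made up for illustration: the task is hitting a target sum with random terms, the verifier is exact equality, and a cache stands in for the trained policy):

```python
import random

def verifier(target, terms):
    # Verifiable reward: did the candidate actually solve the problem?
    return sum(terms) == target

def slow_search(target, budget=10_000):
    """Slow phase: blind sampling until a verified solution turns up."""
    for _ in range(budget):
        terms = [random.randint(1, 6) for _ in range(random.randint(1, 10))]
        if verifier(target, terms):
            return terms
    return None  # budget exhausted without a verified solution

policy = {}  # stand-in for the trained policy: amortized results of past search

def solve(target):
    if target in policy:               # fast phase: answer immediately
        return policy[target]
    solution = slow_search(target)     # slow phase: expensive search
    if solution is not None:
        policy[target] = solution      # "training": amortize the search result
    return solution

for t in [7, 15, 7, 23, 15]:
    print(t, solve(t))
```

The caveat above then corresponds to: when the space of things to be figured out is huge, blind slow search rarely produces anything verified to amortize in the first place, so you need decent priors/heuristics just to make the slow phase productive.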
I think some mix of your options.