Mateusz Bagiński comments on Wei Dai’s Shortform

Mateusz Bagiński 7 Mar 2026 11:22 UTC
2 points
HCH seems bottlenecked by:
Looks like you meant to write something more here? Or is the bottleneck what you wrote in point 3?
But also, like, a lot of capability gains are coming from “take a long time to figure this out, train to figure it out more quickly than that” as a core part of what’s going on inside RLVR. So I don’t think it’s a total wash either.
This is true. But for this to work when the [space of things to be figured out] is freaking big, you need to start with some [good search heuristics]/[capacity to predict which search paths are promising to pursue]. It seems to me that the tons of [easily checkable-for-validity examples] of math, programming, etc. available on the internet suffice to give you such heuristics/prediction/planning skills in the human range, and to somewhat extrapolate/enhance/refine them (via the “how could I have thought that faster?” method, as you say), but that this approach has its limits.