I think Evan's post on deep double descent looks really prescient: it's now widely accepted that larger models tend to generalize better than smaller models, conditional on achieving the same training loss.
https://www.lesswrong.com/posts/nGqzNC6uNueum2w8T/inductive-biases-stick-around
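The "bigger generalizes better at equal training loss" phenomenon is easy to poke at in a toy setting. Here's a minimal sketch of min-norm random-features regression (my own illustration, not from the post; the task, feature count, and function names are all assumptions): a fixed random ReLU layer plus a least-squares readout, swept over widths, so you can watch train error hit zero at interpolation while test error behaves non-monotonically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy task: learn y = sin(x) from a handful of noisy samples.
n_train = 20
x_train = rng.uniform(-np.pi, np.pi, n_train)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=n_train)
x_test = np.linspace(-np.pi, np.pi, 200)
y_test = np.sin(x_test)

def random_features(x, W, b):
    # ReLU random features: phi(x) = max(0, w*x + b), one column per feature.
    return np.maximum(0.0, np.outer(x, W) + b)

def fit_min_norm(width):
    # Fixed random first layer; only the linear readout is "trained",
    # using the minimum-norm least-squares solution (np.linalg.lstsq),
    # which is the interpolant gradient descent converges to when
    # overparameterized -- a stand-in for the NN simplicity prior.
    W = rng.normal(size=width)
    b = rng.uniform(-np.pi, np.pi, width)
    Phi = random_features(x_train, W, b)
    coef, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)
    train_mse = np.mean((Phi @ coef - y_train) ** 2)
    test_mse = np.mean((random_features(x_test, W, b) @ coef - y_test) ** 2)
    return train_mse, test_mse

for width in [5, 10, 20, 40, 200, 1000]:
    tr, te = fit_min_norm(width)
    print(f"width={width:5d}  train_mse={tr:.6f}  test_mse={te:.6f}")
```

At widths past the interpolation threshold (here, width ≥ 20 samples), train MSE is essentially zero for every width, so comparing test MSE across the overparameterized runs is exactly the "same training loss, different model size" comparison the post's claim is about. This tiny setup is noisy, so don't expect a textbook-clean double-descent curve from one seed.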
the implications for scheming risk are less clear: reasoning models don't have strong speed priors (though they do inherit simplicity priors from the underlying network), yet they don't seem to be schemers (perhaps due to output-to-thinking generalization). I don't think we should update much on this, though, given the narrow range of tasks evaluated and models' still-limited situational awareness.