The neural tangent kernel[1] provides an intuitive story for how neural networks generalize: a gradient update on a datapoint will shift similar (as measured by the hidden activations of the NN) datapoints in a similar way.
The vast majority of LLM capabilities still arise from mimicking human choices in particular circumstances. This gives you a substantial amount of alignment “for free” (since you don’t have to worry that the LLMs will grab excess power when humans don’t), but it also limits you to ~human-level capabilities.
“Gradualism” can mean that fundamentally novel methods only make incremental progress on outcomes, but in most people’s imagination I think it rather means that people will keep the human-mimicking capabilities generator as the source of progress, mainly focusing on scaling it up instead of on deriving capabilities by other means.
Maybe I should be cautious about invoking this without linking to a comprehensible explanation of what it means, since most resources on it are kind of involved...
The neural tangent kernel[1] provides an intuitive story for how neural networks generalize: a gradient update on a datapoint will shift similar (as measured by the hidden activations of the NN) datapoints in a similar way.
The vast majority of LLM capabilities still arise from mimicking human choices in particular circumstances. This gives you a substantial amount of alignment “for free” (since you don’t have to worry that the LLMs will grab excess power when humans don’t), but it also limits you to ~human-level capabilities.
“Gradualism” can mean that fundamentally novel methods only make incremental progress on outcomes, but in most people’s imagination I think it rather means that people will keep the human-mimicking capabilities generator as the source of progress, mainly focusing on scaling it up instead of on deriving capabilities by other means.
Maybe I should be cautious about invoking this without linking to a comprehensible explanation of what it means, since most resources on it are kind of involved...