I’m curious about your sense of the path towards AI safety applications, if you have a more specific and/or opinionated view than the conclusion/discussion section.
My main opinionated view relative to current discourse is that if someone is trying to apply any of this directly to LLMs, then they are probably very deeply confused about natural abstractions, and also about language, and about agency/intelligence/etc. in general.
The path we’re optimizing for right now is to figure out the whole damn core of the theory of agency, get it across the theory-practice gap, and then not be so clueless about everything. See The Plan. Possibly with AI acceleration of the research at some point; our decisions right now are basically the same regardless of whether the research will be accelerated by AI later.