I’ve expanded what was going to be a comment on one of the issues I have with this post into my own post here.
Summary: I think some of the arguments in this post factor through a type error, or at least an unjustified assumption that “model alignment” is comparable to systems alignment.
Also, this post seems overly focused on DL-paradigm techniques, even though those are not the only frontier of capabilities.