I would give examples of things that shouldn’t have been published, which are the reason I agree, but that would be missing the point, wouldn’t it?
Let’s put it this way: I think most “alignment” or “safety” research is in fact nothing of the kind, and most people responding are deluding themselves to avoid considering the possibility that they need to go back to the drawing board.
As usual, capability (the ability to figure things out about AI) generalizes further than alignment (the ability to aim that understanding at actually producing utilitarian(-prioritarian) morally good outcomes).