Note that I don’t consider “get useful research, assuming that the model isn’t scheming” to be an AI control problem.
AI control isn’t a complete game plan and doesn’t claim to be; it’s just a research direction that I think is a tractable approach to making the situation safer, and that I think should be a substantial fraction of AI safety research effort.