Agree that the current AI paradigm can be used to make significant progress in alignment research if used correctly. I'm thinking of something like Cyborgism: leaving most of the "agency" to humans and leveraging prosaic models to boost researcher productivity. Because such models would be highly specialized in scope, they wouldn't involve dangerous consequentialist cognition.
However, the problem is that this isn't what OpenAI is doing—iiuc, they're planning to build a full-on automated researcher that does alignment research end-to-end, and orthonormal's point was that this is dangerous precisely because such a system's cognition would involve the dangerous consequentialist reasoning above.
So, leaving aside the problems with other alternatives like a pivotal act for now, your points don't seem inconsistent with orthonormal's view that OpenAI's plan (at least in its current form) is dangerous.
I think OpenAI is probably agnostic about how to use AIs to get more alignment research done.
That said, speeding up human researchers by large multipliers—something like 10-100x rather than 1.5-4x—will eventually be required for the plan to be feasible. My guess is that you'll probably need AIs running considerably autonomously for long stretches to achieve this.