No. But it seems tractable to have better tests of whether AIs would do intermediate bad actions using this methodology.