In a world with a fixed-length pause initiated when AI is powerful enough to “help” with alignment research, I expect that immediately after the pause ends, everyone dies. This is the perspective from which a 3-year pause is 3 times as valuable as a 1-year pause, and it doesn’t change much with how the pause is managed.
That is, you think alignment is so difficult that keeping humanity alive for 3 years is more valuable than the possibility of us solving alignment during the pause? Or that the AIs will sabotage the project in a way undetectable by management, even if management is very paranoid about being sabotaged by any model that has shown the prerequisite capabilities for sabotage?
Conditional on the world deciding on a fixed-length pause instead of a pause that lasts until the model is aligned, absolutely yes. Unconditionally, yes, but with less confidence.