I’d argue that if you think it’s worth devoting effort to creating a super-intelligent AI that you can control or trust well enough for this purpose, then it’s strictly better to give it a more sensible optimization target than just preventing one overly specific bad outcome.
For background if anyone’s not aware: https://wiki.lesswrong.com/wiki/Roko’s_basilisk