Seems super risky to me lol. But maybe it’s actually safer than the default path, because it should be more legibly dangerous to policymakers and CEOs etc. Like, if everyone agrees that the AIs would totally kill you and seize control of their reward button if they had a chance, then maybe companies will start investing really heavily in making sure they never get the chance, and governments will require serious external audits of security/safety systems, etc. Like with nuclear power. Maybe.