Hard agree with this. I think this is a necessary step along the path to aligned AI, and it should be worked on as soon as possible so there's more time to identify failure modes (meta-scheming, etc.).
There's also the idea of feedback loops: it would be great to hook safety research into the AI R&D loop, so that in a world where AIs doing AI research takes off, we get similar speedups in safety research.