The point I made in this post still seems very important to me, and I continue to think that it was underrated at the time I wrote this post. I think rogue internal deployments are probably more important to think about than self-exfiltration when you’re thinking about how to mitigate risk from internal deployment of possibly-misaligned AI agents.
The point I made in this post still seems very important to me, and I continue to think that it was underrated at the time I wrote this post. I think rogue internal deployments are probably more important to think about than self-exfiltration when you’re thinking about how to mitigate risk from internal deployment of possibly-misaligned AI agents.