Okay, would you like to bet on whether some of the largest research programs had plans going into them? I haven’t checked, but I would put at least 10:1 odds that if we pick, say, three projects of similar scale and type, like the Apollo Program and the Manhattan Project, each will have had, quite early on, a high-level roadmap of things to try that could plausibly address the core challenges[1], even if many of the details changed once they ran into reality.
By this standard, there is totally a plan / roadmap, which is elaborated in that paper.
But also this notion of a plan / roadmap has approximately no relation to the way “plan” is used in AI safety discourse in my experience.
EDIT: There’s a 10-page executive summary you could read. Or you could read Section 6 on misalignment; within that, Amplified Oversight is probably the most relevant section. But I also don’t expect that this will change your mind ~at all, because it isn’t really written with you as the intended audience. The AI summary is sometimes wrong, sometimes correct but missing the point, and only occasionally correct in a non-misleading way.