I usually think of execution as compute and direction as discernment. Compute = ability to work through specific directions effectively; discernment = ability to decide which of two directions is more promising. Success is probably upper-bounded by the product of the two, at least in an informal sense.
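Sketched slightly more explicitly (still informal, and the symbols are just my shorthand): with $C$ for execution ability and $D$ for discernment,

$$\text{success} \;\lesssim\; C \cdot D,$$

since flawless execution of a badly chosen direction and excellent direction-picking with weak execution both cap the outcome.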
Nullity
New commenter here, I think this is a great post. I think the distribution given by AI 2027 is actually close to correct, and is maybe even too slow (I would expect SAR+ to give a bit more of a multiplier to R&D). It seems like most researchers are assuming that ASI will look like scaled LLMs + scaffolding, but I think that transformer-based approaches will be beaten out by other architectures at around SAR level, since transformers were designed to be language predictors rather than reasoners.
This makes my most likely paths to ASI either “human researchers develop new architecture which scales to ASI” or “human researchers develop LLMs at SC-SAR level, which then develop new architecture capable of ASI”. I also think a FOOM-like scenario with many OOMs of R&D multiplier is more likely, so once SIAR comes along there would probably be at most a few days to full ASI.
AI R&D is far less susceptible to Amdahl’s law than pretty much anything else, since it’s bottlenecked only on compute and sufficiently general intelligence. You’re right that if future AIs are only about as general as current LLMs, then automation of AI R&D will be greatly slowed, but I see no reason why generality won’t keep increasing.
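For reference, the Amdahl’s-law bound I have in mind (the standard formula, with $p$ the fraction of the work that can be accelerated and $s$ the speedup on that fraction):

$$\text{overall speedup} \;=\; \frac{1}{(1-p) + p/s} \;\le\; \frac{1}{1-p},$$

so any fixed non-automatable fraction caps the R&D multiplier no matter how large $s$ gets. The claim above is that for AI R&D the effective $1-p$ is small, since the only hard bottlenecks are compute and sufficiently general intelligence.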
Lastly, I think many of the difficulties relating to training data (especially for specialist tasks) will become irrelevant as AIs become more general. In other words, the AIs will be able to generalize from “human specialist thought in one area” to “human specialist thought in another area” without needing training data in the latter.
I agree that without these assumptions, the scenario in AI 2027 would be unrealistically fast.
I don’t understand how this is an example of misalignment—are you suggesting that the model tried to be sycophantic only in deployment?