Some important things that I learned / clarified for myself from this comment:
Many plans depend on preserving the political will to maintain a geopolitical regime that isn’t the Nash equilibrium, for years or decades. A key consideration for those plans is “how much of the benefit of this plan will we have gotten, if the controlled regime breaks down early?”
Plans that depend on having human-level AIs do alignment work (if those plans work at all) don’t have payoff that’s strictly linear in time spent working, but they are much closer to linear than plans that depend on genetically engineered super-geniuses doing the alignment work.
In the AI alignment researcher plan, the AIs can be making progress as soon as they’re developed. In the super-genius plan, we need to develop the genetic engineering techniques and (potentially) have the super-geniuses grow up before they can get to work. The benefits to super-geniuses are backloaded, instead of linear.
(I don’t want to overstate this difference, however, because if the plan of automating alignment research is just fundamentally unworkable, it doesn’t matter that the returns to automated alignment research would be closer to linear in time, if it did work. The more important crux is “could this work at all?”)
The complexity of “controlled takeoff” is in setting up the situation so that things are actually being done responsibly and safely, instead of only seeming so to people who aren’t equipped to judge. The complexity of “shut it all down” is in setting up an off-ramp. If “shut it all down” also includes “genetically engineer super-geniuses” as part of the plan, then it’s not clearly simpler than “controlled takeoff.”
Thank you for writing this!