Maybe I’m not sure what you mean by “have a respectable position.”
I’m not sure either, but for example if a scientist publishes an experiment, and then another scientist with a known track record of understanding things publishes a critique, the first scientist can’t respectably dismiss the critique without engaging with its substance.
I think:
there isn’t consensus on what counts as a good track record of understanding things, or a good critique
(relatedly, there’s disagreement about which epistemic norms are important)
A few points haven’t really gotten a critique that interlocutors consider very substantive or clear, just a sort of frustrated rehashing of the same arguments they found unpersuasive the first time.
And for at least some of those points, I’m personally like “my intuitions lean in the other direction from y’all Camp B people, but, I don’t feel like I can really confidently stand by it, I don’t think the argument has been made very clearly.”
Things I have in mind:
On “How Hard is Success?”
“How anti-natural is corrigibility?” (i.e. “sure, I see some arguments for thinking corrigibility might get hard as you dial up capabilities. But, can’t we just… not dial up capabilities past that point? It seems like humans understand corrigibility pretty easily when they try, it seems like Claude et al. currently understand corrigibility reasonably well, and if we focused on training for it I don’t see why it wouldn’t basically work?”)
“How likely is FOOM?” (i.e. “if I believed FOOM was very likely, I’d agree we had to be a lot more careful about ramping up capabilities, and scared that the next training run would be our last. But I don’t see much reason to think FOOM is particularly likely, and I see reasons to think it’s not.”)
“What capabilities are needed to make a pivotally-scary demo or game-changing coordination tech?” (i.e. you maybe don’t need to actually do anything that complicated to radically change how much coordination is possible for a proper controlled takeoff)
On “How Bad is Failure?”
“How nice is AI likely to be?” (i.e. it really only needs to be very slightly nice to give us the solar system, and it seems weird for the niceness to be “zero”)
“How likely is whatever ends up being created to have moral value?” (i.e. consciousness is pretty confusing, and it seems pretty plausible that whatever ends up getting created would at least be a pretty interesting successor species)
For all of those, like, I know the arguments against, but my own current take is not like >75% on any of these given model uncertainty, and meanwhile, if your probabilities are below 50% on the relevant MIRI-ish argument, you also have to worry about the things below (there’s a toy expected-value sketch of that tradeoff after the list)...
...
Other geopolitical concerns and considerations
The longer a pause goes on, the more likely it is that things get unstable and something goes wrong.
If you think alignment isn’t that hard, or that sticking to a safe-but-high power level isn’t that hard, you do have to take more seriously the risk of serious misuse.
You might think buy-in for a serious pause or controlled takeoff is basically impossible until we have seriously scary demos, and the “race to build them, then use them to rally world leaders and then burn the lead” plan might seem better than “try to pause now.”
The sorts of things necessary for a pause seem way more likely to go badly than well (i.e. it’s basically guaranteed to create a Molochian bureaucratic hellscape that stifles wide-ranging innovation and makes it harder to do anything sensible with AI development)
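To make that probability-weighting concrete, here’s a minimal toy expected-value sketch. Everything in it (the function name, the inputs, and every number) is an illustrative placeholder I’m introducing, not anyone’s actual estimate; it just shows how, once your credence in the MIRI-ish argument drops well below 50%, the pause-side costs above start carrying real weight in the comparison.

```python
# Toy expected-value comparison (illustrative only; every number below is a
# made-up placeholder, not anyone's actual estimate).
#
# p            = your credence that the MIRI-ish "success is very hard /
#                failure is catastrophic" argument is basically right
# doom_averted = how much disvalue a pause avoids *if* that argument is right
# pause_cost   = expected downside of a pause *if* the argument is wrong
#                (instability, misuse, entrenched bureaucracy, etc.)

def pause_vs_race(p: float, doom_averted: float, pause_cost: float) -> float:
    """Expected net value of pausing relative to not pausing.

    Positive => pausing looks better under these (made-up) inputs;
    negative => the pause-side costs dominate.
    """
    return p * doom_averted - (1 - p) * pause_cost

# At ~75% credence in the doom argument, pause costs barely move the answer:
print(pause_vs_race(p=0.75, doom_averted=100, pause_cost=10))  # 72.5

# At ~25%, the same pause costs carry real weight in the comparison:
print(pause_vs_race(p=0.25, doom_averted=100, pause_cost=10))  # 17.5

# And if you also think the averted-doom delta is smaller, they can dominate:
print(pause_vs_race(p=0.25, doom_averted=20, pause_cost=10))   # -2.5
```

Obviously the real comparison has way more structure (timelines, partial pauses, who pauses, etc.); the only point is that sub-50% credences force you to actually price in the list above rather than treat it as a rounding error.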
Part of what I’m saying is that it’s not respectable for someone to both
claim to have substantial reason to think that making non-lethal AGI is tractable, and also
not defend this position in public from strong technical critiques.
It sounds like you’re talking about non-experts. Fine, of course a non-expert will be generally less confident about conclusions in the field. I’m saying that there is a camp which is treated as expert in terms of funding, social cachet, regulatory influence, etc., but which is not expert in my sense of having a respectable position.