Raemon comments on Raemon’s Shortform

Raemon 4 Oct 2025 6:55 UTC
5 points
0
The thing I care about here is not “what happens as a mind grows”, in some abstract sense.
The thing I care about is, “what is the best way for a powerful system to accomplish a very difficult goal quickly/reliably?” (which is what we want the AI for)
As we deliberately scale up the AI’s ability to accomplish stuff, it will be true that:
- if it is getting stuck, it’d achieve stuff better if it got stuck less
- if it is exploitable in ways that are relevant, it’d be better if it wasn’t exploitable
- if it was acting incoherently in ways that wasted resources, it’d accomplish the goal better if it didn’t
- if it plays suboptimal moves, it’d achieve the goals better if it didn’t.
- if it doesn’t have the best possible working memory / processing speed, it’d achieve the goals better if it had more.
- if it doesn’t have enough resources to do any of the above, it’d achieve the goals better if it had more resources
- if it could accomplish the above faster if it deliberately self modified to do so, rather than waiting for us to apply more selection pressure to it, it has an incentive to do that.
And… sure, it could not do those things. Then, either Lab A will put more pressure on the AI to accomplish stuff (and some of the above will become more true). Or Lab A won’t, and some other Lab B will instead.
And once the AI unlocks “deliberately self-modify” as a strategy to achieve the other stuff, and has sufficient resources to do it, then it doesn’t matter what Lab A or B does.
- Kaarel 4 Oct 2025 17:01 UTC
  4 points
  0
  Parent
  I think I mostly agree with everything you say in this last comment, but I don’t see how my previous comment disagreed with any of that either?
  
  The thing I care about here is not “what happens as a mind grows”, in some abstract sense. The thing I care about is, “what is the best way for a powerful system to accomplish a very difficult goal quickly/reliably?” (which is what we want the AI for)
  
  My lists were intended to be about that. We could rewrite the first list in my previous comment to:
  - more advanced minds have more and better and more efficient technologies
  - more advanced minds have an easier time getting any particular thing done, see more/better ways to do any particular thing, can consider more/better plans for any particular thing, have more and better methods for any particular context, have more ideas, ask better questions, would learn any given thing faster
  - and so on
  and the second list to:
  - more advanced minds eventually (and maybe quite soon) get close to never getting stuck
  - more advanced minds eventually (and maybe quite soon) get close to being unexploitable
  - and so on
  I think I probably should have included “I don’t actually know what to do with any of this, because I’m not sure what’s confusing about ‘Intelligence in the limit.’” in the part of your shortform I quoted in my first comment — that’s the thing I’m trying to respond to. The point I’m making is:
  - There’s a difference between stuff like (a) “you become less exploitable by [other minds of some fixed capability level]” and stuff like (b) “you get close to being unexploitable”/“you approach a limit of unexploitability”.
  - I could easily see someone objecting to claims of the kind (b), while accepting claims of the kind (a) — well, because I think these are probably the correct positions.
  - Raemon 4 Oct 2025 21:11 UTC
    2 points
    0
    Parent
    I think I mostly agree with everything you say in this last comment, but I don’t see how my previous comment disagreed with any of that either?
    Yeah it doesn’t necessarily disagree with it. But, framing the question:
    The non-straightforward-to-me and in fact imo probably in at least some important sense false/confused adjacent thing is captured by stuff like:
    as a mind $M$ grows, it gets close to never getting stuck
    as $M$ grows, it gets close to not being silly
    seemed like those things were only in some sense false/confused because they are asking the wrong question.
    I think “more advanced” still doesn’t feel like really the right way to frame the question, because “advanced” is still very underspecified.
    - Kaarel 4 Oct 2025 22:33 UTC
      4 points
      0
      Parent
      If we replaced “more advanced minds” with “minds that are better at doing very difficult stuff” or other reasonable alternatives, I would still make the (a) vs (b) distinction, and still say type (b) claims are suspicious.
      - Raemon 4 Oct 2025 22:47 UTC
        2 points
        0
        Parent
        The structural thing is less the definition of “what sort of mind” and more, instead of saying “gets more X”, saying “if process Z is causing X to increase, what happens?” (Call this a Type C claim.)
        But I’m also not sure what feels sus about Type B claims to you, when X is at least pinned down a bit more.