There were some old ideas about “tool AI”, “oracle AI”, and “myopic AI” as “less dangerous” forms of AI. What we actually have is “AI that is bad at long-range tasks, and especially planning” for now, plus a tremendous economic incentive to make that hole in its capabilities go away as fast as possible, and realistic trend graphs suggesting the hole may take another 2-5 years to close completely.
That’s… not ideal, but better than the “no warning shots at all” worst case.
Yeah, I mostly agree.
It’s not that I expect the AIs of the next 2-5 years to be myopic in some strict sense, but rather that (relative to reasonable pre-LLM priors) I expect their capabilities to arise more out of (generalized) imitation, and to still be sort of globally incoherent (i.e., pursuing different, conflicting objectives).
But this source of optimism gets weaker as RL becomes more important, and it sure does seem to be becoming more important.