It’s not that I expect the AIs of the next 2-5 years to be myopic in some strict sense, but rather that (relative to reasonable pre-LLM priors) I expect their capabilities to arise more out of (generalized) imitation, and to still be sort of globally incoherent (i.e., pursuing different conflicting objectives).
But this source of optimism gets weaker as RL becomes more important, and it sure does seem to be becoming more important.
Yeah I mostly agree.