For what it’s worth, I agree that empirical results have made me more worried than I was last year, and they’re part of the reason I no longer put p(doom) below 1-5%.
But there are other important premises which I don’t think are well supported by empirics, and which are arguably load-bearing for the confidence people have.
One useful example, from Paul Christiano, is the conflation between having to solve the alignment problem on the first critical try and not being able to experiment at all: the former makes AI governance much harder, but it doesn’t make the scientific problem nearly as difficult:
Eliezer often equivocates between “you have to get alignment right on the first ‘critical’ try” and “you can’t learn anything about alignment from experimentation and failures before the critical try.” This distinction is very important, and I agree with the former but disagree with the latter. Solving a scientific problem without being able to learn from experiments and failures is incredibly hard. But we will be able to learn a lot about alignment from experiments and trial and error; I think we can get a lot of feedback about what works and deploy more traditional R&D methodology. We have toy models of alignment failures, we have standards for interpretability that we can’t yet meet, and we have theoretical questions we can’t yet answer. The difference is that reality doesn’t force us to solve the problem, or tell us clearly which analogies are the right ones, and so it’s possible for us to push ahead and build AGI without solving alignment. Overall this consideration seems like it makes the institutional problem vastly harder, but does not have such a large effect on the scientific problem.
(From this list of disagreements.)
I mostly agree with the rest of your comment.