I think there’s an aesthetic clash here somewhere. I have an intuition or like… an aesthetic impulse, telling me basically… “advocacy is dumb”. Whenever I see anybody Doing An Activism, they’re usually… saying a bunch of… obviously false things? They’re holding a sign with a slogan that’s too simple to possibly be the truth, and yelling this obviously oversimplified thing as loudly as they possibly can? It feels like the archetype of overconfidence.
I have felt exactly the same thing in the past. Extremely well said. It is worth pointing out explicitly that this is not a rational thought: it's an Ugh Field around advocacy, and even if the underlying observation is often true, that doesn't mean all advocacy has to be this way.
My model of Eliezer says something like this:
AI will not be aligned by default, because AI alignment is hard, and hard things don't happen spontaneously. Rockets explode unless you very carefully make them not do that. Software isn't automatically secure or reliable; it takes a lot of engineering effort to make it that way.
Given that, there needs to be a specific, workable plan for how we could align AI. We don't have one. If there were one, Eliezer would know about it; it would have been brought to his attention, since the field isn't that big and he's a very well-known figure in it. Therefore, in the absence of a specific way of aligning AI that would work, the probability of AI being aligned is roughly zero, in much the same way that "Throw a bunch of jet fuel in a tube and point it towards space" has roughly zero chance of getting you to space without a specific account of how it would do so.
So, in short: it is reasonable to assume that, with very high probability, AI will be aligned only if we deliberately make it that way. It is also reasonable to assume that if a workable solution existed, Eliezer would know about it. You don't need to know everything about AGI x-risk to see that; anything that promising would percolate through the community and reach Eliezer in short order. Since there is no such solution, and no attempts have come close according to Eliezer, we're in trouble.
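To make the shape of that argument explicit, here is a toy Bayesian version. The structure is taken from the argument above; the specific numbers are illustrative placeholders of mine, not anything Eliezer has stated. Let $S$ be "a workable alignment solution currently exists" and $K$ be "Eliezer knows of one". The claim is that $P(\neg K \mid S)$ is small, because promising work percolates through the community, so observing $\neg K$ is strong evidence against $S$:

$$P(S \mid \neg K) = \frac{P(\neg K \mid S)\,P(S)}{P(\neg K \mid S)\,P(S) + P(\neg K \mid \neg S)\,P(\neg S)}$$

With placeholder values $P(\neg K \mid S) = 0.1$, $P(\neg K \mid \neg S) = 1$, and an even prior $P(S) = 0.5$:

$$P(S \mid \neg K) = \frac{0.1 \times 0.5}{0.1 \times 0.5 + 1 \times 0.5} = \frac{0.05}{0.55} \approx 0.09$$

Note that the conclusion leans almost entirely on $P(\neg K \mid S)$ being small: that term is exactly the "it would have reached Eliezer" premise, and the disagreements below can each be read as pushing on one of these terms.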
Reasons you might disagree with this:
You think AI is a long way off, and therefore it's okay that we don't know how to solve alignment yet.
You think “alignment by default” might be possible.
You think some approaches that have already been proposed for solving the problem are reasonably likely to succeed once fleshed out more.