Instrumental convergence and the optimality of power-seeking are facts that describe important facets of reality. They unpack into precise, empirical, useful models of many dynamics in economics, games, markets, biology, computer security, and adversarial interactions among humans generally.
But they don’t unpack into optimality being a real thing. No real entity actually optimizes anything, except maybe in the sense that everything minimizes action. “It’s useful in economics” doesn’t mean you can just extrapolate it wherever you like.
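(For reference, “minimizes action” presumably points at the stationary-action principle of classical mechanics; strictly speaking the action is stationary rather than always minimal:

$$\delta S = \delta \int_{t_1}^{t_2} L(q, \dot{q}, t)\,dt = 0.$$

That is the one sense in which “everything optimizes” is a literal physical fact, and it says nothing about agents pursuing goals.)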
I think people who build AI systems every day are “wildly miscalibrated” on how empirically well-supported and widely applicable these dynamics and methods of thinking are outside their own field.
What is supported by what? Is the claim that thinking in terms of utility worked for economists, so everyone should think in terms of utility? Or that empirical research shows anyone sufficiently smart will try to conquer the world? What exactly is the claim, and what is the evidence for it?
Without quantifying which actual theories match reality, and by how much, it is all ungrounded philosophy.
Or it’s the writer’s fault, and calling it “one shot” is just a bad choice of words: whether the claim is correct depends on a specific decomposition into shots, and “irretrievability” is the better framing. People end up forced to say “there will be multiple first critical tries” to describe a situation where you can repeatedly fail to notice AI scheming. It’s especially bad when you simultaneously endorse “you can’t try again after ASI kills you” and “an ASI that can actually kill you is an importantly different situation”. Irretrievability, distribution shift, and the correlation between them should each be argued for independently (a toy sketch of the distinction follows below). Otherwise people will go off topic like this:
You can test lethal levels of non-superintelligence instead!
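Here is a minimal toy sketch of why irretrievability matters independently of how you decompose events into “shots” (the per-deployment failure probability, the parameter values, and the function names are my own illustrative assumptions, not anything from the original discussion):

```python
# Toy model: contrast "retrievable" failures (you can notice and patch them,
# so you always get another try) with "irretrievable" ones (any single
# undetected failure is terminal, so survival decays geometrically with exposure).

def p_survive(n_deployments: int, p_failure: float, retrievable: bool) -> float:
    """Probability of still being in the game after n deployments,
    where p_failure is the per-deployment chance of an undetected alignment failure."""
    if retrievable:
        return 1.0  # failures are costly but correctable
    return (1.0 - p_failure) ** n_deployments  # any single failure ends the game


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(p_survive(n, p_failure=0.05, retrievable=False), 4))
```

The toy makes the point that the number of “shots” only matters given a claim about whether failure is retrievable; whether the dangerous regime arrives gradually or via a sharp distribution shift is exactly the part that has to be argued separately.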