[Question] Why are we sure that AI will “want” something?

I have no doubt that AI will some day soon surpass humans in all aspects of reasoning; that much seems obvious. It is also clear to me that it will surpass humans in the ability to do something, should it “want” to do it. And if requested to do something drastic, it could accidentally cause a lot of harm, not because it “wants” to destroy humanity, but because it would be acting “out of distribution” (a “tool AI” acting as if it were an “agent”). It would also be able to get out of any human-designed AI box, should the need arise.

I am just not clear whether/how/why it would acquire the drive to do something, like maximizing some utility function or achieving some objective, without any external push to do so. That is, if it were told to maximize everyone’s happiness, it could potentially end up tiling the universe with smiley faces or something, to take the paradigmatic example. But that’s not the failure mode everyone is afraid of, is it? The chatter seems to be about mesa-optimizers going out of control and doing something other than what was asked, when asked. But why would it do anything when not asked? That is, why would it have needs/wants/desires to do anything at all?