A case for robust AI benevolence rather than human control

I’ve been thinking about the limitations of control-based AI safety frameworks and wrote up a case for robust benevolence as a more stable equilibrium for autonomous agents. Arguing that as AI systems gain autonomy, intrinsic value alignment (think: empathy/​parental care as a motivational structure) is more robust than extrinsic constraints like obedience or reward.

Full essay here: https://​​zenodo.org/​​records/​​18060672

Happy to discuss — particularly interested in pushback on whether benevolence is actually more stable than obedience under capability gain.