I recall trying to raise similar issues before. Assuming alignment is only poorly solvable, a collection of minds has two forces[1] driving its changes: the instrumental convergence you describe and moral reflection.
If instrumentally convergent behavior was more common before decolonisation, then that arguably means human civilisation's changes were driven by moral reflection, or that it managed to shift from one attractor to another.
Compare this with the AI 2027 goals forecast, which lists six sources of goals. The first two imply that alignment is solved, and the third and fourth also depend on the developers. The fifth is instrumental convergence, and the sixth covers other sources such as moral reflection, tropes from training data, and the idea that there is a True Morality waiting to be discovered.