This was enlightening for me. I suspect the concept of treating agents (artificial, human, or otherwise) as multiple interdependent subsystems working in coordination, each with its own roles, goals, and rewards, rather than as a single completely unified system is critical for solving alignment problems.
I recently read Entangled Life (by Merlin Sheldrake), which explores similar themes. One of the themes is that the concept of the individual is not so easily defined (perhaps not even entirely coherent). Every complex being is made up of smaller systems and also part of a larger ecosystem and none of these levels can truly be understood independently of the others.
This was enlightening for me. I suspect the concept of treating agents (artificial, human, or otherwise) as multiple interdependent subsystems working in coordination, each with its own roles, goals, and rewards, rather than as a single completely unified system is critical for solving alignment problems.
I recently read Entangled Life (by Merlin Sheldrake), which explores similar themes. One of the themes is that the concept of the individual is not so easily defined (perhaps not even entirely coherent). Every complex being is made up of smaller systems and also part of a larger ecosystem and none of these levels can truly be understood independently of the others.