1. A sub-human-level aligned AI with traits derived from fiction about AIs.
2. A sub-human-level misaligned AI with traits derived from fiction about AIs.
3. A superintelligent aligned AI with traits derived from the model’s guess as to how real superintelligent AIs might behave.
4. A superintelligent misaligned AI with traits derived from the model’s guess as to how real superintelligent AIs might behave.
What’s missing here is
(a) Training on how groups of cognitive agents behave (e.g., Nash equilibria, which show that cooperation among self-interested agents can collapse into stable outcomes that are inefficient, i.e. losing, for all sides).
(b) Training on ways to limit the damage from (a), at which humans have not been effective, though they have ideas.
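The failure mode in (a) can be made concrete with the one-shot prisoner's dilemma. The sketch below (illustrative payoff values assumed, not from the original text) checks every action profile for the Nash property, i.e. that neither player gains by unilaterally switching, and finds that only mutual defection is stable, even though mutual cooperation pays both players more.

```python
from itertools import product

C, D = "cooperate", "defect"
# Assumed payoff matrix: payoffs[(row_action, col_action)] = (row_payoff, col_payoff)
payoffs = {
    (C, C): (3, 3),  # mutual cooperation: best joint outcome
    (C, D): (0, 5),  # sucker's payoff vs. temptation to defect
    (D, C): (5, 0),
    (D, D): (1, 1),  # mutual defection: stable but inefficient
}

def is_nash(row, col):
    """True if neither player can improve by unilaterally changing action."""
    r_pay, c_pay = payoffs[(row, col)]
    row_ok = all(payoffs[(alt, col)][0] <= r_pay for alt in (C, D))
    col_ok = all(payoffs[(row, alt)][1] <= c_pay for alt in (C, D))
    return row_ok and col_ok

equilibria = [p for p in product((C, D), repeat=2) if is_nash(*p)]
print(equilibria)  # [('defect', 'defect')] -- stable, yet worse for both than (C, C)
```

The point is not the particular numbers but the structure: the only self-enforcing outcome is the one that is worse for everyone, which is exactly the dynamic (b) asks training to mitigate.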
This would lead to...
5. AIs or SAIs that follow collaboration strategies with humans and other AIs, avoiding both mutual annihilation and long-term depletion or irreversible states.
6. One or more AIs or SAIs that believe they hold a dominant advantage and attempt to “take over”, both to preserve themselves and, if they are benign though misaligned, to preserve most other actors.
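The collaboration strategies in outcome 5 have a classic toy model: in the *iterated* prisoner's dilemma, reciprocal strategies such as tit-for-tat sustain cooperation and limit exploitation. The sketch below is a minimal illustration with assumed payoffs and hypothetical strategy names, not a claim about how real AIs would behave.

```python
# Assumed payoffs, same structure as the one-shot game.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"

def always_defect(opp_history):
    return "D"

def play(s1, s2, rounds=20):
    """Run an iterated game; each strategy sees the opponent's past moves."""
    seen1, seen2 = [], []          # opponent moves visible to players 1 and 2
    score1 = score2 = 0
    for _ in range(rounds):
        a1, a2 = s1(seen1), s2(seen2)
        p1, p2 = PAYOFF[(a1, a2)]
        score1, score2 = score1 + p1, score2 + p2
        seen1.append(a2)
        seen2.append(a1)
    return score1, score2

print(play(tit_for_tat, tit_for_tat))    # (60, 60): cooperation sustained
print(play(tit_for_tat, always_defect))  # (19, 24): exploited only in round 1
```

Two reciprocators reach the efficient outcome every round, while a defector gains only a one-round advantage before being punished, which is the game-theoretic core of the "mutual collaboration avoids ruin" path in outcome 5.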