Similarly, if we want to have AIs that can play a productive role in society, our goal should not be exclusively or even primarily to align them with the goals of their creators or the narrow rationalist community interested in the AIAP. Instead it should be to create a set of social institutions that ensures that the ability of any narrow oligarchy or small number of intelligences like a friendly AI cannot hold extremely disproportionate power. The institutions likely to achieve this are precisely the same sorts of institutions necessary to constrain extreme capitalist or state power.
[....]
A primary goal of AI design should be not just alignment, but legibility, to ensure that the humans interacting with the AI know its goals and failure modes, allowing critique, reuse, constraint etc.
Weyl’s technocrat critique is valid at the personal level. It hit me hard. I have a tendency to drift from important, messy problems toward interesting but difficult problems that might have formal solutions. (Is there a name for this cognitive bias?) The LessWrong community reinforces this drift.
I argue that the instrumental convergence and AI alignment problems are framed incorrectly, in a way that makes them more interesting to think about and easier to solve.
New framing: intelligent agents (human and nonhuman) are constantly aligning with each other. Solving instrumental convergence is equivalent to solving society. We can’t solve it once and for all, but we can create processes and institutions that adjust to and manage the problems that arise.
Typical scenarios are superpower + superintelligence, ruling party + superintelligence, Zuck + superintelligence, Chairman Xi + superintelligence, Alphabet board of directors + superintelligence.
I have started to see the instrumental convergence problem as part of the human-to-human alignment problem.
E. Glen Weyl in “Why I Am Not A Technocrat”