Organizations (firms, associations, etc.) are systems that are often not well aligned with their intended purpose, whether that is to produce goods, make a profit, or do good. In particular, they resist being discontinued. That is one of the aspects of organizational dysfunction discussed in Systemantics. I keep coming back to it because I think it should be possible to study at least some aspects of AI alignment in existing organizations. Not because they are superintelligent, but because their elements (sub-agents) are observable, and the misalignment often is too.
I think early AGI may actually end up being about designing organizations that robustly pursue metrics that their (flawed, unstructured, chaotically evolved) sub-agents don't reliably care about directly. Molochean equilibrium fixation and super-agent alignment may turn out to be the same question.