It seems to me that the core crux between plex and Audrey is whether the solution to the problem needs to take the form of:
1. The top level of the system having some codified/crystallized values that protect the parts, leaving their agency plenty of room and optionality to flourish.
2. Some sort of decentralized but globally adaptive, mycelium-like cooperation structure, where various components (communities, Kamis) act in just enough unison to prevent really bad outcomes and ensure that we remain in a good basin.
Plex leans strongly towards “we need (1) and (2) is unstable”. Audrey leans at least moderately towards “(2) is viable, and if (2) is viable, then (2) is preferred over (1)”.
If I double click on this crux to get a crux upstream of it, I imagine something like:
How easy is it to screw over the world given a certain level of intelligence that we would expect from a bounded Kami-like system (+ some plausible affordances/optimization channels)?
Consequently:
How strong and/or centrally coordinated and/or uniformly imposed do the safeguards to prevent it need to be?[1]
And then:
What is the amount of “dark optimization power” (roughly: channels of influence that can be leveraged to achieve big outcomes, which some entity would likely prefer to exploit, but which we humans are not aware of) that can be accessed by beings we can expect to exist within the next few decades?
This is pretty close, but a more central crux is something like: Does a system fractally slip towards power-seeking across all parameters left free?[1]
Even if it’s very hard to screw the world over at a given power level, if each AI/system/Kami has a ratcheting internal selection pressure towards being more dominated by power-seeking subsystems, eventually the world gets screwed.
The crux you listed is important for how fast the world is destroyed without a singleton, but not really relevant for whether it is destroyed without a singleton.
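The ratchet argument above can be made concrete with a toy model (my own construction, not from the discussion, and the specific numbers are arbitrary): if each agent's degree of power-seeking drifts randomly but internal selection adds even a small persistent upward bias, every agent eventually crosses any dominance threshold; a weaker ratchet only changes how fast, not whether.

```python
import random

def steps_until_dominated(ratchet_bias=0.01, threshold=0.9, seed=0):
    """Toy model: an agent's power-seeking fraction p drifts with
    symmetric noise, plus a small persistent upward bias (the 'ratchet').
    Returns the number of steps until p crosses the threshold."""
    rng = random.Random(seed)
    p, steps = 0.1, 0
    while p < threshold:
        # symmetric noise plus a tiny persistent upward bias
        p += rng.uniform(-0.05, 0.05) + ratchet_bias
        p = max(0.0, min(1.0, p))  # keep p a valid fraction
        steps += 1
    return steps

# With any positive bias the threshold is eventually crossed;
# a smaller bias just takes longer (same noise via the shared seed).
fast = steps_until_dominated(ratchet_bias=0.05)
slow = steps_until_dominated(ratchet_bias=0.005)
```

This is the "how fast vs. whether" distinction: the bias magnitude (analogous to how hard it is to screw the world over at a given power level) sets the timescale, while any nonzero bias makes the outcome inevitable in this model.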
[1] Non-free parameters are those pinned down by formal, well-defined constraints held in place by optimization, or by stronger systems or meta-systems effectively enforcing properties that a system must maintain.
[1] This collapses a lot of the complexity of potential solutions into three dimensions, but it's just to convey the idea.