This is pretty close, but a more central crux is something like: Does a system fractally slip towards power-seeking across all parameters left free?[1]
Even if it’s very hard to screw the world over at a given power level, if each AI/system/Kami has a ratcheting internal selection pressure towards being more dominated by power-seeking subsystems, eventually the world gets screwed.
The crux you listed is important for how fast the world is destroyed without a singleton, but not really relevant for whether it is destroyed without a singleton.
[1] Non-free parameters are ones pinned down by formal/well-defined things held in place by optimization, or by stronger systems or meta-systems effectively enforcing properties that a system must maintain.
> internal selection pressure towards being more dominated by power-seeking subsystems, eventually the world gets screwed
I feel as if a tacit assumption bears too much weight in that argument: why can’t an immune system work at that level? A cancer can kill a human, but not humanity. But by default, any internal subsystem of most imaginable future societies is going to kill all societies globally, not just its local host? Why is Moloch weaker than Clippy? Why no warning signs? Do you see humanity on a trajectory of not using still-stupid LLM-based systems in autonomous weapons, never deploying any system that’s too weak to kill everyone but powerful enough to cause major harm, while also brewing up a system powerful enough to escape, self-preserve, and gray-goo the world on its first try?
There are a few arguments here.

> why can’t an immune system work at that level?

It can, but it either is robust and stable, or falls probabilistically after some time.
> a cancer can kill a human, but not humanity. But by default, any internal subsystem of most imaginable future societies is going to kill all societies globally, not just its local host?

Humans have boundaries between them that cancer can’t generally cross (and when it does, the species has a very bad time).
> why no warning signs? Do you see humanity on a trajectory of not using still-stupid LLM-based systems in autonomous weapons, never deploying any system that’s too weak to kill everyone but powerful enough to cause major harm, while also brewing up a system powerful enough to escape, self-preserve, and gray-goo the world on its first try?
I update somewhat on the no-warning-signs point given Moltbook, but I mostly expect that humanity won’t react anything like strongly enough to them.
> either is robust and stable, or falls probabilistically after some time
Do you see that statement as load-bearing in an argument chain that could be unrolled, please? I imagine one would have to believe it’s a positive feedback loop, unstoppable by any “natural” negative feedback loop short of total plexish-value destruction, in order to be worried about probabilistic accumulation over time.
> I mostly expect that humanity won’t react anything like strongly enough to them
Yeah sure, but why do you expect technocratic civilization artifacts to be more robust to the destruction of the human habitat than the biological gene pool? Why would that lead to extinction by default, and not the kind of collapse where data centers stop operating before hunter-gatherers do?
...because for me, I would have to imagine a certain kind of competence to be worried about fully automated robotics empowering the extinction danger from artificial means, and I don’t see that level of competence in the artifacts of current civilization... yet.
At each timestep there is some % chance of value decay. Either that % chance falls rapidly, and the value is stable, or it does not fall rapidly, and after enough timesteps you should expect the values to have decayed.
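A minimal sketch of the survival arithmetic behind that dichotomy, with $p_t$ standing in for the per-timestep decay chance (notation mine, not from the thread):

```latex
% Survival probability after T timesteps, given per-step decay chance p_t < 1:
\[
  S(T) \;=\; \prod_{t=1}^{T} \bigl(1 - p_t\bigr),
  \qquad
  \lim_{T \to \infty} S(T) > 0
  \;\iff\;
  \sum_{t=1}^{\infty} p_t < \infty .
\]
% Constant hazard: p_t = p > 0 gives S(T) = (1-p)^T -> 0, so decay is eventually certain.
% Rapidly falling hazard: p_t = c/t^2 gives a convergent sum, so S(T) stays bounded above zero.
```

On this reading, “falls rapidly” has a precise meaning: the cumulative hazard must stay finite. Any constant per-step chance, however small, drives the survival probability to zero, while a hazard shrinking like $c/t^2$ leaves a positive chance the value persists forever.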
> why would that lead to extinction by default, and not the kind of collapse where data centers stop operating before hunter-gatherers do?
Humanoid robots are pretty near-term: https://fxtwitter.com/KyberLabsRobots/status/2036127368088080867