Now is as good a time as any to describe my model for a solution to the phenomenon that you describe. It seems that we’re being “attacked” by a rogue attractor state (what you and Land call Pythia). It can be roughly described as “a sequence of arguments that, once internalised, make certain beliefs or actions seem like the only reasonable ones.” The arguments consist of a sequence of frames around ideas like optimisation, power-seeking, instrumental convergence, and machine intelligence. The beliefs and actions they incentivise include the following: that capabilities racing is the only thing we can do, that fear/awe of superintelligence is natural, that ASI emergence is inevitable, that it is reasonable to sacrifice other values for the sake of having a stake in ASI development, that the importance of AI is total and all-encompassing, etc.
I would analogise this “package” to a particular solution of a system of linear equations, since it is in fact a compact answer to a lot of questions one might rightfully ask about current society. Anyone who has internalised the payload and asks questions like “how do we guard against x-risk? how do we cure cancer? how do we live forever? how do we solve our massive coordination issues? how do we stop the inevitable rise of [bad people of your choice go here]?” is supplied a kind of universal answer whose minimal form is something like “to [do the thing we want], we must solve intelligence, and then use intelligence to solve everything else.”
The hitch is that at this point some people get scared about unleashing uncontrollable runaway intelligence optimisation on the world. So they start talking about, for example, AI safety. Except the payload is still active for most of them, so their thoughts end up being shaped like “to [ensure that the world is safe from powerful AI], we must… solve intelligence, and then use intelligence to solve everything else.” Which leads to such conclusions as “to save the world from the racing AI labs, I must start an AI lab to join the race.” All the safety-focused justification is then backfilled in, which is why changing the RSP is fine but stopping racing is not.
I actually wrote about an early version of this weird phenomenon here, but now I think I understand it better. If you are given a really good cognitive hammer, you have a really strong incentive to cast everything into the nail category. If you are given a reality-warping cognitive black hole shaped like a hammer, the casting is no longer voluntary or even fully conscious.
To be clear, I don’t consider the discovery and propagation of this solution package inevitable. It may be that Land et al. truly succeeded in summoning a demon from a bad future whose presence is a hyperstitional curse. But at the same time, the package is here now. What I want to do is find another solution to that set of linear equations. My current idea is that Moloch and Elua are in fact two sides of the same coin. Both are descriptions of evolutionary dynamics, but with different hyperparameters on the relative strength of replication, selection, and variation. Moloch is what happens when selection and replication overpower variation, and Elua is the other way around. Pythia is just a special case of Moloch with regard to developing AI, which suggests that there is some other way to develop AI that isn’t mired in domination/optimisation/racing/total annihilation of the not-good (which, as any guru will tell you, also involves total annihilation of yourself, and likely the good that you sought to protect). Operationalising this idea will require a lot of theory work and hard thinking; if anyone’s interested, give me a shout.
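To give a rough feel for what I mean by “hyperparameters” here, below is a toy replicator-mutator sketch. It is entirely my own illustration, not a worked-out theory: the number of strategies, the payoffs, the uniform mutation kernel, and every parameter value are assumptions chosen for demonstration. The only point is that the same underlying dynamic produces qualitatively different regimes depending on the relative strength of selection versus variation.

```python
# Toy replicator-mutator dynamic (illustrative sketch; all names and values
# are assumptions). Strong selection with little variation collapses the
# population onto the single highest-payoff strategy ("Moloch"-ish), while
# weak selection with lots of variation keeps diversity alive ("Elua"-ish).

import numpy as np

def simulate(selection_strength, mutation_rate, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    n = 5                                    # number of competing strategies
    fitness = rng.uniform(0.5, 1.5, size=n)  # fixed payoff for each strategy
    x = np.full(n, 1.0 / n)                  # start from a uniform population

    # Uniform mutation kernel: a mutating individual is equally likely to
    # switch to any other strategy.
    Q = np.full((n, n), mutation_rate / (n - 1))
    np.fill_diagonal(Q, 1.0 - mutation_rate)

    for _ in range(steps):
        w = np.exp(selection_strength * fitness)  # selection reweights...
        x = (x * w) @ Q                           # ...then variation redistributes
        x /= x.sum()                              # renormalise to a distribution
    return x

# Selection and replication overpower variation: near-total collapse.
print(simulate(selection_strength=5.0, mutation_rate=0.001).round(3))
# Variation holds its own against selection: diversity persists.
print(simulate(selection_strength=0.5, mutation_rate=0.2).round(3))
```

Nothing in this sketch is specific to AI development; it is just the smallest model I know of in which “turn the selection knob up, turn the variation knob down” visibly produces the Moloch-shaped outcome.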