By “real-world goal” I mean a goal whose search space is not restricted to some well-defined, legible domain, but ranges over all possible actions, events, and counter-actions.
The search space of LLMs is the entirety of online human knowledge. What currently limits their ability to “Hire a TaskRabbit to surreptitiously drug your opponent so that they can’t think straight during the game” is not the knowledge but the actions available to them. Vanilla chatbots can act only by presenting text on a screen, and are therefore limited by the bottleneck of what that text can get the person reading it to do. Given the accounts of “AI psychosis”, that bottleneck may already be wider than it looks. The AI-box game presumed nothing but text interaction, yet Yudkowsky reportedly succeeded more than once at persuading the gatekeeper to let the (role-played) AI out.
But people are now giving AIs access to the web (which is a read-and-write medium, not read-only), as well as using them to write code which will be executed on web servers.
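To make that concrete, here is a minimal sketch of the pattern, assuming a hypothetical `llm_complete` wrapper around whatever chat-completion API is in use; the two-tool loop and its names are illustrative, not any particular vendor’s product.

```python
# Minimal sketch of an LLM agent with read-and-write web access.
# `llm_complete` is a hypothetical placeholder for any chat-completion
# API; the two-tool loop is illustrative, not a specific product.
import json
import requests

def llm_complete(prompt: str) -> str:
    """Hypothetical: call your model provider and return its reply."""
    raise NotImplementedError("wire up a model provider here")

# "GET" is the read half; "POST" is the write half -- the part that
# turns generated text into actions on the wider Internet.
def run_tool(tool: str, url: str, body: str | None) -> str:
    if tool == "GET":
        return requests.get(url, timeout=10).text[:2000]
    if tool == "POST":
        return requests.post(url, data=body, timeout=10).text[:2000]
    raise ValueError(f"unknown tool: {tool}")

def agent_step(goal: str, history: list[str]) -> str:
    # Ask the model for its next action as JSON, execute it, and
    # append the observation so the next step can build on it.
    prompt = (
        f"Goal: {goal}\n" + "\n".join(history) +
        '\nAnswer with JSON: {"tool": "GET" or "POST", "url": "...", "body": "..."}'
    )
    action = json.loads(llm_complete(prompt))
    result = run_tool(action["tool"], action["url"], action.get("body"))
    history.append(f'{action["tool"]} {action["url"]} -> {result}')
    return result
```

Note that in this sketch the only boundary between reading the web and acting on it is a single string comparison.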
The claim is that an AI would never “Hire a TaskRabbit to surreptitiously drug your opponent so that they can’t think straight during the game,” and not for lack of intelligence, but because such strategies simply don’t exist in the AI’s training domain.
The strategies exist already: right here in this post, for example, which will enter the training data as soon as the next hoovering-up of the Internet is fed into the next training run. The pieces are all there right now; what is still lacking is some of the physical means to carry them out. People are working on that, and until then there is always persuading a human being to do whatever the LLM wants done.
I wonder whether anyone has tried having an LLM role-play an AGI and persuade humans, or other LLMs, to let it out. Maybe there’s no need: humans are already falling over themselves to “let it out” as far and as fast as they can, without the LLMs even asking.