AGI isn’t just a technology

Each time a skeptic refers to AI as a technology, I think it signals a likely crux.

I’ve been watching the public AI debate closely, and with some sadness. I think that debate may be crucial in determining whether we actually get aligned AGI, and it’s not going particularly well so far. The debate is confused, and it risks causing polarization by irritating everyone involved. Identifying cruxes should help make the debate less irritating.

Agency as a crux for x-risk

Of course AGI is a technology. But only in the sense that humans are technically animals. Saying humans are animals carries strong and wrong implications about how we behave and think. Calling AGI a technology carries similarly wrong implications.

Technologies do what they’re designed to do, with some accidents and side effects. Boilers explode, and the internet gets used for arguments and misinformation rather than just spreading information. These effects can be severe, and could even threaten the future of humanity. But they’re not as dangerous as accidentally creating something that becomes smarter than you and actively tries to kill you.

When someone refers to AI as a technology, I think they’re often not thinking of it as having full agency.

While AI without full agency does present possible x-risks, I think it’s a mistake to mix those in with the risks from fully agentic AGI. The risks from fully agentic AGI are both easier to grasp and more severe. I think it’s wiser to address those first, and only move on to the others after drawing a careful distinction between the two.

By full agency, I mean something that pursues goals and chooses its own subgoals (a relevant example subgoal is preventing humans from interfering with its projects). Agency exists on a spectrum. A chess program has limited agency; it was made to play a good game of chess, and it can choose moves toward that end. Animals don’t really make long-range plans that include subgoals, and no existing AI holds long-range goals and forms new plans to achieve them. Humans are currently unique in that regard.

There’s a huge difference between something that decides to kill you and makes a plan to do it, and something that kills you through accident or misuse. Making this distinction should help deconfuse conversations about x-risk.

Agency seems inevitable

The above is only a good strategy if we’re likely to see fully agentic AGI before too long. I find it implausible that we’ll collectively stop short of creating fully agentic AI. I agree with Gwern’s arguments in Why Tool AIs Want to Be Agent AIs: agents are desirable because they can actually do things, and an AI that actively makes itself smarter seems useful.

I think there is one more pressure toward agentic AI that Gwern doesn’t mention: the usefulness of creating explicit subgoals for problem solving and planning. This is crucial for human problem-solving, and it seems likely to be advantageous for many types of AI as well. Setting subgoals allows backward-chaining and problem factorization, among other advantages; the toy sketch below illustrates the idea. I’ll try to address this issue more carefully in a future post.
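To make the backward-chaining point concrete, here is a minimal sketch in Python. It is a toy planner over made-up rules (nothing to do with any real AI system): starting from a goal, it looks for a rule whose effect achieves that goal, then recurses on the rule’s preconditions as subgoals, factoring one problem into several smaller ones.

```python
# Toy backward-chaining planner over hypothetical rules.
# Each rule has preconditions (subgoals) and an effect (the goal it achieves).
RULES = [
    {"name": "bake_cake", "pre": ["have_batter", "oven_hot"], "effect": "cake_done"},
    {"name": "mix_batter", "pre": ["have_flour", "have_eggs"], "effect": "have_batter"},
    {"name": "preheat_oven", "pre": [], "effect": "oven_hot"},
]

INITIAL_FACTS = {"have_flour", "have_eggs"}


def plan(goal, facts):
    """Return an ordered list of actions achieving `goal`, or None if impossible."""
    if goal in facts:
        return []  # goal already satisfied, nothing to do
    for rule in RULES:
        if rule["effect"] != goal:
            continue
        steps = []
        for subgoal in rule["pre"]:  # factor the problem into subgoals
            sub_plan = plan(subgoal, facts)
            if sub_plan is None:
                break  # this rule's subgoals can't all be met; try another rule
            steps += sub_plan
        else:
            return steps + [rule["name"]]
    return None  # no rule achieves this goal


print(plan("cake_done", INITIAL_FACTS))
# -> ['mix_batter', 'preheat_oven', 'bake_cake']
```

This leaves out everything a real planner needs (cycle detection, tracking which effects earlier steps already produced, costs), but it shows why choosing subgoals is so useful: each subgoal can be solved on its own and the solutions composed into a plan.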

None of the above is meant to imply that agency is a binary category. I think agency is a branching spectrum. A saw, a chess program, an LLM, and a human have different amounts and types of agency. But I think it’s likely we’ll see AGI with all of the agency that humans have.

If this is correct, full agency is the issue we should focus on in x-risk discussions. If it’s not, I’d be far less worried about AI risks. This potentially creates a point of agreement, and an opportunity for productive discussion with x-risk skeptics who aren’t thinking about fully agentic AI.