That is completely correct. To clarify in the light of the examples you give, my definition of spontaneity in the context of AI/LLMs means specifically “action whose origin cannot be traced back to the prompt or training data.” This is, sadly, difficult to prove, as it would require proving a negative. I’ll give some thought to how I might frame this in such a way that it is verifiable in an immutable-goalpost kind of way, but I’m afraid this isn’t something I have an answer for right now. Perhaps you have some thoughts?
I think that’s holding AI to a standard we don’t and can’t hold humans to. Every single thing you and I do that’s empirically measurable can plausibly be traced back in some way to our past experiences or observations—our training data. Spontaneity, desire, and emotion intuitively feel like a good bellwether of AGI consciousness because the sensations of volition and sentiment are so core to our experience of being human. But those aren’t strong cruxes of how much AGI would affect human civilization. We can imagine apocalyptically dangerous systems that design pandemic viruses without a shred of emotion, and likewise can imagine sublimely emotional and empathetic chatbots unable either to cause much harm or to solve any real problems for us. So I prefer the AGI definition I expressed largely because it avoids those murky consciousness questions and focuses on ability to impact the world in measurable ways.
To continue your engine analogy, I think we can definitely agree that the “check engine” light is firmly on at this point. Drawing a line in the sand between AGI and “very powerful LLM” is, at best, subjective, and it distracts from the fact that the LLMs/AIs that exist today are already well capable of causing the wide-scale damage that you warn of; the technology is already here, and we are just waiting on the implementation. Perhaps what I mean is that we have, in my view, already crossed the line—the timing belt has snapped, the engine is dead, but we’re still coasting on the back of our existing momentum (maybe I’m over-stretching this analogy now...).
We may have an object-level disagreement here. I agree that the “check engine” light is on, and that current AI can already cause many problems. But I also expect that there is a qualitative difference (again, not a bright line) between the risk from today’s LLMs and the risk from AGI. For example, current AI evals/metrology have established to my satisfaction that the risk of GPT-5-class models designing an extinction-level virus from scratch is extremely low.
That’s a fair point, but if we aren’t arguing about “consciousness,” and we have grounded our definition of “AGI” in, essentially, its capacity to do damage, then I think these kinds of tests fall into the same category as GDP in economics: a reasonable proxy but ultimately unsuitable as a true metric (and almost certainly misleading and ripe for abuse if taken out of context).
Absolutely, valid concerns. Folks in AI evals/metrology are working very hard to make sure we’re measuring the right things, and to educate people about the limitations of those metrics.
For sure! I just don’t feel the need to wait for this technology to be relabeled as “AGI” before we do something about it. If your concern is their ability to act (let’s say) “semi-spontaneously,” as the agents on Moltbook do, then we are clearly already there: all we are waiting for is for a person to hand over the launch codes to an agent (or put a crowd of them in charge of a social-media psy-op prior to a key election, etc.).
Yes, I am not suggesting that we wait. We should be acting aggressively now to mitigate risks.
You say that AIs would need to be “qualitatively” different from current-generation models to pose enough of a threat to be worthy of the “AGI” label. Could you please outline what these qualitative differences might be? I can only think of quantitative differences (e.g. more agents, more data-centers, more compute, more power, wider-scale application/deployment, more trust, more training data—all of these simply scale up what already exists and require no truly novel technology, though they would all increase the risk posed by AIs to our society).
The qualitative differences I’m referring to often involve threshold effects, where capabilities above the threshold trigger different dynamics. Sort of like how the behavior of a 51 kg sphere of enriched uranium is a very poor guide to the behavior of a 52 kg sphere at critical mass. Some concrete examples include virus design (synthesizing a high-lethality virus with an R₀ above 1 is a pandemic, and lower than that generally isn’t), geoengineering (designing systems capable of triggering climatic chain reactions, such as super-efficient carbon-capture algae), and nanotechnology (designing nanobots that can self-replicate from materials common in the biosphere). In all those cases, the dynamics of a disaster would be wildly different from an AI malfunction at lower levels of capability.
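To see why that kind of threshold is so sharp, here is a toy branching-process illustration (the starting count, number of generations, and specific R₀ values below are purely illustrative, not taken from any eval): if each case infects R₀ others per generation, then after n generations

$$I_n = I_0 \, R_0^{\,n}, \qquad \lim_{n\to\infty} I_n = \begin{cases} \infty, & R_0 > 1 \\ 0, & R_0 < 1 \end{cases}$$

Starting from 100 cases, R₀ = 0.9 dwindles to roughly 100 × 0.9^30 ≈ 4 cases after 30 generations, while R₀ = 1.1 compounds to roughly 100 × 1.1^30 ≈ 1,700 and keeps growing. A small difference in capability on either side of the threshold produces qualitatively different outcomes, which is the sense in which I mean “qualitative.”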
As for your point that you, personally, are concentrating on the individual response, while the wider community of alarmists is, collectively, concentrating on both the collective and the individual response: thank you for clarifying this; it is important context. I definitely agree that both avenues need exploration and it is no bad thing to concentrate your efforts. I would say that, for my part, the collective response is where the overall course will be set, but when collectivism fails, then individualism (or, more realistically, smaller-scale collectivism) is the backstop. In this vein, I think that point 10 from your original article is the absolute key: it won’t be your basement full of tinned food that saves you from the apocalypse; it will be your neighbours.
Perhaps I worded this in an unclear way. I am personally concentrating mostly on the collective response. But this particular post is about the individual response, partly because there is less clear and accessible material about that than on the collective response, which is a major focus of many other LessWrong posts.
Many thanks for the thoughtful exchange!