I would define an AGI system as anything except for a Verifiably Incapable System.
My take is that a VIS can be constructed in one of the following ways.
It can be trained ONLY on a verifiably safe dataset (for example, a dataset too narrow to be dangerous, such as protein structures).
Alternatively, it can be a CoT-based architecture with at most [DATA EXPUNGED] compute used for RL, since the scaling laws for such architectures are likely known.
Finally, it could be something explicitly approved by a monopolistic governance body, following a review of researchers' opinions.
A potential approval protocol could be the following, though we would need to ensure that the experiments themselves can't lead to the accidental creation of an AGI.
Discover the scaling laws for the new architecture by conducting experiments (e.g., a neuralese model might have capabilities similar to those of a CoT model RL-trained on the same number of tasks with the same number of bits transferred; but the experimental model must remain primitive).
Extrapolate the capabilities to the scale at which they could become dangerous or susceptible to sandbagging.
If the capabilities are highly unlikely to become dangerous upon scaling up, then one can train a new model with a similar architecture and evaluate it on as many benchmarks as humanly possible.
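The extrapolation step in the protocol above can be sketched numerically: fit a power law (capability ∝ compute^b) to small, safely-sized experimental runs, then extrapolate to the proposed training compute and compare against a danger threshold. All the numbers below are made up for illustration; real exponents and thresholds would have to come from the actual experiments, and a naive power law ignores saturation effects.

```python
# Hypothetical sketch of the scaling-law step of the approval protocol.
# Every constant here (compute budgets, scores, threshold) is invented
# for illustration only.
import numpy as np

# Benchmark scores measured on small, safely-sized training runs.
compute = np.array([1e18, 1e19, 1e20, 1e21])  # FLOPs used for RL
score = np.array([0.10, 0.16, 0.25, 0.40])    # capability benchmark score

# Fit log(score) = log(a) + b * log(compute), i.e. score = a * compute^b.
b, log_a = np.polyfit(np.log(compute), np.log(score), 1)

def predicted_score(c):
    """Extrapolated capability at compute c, under the power-law assumption."""
    return np.exp(log_a) * c ** b

DANGER_THRESHOLD = 0.9  # hypothetical level where sandbagging/danger begins

proposed_compute = 1e24
risky = predicted_score(proposed_compute) >= DANGER_THRESHOLD
print(f"exponent b = {b:.3f}, "
      f"predicted = {predicted_score(proposed_compute):.2f}, risky = {risky}")
```

If `risky` comes out true, the architecture would fail the protocol at that compute budget and approval would be withheld or the budget lowered.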
This is more like “not-obviously not-AGI”.
But besides that, yeah, it seems like an OK starting point for thinking about proper definitions for the purpose of a ban.
Would you define ‘nuclear weapon’ as ‘anything not produced in a way that verifiably could not contain any nuclear material’?
(Keep in mind that this would categorize e.g. a glass of tap water as a nuclear weapon.)
There were no nuclear weapons in 1925; so anything that existed in 1925 is known to not be a nuclear weapon. (Moreover, anything built to a 1925 design isn’t one either.)
The smallest critical mass is greater than 1kg; so anything smaller than 1kg is not a nuclear weapon (though it may be part of one).
Nuclear weapons are made of metal, not wood, cloth, paper, or clay; so anything made of wood, cloth, paper, or clay is not a nuclear weapon. (Thus for instance no conventionally printed book is a nuclear weapon, which is convenient for maintaining freedom of the press.)
A wise man called @Leon Lang told me recently: “a definition that defines something as not being another thing, is flawed”.
Critical masses for nuclear weapons can be determined, at worst at the risk of a nuclear explosion and at best without any risk, by bombarding nuclei with neutrons and studying the energies of the emitted particles and the reaction cross sections.
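This point can be made concrete: once the fission cross section and neutron multiplicity are measured at the bench, a crude one-group diffusion argument already bounds the critical mass well above 1 kg, with no explosion required. The constants below are standard textbook values for U-235; the formula is a deliberate order-of-magnitude sketch, not a real criticality calculation (transport codes give roughly 8.5 cm and ~52 kg for a bare U-235 sphere).

```python
# Order-of-magnitude estimate of the critical radius of a bare U-235
# sphere from measured nuclear data. Crude one-group diffusion model:
# a neutron random-walks one mean free path per fission, and the sphere
# is critical roughly when R ~ mfp / sqrt(nu - 1).
import math

N_A = 6.022e23      # Avogadro's number, 1/mol
rho = 19.1          # density of uranium metal, g/cm^3
A = 235.0           # molar mass of U-235, g/mol
sigma = 1.2e-24     # fast-fission cross section, cm^2 (~1.2 barn)
nu = 2.5            # neutrons released per fission

n = rho * N_A / A                 # nuclei per cm^3
mfp = 1.0 / (n * sigma)           # neutron mean free path, cm
R_c = mfp / math.sqrt(nu - 1.0)   # crude critical radius, cm
M_c = rho * (4.0 / 3.0) * math.pi * R_c**3 / 1000.0  # critical mass, kg

print(f"mean free path ~ {mfp:.1f} cm, R_c ~ {R_c:.1f} cm, M_c ~ {M_c:.0f} kg")
```

Even this back-of-the-envelope version lands orders of magnitude above 1 kg, which is the sense in which the "smaller than 1 kg" exclusion above is verifiable from bench-scale data alone.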
However, we don’t know the critical mass for AGI. An approval protocol could be the equivalent of determining the critical mass for every new architecture; but the accidental creation of an AGI capable of breaching containment, or of being used for AI research, sabotaging alignment work, and aligning the ASI to the AGI’s whims instead of mankind’s ideals, would be the equivalent of a nuclear explosion igniting the Earth’s atmosphere.
P.S. The possibility that a nuclear explosion could ignite the Earth’s atmosphere was considered by scientists working on the Manhattan Project.
Fusion is also a thing. A glass of tap water contains an (admittedly very small) amount of deuterium.