I understand why you doubt that anything is more data-efficient or compute-efficient than the human brain alone. The problem is that the AIs are trained on far more compute and data. As I noted when commenting on Yudkowsky’s attempt to explain the danger of ASI, an ASI has access to far more training data and compute than a human, so even a learning algorithm hundreds of times weaker than the human brain’s would still let the AI learn more about the world than any human knows.
Returning to the proposals which you list as failing to create the ASI:
Just increasing compute. Scaling laws track the wrong objective: loss is far easier to measure than benchmark performance, and the latter is nonzero only for sufficiently large models.
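A toy numeric illustration of that mismatch (all constants are made up, not fitted to any real model family): loss falls smoothly as a power law in parameter count, while a benchmark that requires many steps to all be correct stays near zero until the model is large enough.

```python
# Toy illustration (made-up constants, not fitted to any real model family):
# loss falls smoothly with model size, while a benchmark that needs k steps
# to all be correct stays near zero until per-step accuracy gets high enough.

def loss(n_params: float) -> float:
    """Power-law loss curve: irreducible term plus a term shrinking with size."""
    return 1.7 + 300.0 * n_params ** -0.3

def benchmark_accuracy(n_params: float, k_steps: int = 20) -> float:
    """Probability of getting all k reasoning steps right, with per-step
    accuracy loosely tied to the loss (purely illustrative mapping)."""
    per_step = max(0.0, 1.0 - (loss(n_params) - 1.7))
    return per_step ** k_steps

for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
    print(f"{n:.0e} params: loss={loss(n):.3f}, benchmark={benchmark_accuracy(n):.4f}")
```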
Higher-quality data. Suppose that you would like the model to predict the idea contained in the next set of tokens (e.g. via the recent CALM technique, where the LLM generates an entire sequence of tokens at once). Then one can, for example, ask a not-so-intelligent LLM to check whether the student’s idea and the real idea are the same.
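A minimal sketch of that checking step, assuming a CALM-like setup where the student produces a short natural-language “idea” for the next chunk of tokens; `call_judge` is a placeholder for whatever small, cheap judge model is available, and the prompt format is only an assumption.

```python
# Sketch of grading a student model's "idea" with a weaker judge LLM.
# `call_judge` is a placeholder for a small-model API; wire it up yourself.

def call_judge(prompt: str) -> str:
    raise NotImplementedError("connect this to a small, cheap LLM")

def same_idea(student_idea: str, reference_idea: str) -> bool:
    """Ask a not-so-intelligent LLM whether two ideas match; checking a match
    is easier than generating the idea in the first place."""
    prompt = (
        "Do these two passages express the same underlying idea? "
        "Answer strictly YES or NO.\n\n"
        f"Passage A:\n{student_idea}\n\nPassage B:\n{reference_idea}\n"
    )
    return call_judge(prompt).strip().upper().startswith("YES")
```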
Synthetic data. Synthetic datasets could also allow things like having models try to solve problems and then training them on the successful solutions.
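A sketch of that loop under the assumption that each problem comes with a cheaply checkable final answer; `generate_solution` stands in for sampling from the model being trained, and the names are illustrative.

```python
# Sketch of the "train on your own successful solutions" loop,
# assuming each problem has a known, cheaply checkable final answer.

def generate_solution(problem: str) -> tuple[str, str]:
    """Return (reasoning, final_answer) sampled from the current model."""
    raise NotImplementedError("sample from the model being trained")

def build_synthetic_dataset(problems: list[tuple[str, str]],
                            attempts_per_problem: int = 8) -> list[tuple[str, str]]:
    dataset = []
    for problem, known_answer in problems:
        for _ in range(attempts_per_problem):
            reasoning, answer = generate_solution(problem)
            if answer.strip() == known_answer.strip():   # keep only verified successes
                dataset.append((problem, reasoning))
                break
    return dataset    # fine-tune the model on these (problem, reasoning) pairs
```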
Curriculum learning. Agreed.
Using another, smaller LLM as an evaluator. The point is that it’s far easier to check whether a purported proof holds than to generate a correct proof in the first place. Consider the P=NP conjecture, which is widely believed to be false: it asserts that for any problem where a solution can be checked in polynomial time (e.g. verifying that an assignment of variables satisfies every clause of a 3-SAT formula simply by substituting the values), a solution can also be generated in polynomial time.
So a smaller LLM rejecting the bigger LLM’s outputs would likely teach the latter to be more intelligent than the smaller model.
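The asymmetry in miniature: verifying a 3-SAT assignment takes time linear in the size of the formula, while finding a satisfying assignment is NP-hard in the worst case. A small self-contained checker:

```python
# Checking an assignment against a 3-SAT formula is linear in the formula's
# size, while *finding* a satisfying assignment is NP-hard in the worst case.
# A formula is a list of clauses; literal +i means variable i is true,
# literal -i means variable i is false.

def check_3sat(formula: list[list[int]], assignment: dict[int, bool]) -> bool:
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x1 or not x2 or x3) and (not x1 or x2 or x3)
formula = [[1, -2, 3], [-1, 2, 3]]
print(check_3sat(formula, {1: True, 2: True, 3: False}))   # True: both clauses satisfied
```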
RLHF ???
Transformers and “attention”: Mostly correct. Alas, SOTA LLMs are capable of things like solving IMO problems using only their fixed pre-trained weights and a chain of thought in which they write out their reasoning. We have yet to rule out possibilities like neuralese recurrence, where a far larger part of the model’s internal state is carried between steps.
Having models learn online means either that they modify only a small part of their weights, or that the fast-access long-term memory required to serve the model reaches tens of gigabytes per user, which is infeasible.
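Rough arithmetic behind that claim, with purely illustrative numbers (no specific deployed model is implied):

```python
# Back-of-the-envelope arithmetic for per-user online learning.
# All numbers below are illustrative assumptions, not figures for any real model.

n_params         = 400e9   # assume a 400B-parameter model
bytes_per_param  = 2       # bf16/fp16 weights
updated_fraction = 0.10    # suppose only 10% of the weights are personalized per user

per_user_bytes = n_params * bytes_per_param * updated_fraction
print(f"per-user delta: {per_user_bytes / 1e9:.0f} GB")   # ~80 GB even for a 10% slice
```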
While I failed to understand the rest of the proposals that you list as unable to scale to ASI, that conclusion requires a more delicate analysis, like the ones by Toby Ord and me.
Now that these problems have been gathered in one place, we can try to unpack them all.
1. This set of problems is the most controversial. For example, the possibility of astronomical waste can be undermined by claiming that mankind was never entitled to the resources it could’ve wasted. The argument related to bargaining and logical uncertainty can likely be circumvented as follows.
Logical uncertainty, computation costs and bargaining over potential nothingness
Suppose that Agent-4 from the AI-2027 forecast is trying to negotiate with DeepCent’s AI, and DeepCent’s AI makes the argument involving the millionth digit of π. Calculating the digit establishes that there is no universe where the millionth digit of π is even, and hence that there’s nothing to bargain for.
On the other hand, if DeepCent’s AI makes the same argument involving the 10^43-th digit, then Agent-4 could also make a bet, e.g. “Neither of us will have access to that part of the universe until someone calculates the digit: if it is actually odd, DeepCent gives the secured part to Agent-4 (since DeepCent’s offer was fake); if it is even, the part goes to DeepCent (in exchange for the parallel universe or its part being given[1] to Agent-4)”. However, calculating the digit could require at least around 10^43 bitwise operations,[2] and Agent-4 and its Chinese counterpart might prefer to spend that much compute on whatever they actually want.
If DeepCent makes a bet over the 10^(10^10)-th digit, then neither AI is able to verify it; both may guess that each parity has probability close to one half and that they should just split that part of the universe in exchange for a similar split of the parallel universe.
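For completeness, a sketch of how such a bet could be settled at indices the AIs can actually afford to check (using mpmath; the index, names and precision margin are illustrative). For an index near 10^43, let alone 10^(10^10), no brute-force computation of this kind is feasible, which is the point above.

```python
# Settling the parity bet for a modestly sized index n; hopeless for n ~ 10**43.
from mpmath import mp

def nth_decimal_digit_of_pi(n: int) -> int:
    """Return the n-th digit of pi after the decimal point (1-indexed)."""
    mp.dps = n + 10                    # working precision with a safety margin
    pi_val = +mp.pi                    # evaluate pi at the current precision
    digits = mp.nstr(pi_val, n + 5)    # "3." followed by at least n decimal digits
    return int(digits[n + 1])          # digits[0:2] == "3.", so offset by 1

n = 1_000                              # cheap; n = 10**6 is still tractable, 10**43 is not
d = nth_decimal_digit_of_pi(n)
print(f"digit #{n} of pi is {d} ({'even' if d % 2 == 0 else 'odd'})")
```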
However, if AIs acting on behalf of Agent-4 and its Chinese counterpart actually meet each other, then doing mechinterp on each other becomes easy, and the AIs learn everything about each other’s utility functions and precommitments.
2. My position is that one also needs to consider worst-case scenarios, like the one where sufficiently capable AIs cannot be aligned to anything useful aside from improving human capabilities (e.g. by acting as AI teachers rather than other types of AI workers). If that is the case, then aligning the AI to a solution of the human-AI safety problems becomes unlikely.
3. The problem 3di of humans being corrupted by power seems to have a far more important analogue. Assuming solved alignment, there is an important governance problem: preventing Intelligence Curse-like outcomes where humans become obsolete, either for the elites in general or for a few true overlords. Whatever governance prevents the overlords from appearing could also be used to prevent humans from wasting resources in space.[3]
4. A major part of the problem is the AI race, which many people have been trying to stop (see, e.g., the petition not to create AGI, Yudkowsky’s IABIED cautionary tale, or Kokotajlo et al.’s AI-2027 forecast). Post-AGI economics assuming solved alignment is precisely what I discussed in point 3.
What I don’t understand is how Agent-4 actually influences the parallel universe. But this is a different subject.
Actually, I haven’t estimated the number of operations necessary to calculate the digit of π. But the main point of the argument was to avoid counterfactual bargaining over hard-to-verify conditions.
For example, by requiring that distant colonies be populated with humans or other minds who are either capable of governing themselves or multilaterally agreed to be moral patients (e.g. this excludes controversial stuff like shrimps on heroin).