The extensive discussion of trends in global datacenter/Nvidia revenue shows that the framing treats the human economy as a whole as the system driving an eventual AI takeoff, on the assumption that there are always essential complementary inputs that can't be abstracted out.
A software-only singularity is instead about scaling laws for a different system, one that is not the entire economy: its relevant inputs are specific AIs (varying in capabilities and compute efficiency) and the novel software and cultural knowledge they produce, rather than more material forms of capital, compute, or data from the physical world. An intermediate construction is an AI/robot economy that is highly decoupled from the human economy and does its own thing at its own pace.
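To make the contrast concrete, here is a minimal toy simulation of such a closed loop, with physical compute held fixed and all parameter values hypothetical. The point is only that whether the loop accelerates or fizzles hinges on a returns-to-software parameter internal to the loop, not on anything about the wider economy:

```python
# Toy model of a software-only feedback loop (all numbers hypothetical):
# physical compute is held fixed, and capability grows only through
# software efficiency gains produced by AI research labor.

def simulate(r, steps=400, efficiency=1.0, compute=1.0, rate=0.01):
    """r > 1: each efficiency gain makes the next one easier, so the
    loop accelerates; r < 1: returns diminish and progress slows,
    even though compute never changes."""
    for _ in range(steps):
        research_output = compute * efficiency     # effective research labor
        efficiency += rate * research_output ** r  # software progress
    return efficiency

print(simulate(r=1.2))  # superlinear returns: accelerates without bound
print(simulate(r=0.8))  # diminishing returns: no takeoff
```

The same fixed compute budget produces wildly different trajectories depending only on r, which is exactly the kind of question that analysis of the outer economy can't settle.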
Early trends of an algal bloom shouldn't be read off the total mass of organic matter in the ocean. The choice of which system to treat as relevant carries more of the argument than a detailed analysis of any given system. In the post, Ege Erdil makes the point that we know very little about the system in which a possible software-only singularity would take place:
It’s just hard to be convinced in a domain where the key questions about the complexity of the job of a researcher and the complementarity between cognitive and compute/data inputs remain unanswered.
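One way to see why the complementarity question in that quote is the crux: a standard CES production function (used here purely as an illustration, not anything from the post) makes the degree of complementarity an explicit parameter, and that parameter alone determines how far scaling the cognitive input gets you while compute and data stay fixed:

```python
# CES aggregate of cognitive input C and compute/data input K
# (illustrative parameterization, not from the post):
#   Y = (a * C**rho + (1 - a) * K**rho) ** (1 / rho)
# rho near 1: near-perfect substitutes; rho << 0: strong complements,
# where output is capped by the scarce input.

def ces(C, K, rho, a=0.5):
    return (a * C ** rho + (1 - a) * K ** rho) ** (1 / rho)

K = 1.0  # hold compute/data fixed
for rho in (0.9, -0.5, -5.0):
    gain = ces(1000.0, K, rho) / ces(1.0, K, rho)
    print(f"rho = {rho:+.1f}: 1000x cognition -> {gain:.1f}x output")
```

With near-substitutes (rho = 0.9), a 1000x cognitive scale-up yields roughly a 460x output gain; with strong complements (rho = -5.0), it yields about 1.1x, which is the "essential complementary inputs" world of the parent framing.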
This uncertainty is a reason the disagreement about which systems are relevant persists: those who feel that software-only recursive self-improvement can work, and that it is therefore a relevant system, will fail to convince those who don't, and conversely. But instead of discussing the crux of which system is relevant (which has to be about the details of recursive self-improvement), only the proponents will tend to talk about the software-only singularity, while the opponents will talk about other systems whose scaling they see as more relevant, such as the human economy or the datacenter economy.
In the current regime, pretraining scaling laws tether AI capabilities to the compute of a single training system, but not to the total amount of compute (or revenue) in datacenters worldwide. This in turn makes the finances of individual AI companies and hardware improvements the relevant variables, and they will remain similarly crucial if long reasoning training takes over from pretraining; the difference is that AI company money would then buy inference compute for RL training from many datacenters, rather than time on a single large training system. A pivot to RL (if possible) lifts some practical constraints on the extent of scaling, including the need to coordinate construction of increasingly large and expensive training systems that are suboptimal for other purposes. This might let the current scaling regime extend for another 3-4 years, until 2030-2032, since an AI company would only need to cover the cost of a training run rather than arrange construction of a training system, a difference of roughly 10x.
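A back-of-envelope version of that last claim, with the growth rate of frontier training costs as an explicit assumption (the ~10x headroom figure is from the paragraph above; the growth rates are illustrative):

```python
import math

# If renting compute for a run costs ~10x less than building a dedicated
# training system, a fixed fundraising capacity buys 10x more compute,
# and that headroom lasts log(10) / log(g) years when the cost of
# frontier training grows g-fold per year (g values are assumptions).

headroom = 10.0
for g in (2.0, 2.5, 3.0):
    extra_years = math.log(headroom) / math.log(g)
    print(f"{g:.1f}x/year cost growth -> ~{extra_years:.1f} extra years")
```

At 2-3x/year cost growth this gives roughly 2-3.5 extra years, in the same ballpark as the "another 3-4 years" estimate at the slower growth rates.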
But instead of discussing the crux of which system is relevant (which has to be about the details of recursive self-improvement), only the proponents will tend to talk about the software-only singularity, while the opponents will talk about other systems whose scaling they see as more relevant, such as the human economy or the datacenter economy.
Totally agree! Thank you for phrasing it elegantly. This is basically what I commented on Ege's post yesterday: I asked him to engage with the actual crux and make arguments about why a software-only singularity is unlikely.