https://twitter.com/bioshok3
I regularly tweet in Japanese about AI alignment and the long-term risks of AGI.
In the context of Deceptive Alignment, would the ultimate goal of an AI system appear, from a human perspective, random and uncorrelated with the objectives of the training distribution? Or would the goal be correlated with the training objectives in a way humans could recognize?
For instance, the article below says that “the model just has some random proxies that were picked up early on, and that’s the thing that it cares about.” To what extent would a model actually learn such random proxies?
https://www.lesswrong.com/posts/A9NxPTwbw6r6Awuwt/how-likely-is-deceptive-alignment
If an AI system pursues ultimate goals such as power or curiosity, those goals would appear at least spuriously correlated with the base objective, whatever that objective is.
On the other hand, could it learn to pursue a goal completely unrelated to the training distribution, such as mass-producing objects of some peculiar shape?
I would like to know about the history of the term “AI alignment”. I found an article written by Paul Christiano in 2018. Did the use of the term start around this time? Also, what is the difference between AI alignment and value alignment?
https://www.alignmentforum.org/posts/ZeE7EKHTFMBs8eMxn/clarifying-ai-alignment
This is a wonderful essay — really interesting. I have one question. I do acknowledge the possibility of an intelligence explosion, but I’d like to understand in more detail the scenario you describe, like in AI 2027, where several centuries of technological progress could occur within just 1–2 years. I’m not skeptical about a technological explosion driven by superintelligence — I simply want to better understand your reasoning.
What I want to understand is how much of an “industrial explosion” — that is, an explosion in research capital — is required for a “technology explosion.” In your AI 2027 report, it seemed to me that the scenario climbs several centuries’ worth of the technology tree even without a very large industrial expansion.
Footnote 68 of the Forethought paper employs a Cobb-Douglas R&D production function (σ = 1) in its quantitative analysis of a technology explosion, with cognitive labor exponent γ = 0.7 derived from NSF R&D expenditure data. Under this assumption, an explosive increase in cognitive capability can produce centuries of technological progress even with limited physical R&D capital.
https://www.forethought.org/research/preparing-for-the-intelligence-explosion#the-technology-explosion
However, Growiec, McAdam and Mućk (2023, Kansas City Fed) directly estimated the elasticity of substitution between R&D labor and R&D capital in the idea production function, finding σ = 0.7–0.8 using U.S. data from 1968–2019. Their conclusion is that “rather than ideas getting harder to find, the R&D capital needed to find them has become scarce.”
Replacing the Cobb-Douglas assumption in footnote 68 with a CES production function using this empirically estimated σ significantly alters the conclusions.
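To make the functional forms explicit (this is my own restatement of the standard CES aggregator, with γ = 0.7 carried over from footnote 68; the paper may normalize things differently):

$$F_{\mathrm{CD}}(C,K) = C^{\gamma} K^{1-\gamma}, \qquad F_{\mathrm{CES}}(C,K) = \left(\gamma\, C^{\frac{\sigma-1}{\sigma}} + (1-\gamma)\, K^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}},$$

where the Cobb-Douglas form is recovered as σ → 1. The decisive difference is the limit for σ < 1:

$$\lim_{C \to \infty} F_{\mathrm{CES}}(C,K) = (1-\gamma)^{\frac{\sigma}{\sigma-1}}\, K,$$

so once σ < 1, no amount of additional cognitive effort can push research output beyond a fixed multiple of R&D capital (for γ = 0.7 and σ = 0.75, that ceiling is 0.3^(-3) ≈ 37 times the output attainable with today’s capital stock).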
Key findings:
Assuming C = 10^10 (an explosive increase in cognitive effort), the conclusions vary dramatically with σ:
σ = 1.0 (footnote 68's assumption): ~3x R&D capital expansion sufficient for 300 years of progress
σ = 0.75 (midpoint of Growiec et al.): 100 years of progress requires ~17x R&D capital expansion. 300 years requires several hundred thousand times expansion—equivalent to hundreds of times current world GDP
σ = 0.7 (lower bound of Growiec et al.): 100 years requires ~45x R&D capital. 300 years requires ~650,000x
The Cobb-Douglas assumption in footnote 68 is therefore decisive for the conclusion. Within the empirically supported range of σ = 0.7–0.8, material bottlenecks are far more severe than footnote 68 suggests.
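As a rough cross-check of these magnitudes, here is a minimal numerical sketch. It is not the paper’s model: I assume the CES aggregator written out above with γ = 0.7, normalize F(1, 1) = 1 for today’s inputs, and benchmark against whatever output level the Cobb-Douglas case reaches with C = 10^10 and ~3x capital (the function and variable names are mine), so the exact figures differ somewhat from those quoted above, but the orders of magnitude agree.

```python
# Minimal numerical sketch of the sigma-sensitivity argument above.
# Assumptions (mine, not the paper's): research output follows the CES
# aggregator F(C, K) = [g*C^rho + (1-g)*K^rho]^(1/rho), rho = (sigma-1)/sigma,
# with g = 0.7 as in footnote 68, normalized so F(1, 1) = 1 at today's inputs.
# The mapping from F to "years of progress" is not reproduced here; the point
# is only how strongly F depends on K once sigma < 1.
import math

GAMMA = 0.7   # cognitive-labor share, per footnote 68
C = 1e10      # assumed explosive increase in cognitive effort

def research_output(C, K, sigma, gamma=GAMMA):
    """Effective R&D input multiplier under CES (Cobb-Douglas in the sigma -> 1 limit)."""
    if abs(sigma - 1.0) < 1e-9:
        return C**gamma * K**(1.0 - gamma)
    rho = (sigma - 1.0) / sigma
    return (gamma * C**rho + (1.0 - gamma) * K**rho) ** (1.0 / rho)

# Benchmark: the output level the Cobb-Douglas case reaches with ~3x capital.
target = research_output(C, 3.0, sigma=1.0)
print(f"Cobb-Douglas (sigma=1), K=3x:  F ~ {target:.2e}")

for sigma in (0.75, 0.70):
    # With sigma < 1 the C-term vanishes as C grows, so F ~ (1-GAMMA)^(1/rho) * K:
    # capital becomes the binding factor no matter how large C is.
    for K in (1.0, 17.0, 45.0):
        print(f"sigma={sigma}, K={K:>2.0f}x:  F ~ {research_output(C, K, sigma):.3g}")
    # Capital expansion needed to match the Cobb-Douglas benchmark (bisection in log K).
    lo, hi = 0.0, 60.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if research_output(C, math.exp(mid), sigma) < target:
            lo = mid
        else:
            hi = mid
    print(f"sigma={sigma}: K needed to match that benchmark ~ {math.exp(mid):.1e}x\n")
```

Under any reasonable normalization the qualitative picture is the same: with σ in the 0.7–0.8 range, a 10^10 increase in cognitive effort on its own (K = 1) buys only roughly 17–37 times more research output, and matching the Cobb-Douglas benchmark requires capital expansions on the order of 10^5–10^6, in line with the figures above.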
The AI 2027 scenario envisions AGI/ASI rapidly climbing the technology tree within 1–2 years, achieving centuries’ worth of technological progress. In light of this analysis, several questions arise.
Questions:
Does the AI 2027 scenario implicitly assume σ ≈ 1 for the relationship between cognitive effort and physical R&D capital? If so, how do you evaluate the empirical findings of Growiec et al. (σ = 0.7–0.8)?
In a world where σ = 0.75, achievable technological progress within 1–2 years may be limited to roughly 100 years’ worth. While 100 years of progress would still be revolutionary (curing most diseases, substantially slowing aging, universal robotics, etc.), it may not reach the most ambitious technologies mentioned in AI 2027, such as mind uploading or atomic-precision nanoscale manufacturing. To what extent would the AI 2027 scenario need to be revised in this case?
Do you believe ASI could endogenously raise the effective σ toward 1 by increasing the efficiency of existing physical capital—extracting more information from the same experimental apparatus, substituting simulation for physical experimentation, and so on? If so, how much could σ plausibly rise from its historical level of 0.7–0.8 within a 1–2 year timeframe?