I No Longer Believe Intelligence to be “Magical”
I feel it’s better to post my rough and unpolished thoughts than to not post them at all, so here they are.
One significant way I’ve changed my views on risks from strongly superhuman intelligence (compared to my 2017 self, bingeing LW) is that I no longer believe intelligence to be “magical”.
During my 2017 binge of LW, I recall Yudkowsky suggesting that a superintelligence could infer the laws of physics from a single frame of video showing a falling apple (Newton apparently came up with his idea of gravity from observing a falling apple). I now think that’s somewhere between deeply magical and utter nonsense. It hasn’t been shown that a perfect Bayesian engine (with [a] suitable [hyper]prior[s]) could locate general relativity (or even just Newtonian mechanics) in hypothesis space from a single frame of video. I’m not even sure a single frame of video of a falling apple has enough bits to allow one to make that distinction in theory.
The above is based on a false recollection of what Yudkowsky said (I leave it anyway because the comments have engaged with it). Here’s the relevant excerpt from “That Alien Message”:
A Bayesian superintelligence, hooked up to a webcam, would invent General Relativity as a hypothesis—perhaps not the dominant hypothesis, compared to Newtonian mechanics, but still a hypothesis under direct consideration—by the time it had seen the third frame of a falling apple. It might guess it from the first frame, if it saw the statics of a bent blade of grass.
I still disagree somewhat with the above, but the disagreement is weaker, since this is a weaker claim than the one I misrecalled Yudkowsky as making. I apologise for any harm or misunderstanding my carelessness caused (I should have just tracked down the original post).
I think that I need to investigate at depth what intelligence (even strongly superhuman intelligence) is actually capable of, and not just assume that intelligence can do anything not explicitly forbidden by the fundamental laws. The relevant fundamental laws with a bearing on cognitive and real-world capabilities seem to be:
Decision & Game Theory
The Relevant Question: Marginal Returns to Real World Capability of Cognitive Capabilities
I’ve done some armchair style thinking on “returns to real-world capability” of increasing intelligence, and I think the Yudkowsky style arguments around superintelligence are quite magical.
It seems doubtful that higher intelligence would enable that. E.g. marginal returns to real-world capability from increased predictive power diminish at an exponential rate. Better predictive power buys less capability at each step, and it buys a lot less. I would say that the marginal returns are “sharply diminishing”.
An explanation of “significantly/sharply diminishing”:
Sharply Diminishing Marginal Returns to Real World Capabilities From Increased Predictive Accuracy
A sensible way of measuring predictive accuracy is something analogous to . The following transitions:
All make the same incremental jump in predictive accuracy.
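One measure with this property is the number of “nines” of accuracy, that is, the negative base-10 logarithm of the error rate. That choice is an assumption made here for illustration, as are the accuracy levels below; the sketch only shows how transitions that look very different in percentage terms can be equal steps on a log scale.

```python
import math


def nines(accuracy: float) -> float:
    """Accuracy on a log scale: -log10 of the error rate (an assumed measure)."""
    return -math.log10(1.0 - accuracy)


# Hypothetical accuracy levels, chosen only for illustration.
levels = [0.9, 0.99, 0.999, 0.9999]
scores = [nines(a) for a in levels]

# Size of each jump between consecutive levels on the log scale.
steps = [b - a for a, b in zip(scores, scores[1:])]

for a, s in zip(levels, scores):
    print(f"accuracy {a}: {s:.3f} nines")
print("step sizes:", [round(d, 3) for d in steps])
```

Under this measure each transition removes nine tenths of the remaining error, so every step has the same size.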
We would like to measure the marginal return to real-world capability of increased predictive accuracy. The most compelling way I found to operationalise “returns to real-world capability” was monetary returns.
I think that’s a sensible operationalisation:
Money is the basic economic unit of account
Money is preeminently fungible
Money can be efficiently levered into other forms of capability via the economy.
I see no obviously better proxy.
(I will however be interested in other operationalisations of “returns to real-world capability” that show different results).
The obvious way to make money from beliefs in propositions is by bets of some form. One way to place a bet and reliably profit is insurance. (Insurance is particularly attractive because in practice, it scales to arbitrary confidence and arbitrary returns/capital).
Suppose that you sell an insurance policy for event , and for each prospective client , you have a credence that would not occur to them . Suppose also that you sell your policy at .
At a credence of , you cannot sell your policy for . At a price of and given the credence of , your expected returns will be for that customer. Assume the given customer is willing to pay at most for the policy.
If your credence in not happening increased, how would your expected returns change? This is the question we are trying to investigate to estimate real-world capability gains from increased predictive accuracy.
The results are below:
As you can see, the marginal returns from linear increases in predictive accuracy are given by the below sequence:
(This construction could be extended to other kinds of bets, and I would expect the result to generalise [modulo some minor adjustments] to cross-domain predictive ability. A shortform [this started as a shortform but ended up crossing the 1,000 words barrier, so I moved it] is not the place for such elaboration.)
Thus returns to real-world capability of increased predictive accuracy are sharply diminishing.
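To make the shape of this claim concrete, here is a minimal sketch of one simplified insurance model. Everything specific in it is an assumption for illustration rather than a figure from this post: the premium is fixed at the most the customer will pay, the payout is a fixed amount, the credence is taken to be well calibrated, and the names `expected_profit`, `PREMIUM` and `PAYOUT` are hypothetical.

```python
def expected_profit(credence_no_event: float, premium: float, payout: float) -> float:
    """Expected profit per policy: premium collected minus the expected claim."""
    return premium - (1.0 - credence_no_event) * payout


PREMIUM = 1.0   # hypothetical: the most the customer is willing to pay
PAYOUT = 10.0   # hypothetical: amount paid out if the insured event occurs

# Each step adds one more "nine" of confidence that the event will not occur.
credences = [0.9, 0.99, 0.999, 0.9999]
profits = [expected_profit(c, PREMIUM, PAYOUT) for c in credences]

# Marginal gain in expected profit from each step up in credence.
gains = [b - a for a, b in zip(profits, profits[1:])]

for c, p in zip(credences, profits):
    print(f"credence {c}: expected profit {p:.4f}")
print("marginal gains:", [round(g, 4) for g in gains])
```

In this toy model expected profit can never exceed the premium, and each equal step in log-accuracy yields roughly a tenth of the previous step's gain. Other pricing assumptions would change the numbers, though a capped premium tends to produce the same qualitative pattern.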
One prominent class of scenario in which the above statement is false is when competing against other agents. In competitions with “winner take all” dynamics, better predictive accuracy has more graceful returns. I’ll need to think further about various multi-agent scenarios and model them better and in more detail.
Marginal Returns to Real World Capabilities From Other Cognitive Capabilities
Of course, predictive accuracy is just one aspect of intelligence; there are many others:
Other symbolic reasoning
Broad pattern matching
And we’d want to investigate the relationship for aggregate cognitive capabilities/“general intelligence”. The example I illustrated earlier merely sought to demonstrate how returns to real-world capability could be “sharply diminishing”.
Marginal Returns to Cognitive Capabilities
Another inquiry that’s important to determining what intelligence is actually capable of is the marginal returns to investment of cognitive capabilities towards raising cognitive capabilities.
That is, if an agent were improving its own cognitive architecture (recursive self-improvement) or designing successor agents, how would the marginal increase in cognitive capabilities across each generation behave? What function characterises it?
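As a way of framing the question rather than answering it, the sketch below iterates three hypothetical return functions over successive generations. None of them is claimed to describe real systems; the functions, the starting capability of 1.0, and the name `trajectory` are all assumptions for illustration.

```python
from typing import Callable, List


def trajectory(step: Callable[[float], float], start: float, generations: int) -> List[float]:
    """Capability of each generation, where step(c) is the gain that a
    generation with capability c passes on to its successor."""
    levels = [start]
    for _ in range(generations):
        levels.append(levels[-1] + step(levels[-1]))
    return levels


# Three hypothetical return functions; which (if any) is realistic is the open question.
def diminishing(c: float) -> float:
    return 1.0 / c      # gains shrink as capability grows


def constant(c: float) -> float:
    return 1.0          # every generation adds the same amount


def compounding(c: float) -> float:
    return 0.5 * c      # gains grow in proportion to capability


for name, step in [("diminishing", diminishing), ("constant", constant), ("compounding", compounding)]:
    levels = trajectory(step, start=1.0, generations=10)
    print(f"{name:12s} final capability: {levels[-1]:.2f}")
```

The three curves diverge quickly, which is why the form of the return function matters so much for whether recursive self-improvement amounts to an easy bootstrap.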
Marginal Returns of Computational Resources
This isn’t even talking about the nature of marginal returns to predictive accuracy from the addition of extra computational resources.
By “computational resources” I mean the following:
An aggregation of all of them
That could further bound how much capability you can purchase with the investment of additional economic resources. If those also diminish “significantly” or “sharply”, the situation becomes that much bleaker.
Marginal Returns to Cognitive Reinvestment
The other avenue to raising cognitive capabilities is the investment of cognitive capabilities themselves, as seen when designing successor agents or via recursive self-improvement.
We’d also want to investigate the marginal returns to cognitive reinvestment.
My Current Thoughts
Currently, I think strongly superhuman intelligence would require herculean effort. I am much less confident that bootstrapping to ASI would necessarily be as easy as recursive self-improvement or “scaling” up the expenditure of computational resources. I’m unconvinced that a hardware overhang would be sufficient (marginal returns may diminish too fast for it to be sufficient).
I currently expect marginal returns to real-world capability will diminish significantly or sharply for many cognitive capabilities (and the aggregate of them) across some “relevant cognitive intervals”.
I suspect that the same will prove to be true for marginal returns to cognitive capabilities of investing computational resources or other cognitive capabilities.
I don’t plan to rely on my suspicions and would want to investigate these issues at extensive depth (I’m currently planning to pursue a Master’s and afterwards a PhD, and these are the kinds of questions I’d like to research when I do so).
By “relevant cognitive intervals”, I am gesturing at the range of general cognitive capabilities an agent might belong in.
Humans being the only examples of general intelligence we are aware of, I’ll use them as a yardstick.
Some “relevant cognitive intervals” that seem particularly pertinent:
Subhuman to near-human
Near-human to beginner human
Beginner human to median human professional
Median human professional to expert human
Expert human to superhuman
Superhuman to strongly superhuman
Conclusions and Next Steps
The following questions:
Marginal returns to cognitive capabilities from the investment of computational resources
Marginal returns to cognitive capabilities from:
Larger cognitive engines
E.g. a brain with more synapses
E.g. an LLM with more parameters
Faster cognitive engines
E.g. Transformers vs LSTM
Coordination with other cognitive engines
A “hive mind”
Marginal returns to computational resources from economic investment
How much more computational resources can you buy by throwing more money directly at computational resources?
What’s the behaviour of the cost curves of computational resources?
Economies of scale
How much more computational resources can you buy by throwing more money at cognitive labour to acquire computational resources?
How does investment of cognitive labour affect the cost curves of computational resources?
Marginal returns to real world capabilities from the investment of cognitive capabilities
Marginal returns to real world capabilities from investment of economic resources
(Across the cognitive intervals of interest).
Are questions I plan to investigate in depth in the future.