Hmm, mulling over this a bit more. (spends 20 minutes)
Two tldrs:
tldr#1: clarifying question for Paul: Do you see a strong distinction between growth in capabilities shaped like a hyperbolic hockey stick, and discontinuous growth? (I don’t currently see that strong a distinction between them; a sketch of what I mean follows right after these tldrs.)
tldr#2: The world that seems most likely to me, and also least “takeoff like” (or at least the one that moves me most toward looking for other ways to think about it), is a world where we get a process that can design better AGI (which may or may not itself be an AGI), but which does not have general consequentialism or arbitrary learning.
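To make the tldr#1 question concrete, here’s a rough sketch of the two shapes I have in mind (the particular functional forms are just my own illustrative assumptions, not anything Paul has committed to):

$$x_{\text{hyperbolic}}(t) = \frac{c}{T - t} \;\; (t < T) \qquad \text{vs.} \qquad x_{\text{jump}}(t) = \begin{cases} f(t) & t < t^{*} \\ f(t) + \Delta & t \ge t^{*} \end{cases}$$

The hyperbola is continuous everywhere, but its growth rate $c/(T-t)^2$ blows up as $t \to T$, so over any fixed observation window near $T$ the change dwarfs everything that came before. From the outside that seems hard to distinguish from an actual jump of size $\Delta$, which is the sense in which I don’t see a strong distinction.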
More meandering background thoughts below; not sure if they’re legible or persuasive, since it’s 4am.
Robby:
I think that e.g. MIRI/Christiano disagreements are less about whether “months” versus “years” is the right timeframe, and more about things like: “Before we get AGI, will we have proto-AGI that’s nearly as good as AGI in all strategically relevant capabilities?”
Assuming that’s accurate, looking at it a second time crystallized some things for me.
And also Robby’s description of “what seems strategically relevant”:
More relevant thresholds on my view are things like “is it an AGI yet? can it, e.g., match the technical abilities of an average human engineer in at least one rich, messy real-world scientific area?” and “is it strong enough to prevent any competing AGI systems from being deployed in the future?”
I’m assuming the “match technical abilities” thing is referencing something like “the beginning of a takeoff” (or at least something that 2012 Bostrom would have called a takeoff?) and the “prevent competitors” thing is the equivalent of “takeoff is complete, for most intents and purposes.”
I agree those are better thresholds than “human” and “superhuman”.
But looking at the nuts and bolts of what might cause those thresholds, the feats that seem most likely to produce a sharp takeoff (“sharp” meaning the rate of change increases after these capabilities exist in the world; I’m not sure if this is meaningfully distinct from a hyperbolic curve) are:
#1: general consequentialist behavior
#2: arbitrary learning capability (possibly by spinning up subsystems that learn for it; I don’t think that distinction matters much)
#3: the ability to do AGI design
(not sure whether #2 can be meaningfully split from #1, and I doubt they would be in practice)
These three are the combo that seems, to me, better modeled as something different from “the economy just doing its thing, but acceleratingly”.
And one range of things-that-could-happen is: do we get #1, #2, and #3 together? What happens if we just get #1 or #2? What happens if we just get #3?
If we get #1, and it’s allowed to run unfettered, I expect that process would try to gain properties #2 and #3.
But upon reflection, a world where we get property #3 without #1 and #2 seems fairly qualitatively different, and is the world that looks, to me, more like “progress accelerates, but in the form of various organizations building things in a way best modeled as an accelerating economy.”
Paul (quoting “These three are the combo that seems, to me, better modeled as something different from ‘the economy just doing its thing, but acceleratingly’”):
I don’t see this.
And why is “arbitrary learning capacity” a discrete thing? I’d think the important thing is that future systems will learn radically faster than current systems and be able to learn more complex things, but still won’t learn infinitely faster or be able to learn arbitrarily complex things (in the same ways that humans can’t). Why wouldn’t these parameters increase gradually?
A thought: you’ve been using the phrase “slow takeoff” to distinguish your model from the MIRI-ish model, but I think the relevant distinction is more like “smooth takeoff vs. sharp takeoff” (where the shape of the curve changes at some point).
But your other comment, plus Robby’s, has me convinced that the key disagreement doesn’t have anything to do with smooth vs. sharp takeoff either; it just happens to be a point of disagreement without being an important one.
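For concreteness, here’s a toy numerical sketch of the smooth vs. sharp distinction I have in mind. The growth rates and the threshold year are made-up illustrative numbers, not a model of anything real:

```python
# Toy illustration of "smooth" vs. "sharp" takeoff curves.
# All numbers are made up; the only point is the shape: in the sharp case the
# growth rate itself jumps once the key capabilities exist.

import math

GROWTH_BEFORE = 0.5   # per-year growth rate before any threshold (arbitrary)
GROWTH_AFTER = 2.0    # per-year growth rate after the threshold (sharp case only)
THRESHOLD_YEAR = 6.0  # hypothetical year at which the key capabilities exist

def smooth(t):
    """Steadily accelerating progress: one exponential the whole way through."""
    return math.exp(GROWTH_BEFORE * t)

def sharp(t):
    """Same curve up to the threshold, then the rate of change itself increases."""
    if t < THRESHOLD_YEAR:
        return math.exp(GROWTH_BEFORE * t)
    return math.exp(GROWTH_BEFORE * THRESHOLD_YEAR) * math.exp(GROWTH_AFTER * (t - THRESHOLD_YEAR))

if __name__ == "__main__":
    for t in [0, 2, 4, 6, 7, 8]:
        print(f"year {t}: smooth = {smooth(t):10.1f}   sharp = {sharp(t):12.1f}")
```

The point is just that the sharp curve has a kink where its slope (on a log scale) changes, whereas the smooth curve looks the same before and after any given date; whether that kink matters is exactly the thing I’m now less sure is the crux.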
Not sure if this is part of the confusion/disagreement, but by “arbitrary” I mean “able to learn ‘anything’” as opposed to “able to learn everything arbitrarily fast/well.” (i.e. instead of systems tailored to learn specific things like we have today, a system that can look at the domains that it might want to learn, choose which of those domains are most strategically relevant, and then learn whichever ones seem highest priority)
(The thing clearly needs to be better than a chimp at general-purpose learning. It’s not obvious to me that it needs any particular equivalent IQ for this to start changing the nature of technological progress, but it probably needs to be at least equivalent to IQ 80, and maybe IQ 100, in at least some domains before it transitions from ‘cute science fair project’ to ‘industry-relevant’.)