Short timelines, slow takeoff vs. Long timelines, fast takeoff
Because chain-of-thought in the current paradigm seems like great news for AI safety, some people seem to have the following expectations:
Short timelines: CoT reduces risks, but shorter preparation time increases the odds of catastrophe.
Long timelines: the current paradigm is not enough; therefore, CoT may stop being relevant, which may increase the odds of catastrophe. We have more time to prepare (which is good), but we may get a faster takeoff than the current paradigm suggests. A discontinuous takeoff may therefore introduce significantly more risk despite the longer timelines.
So, perhaps counterintuitively for some, you could have these two groups:
1. Slow (smooth, non-discontinuous) takeoff, low p(doom), takeoff happens in the next couple of years. [People newer to AI safety seem more likely to expect this imo]
Vs.
2. Fast takeoff (discontinuous capability increase w.r.t. time), high p(doom), (actual) takeoff happens in 8-10 years. [seems more common in the MIRI / traditional AI safety researcher cluster]
I’m not saying those are the only two groups, but I think the split speaks to how some people are feeling about the current state of progress and safety.
As a result, I think it’s pretty important to gain better clarity on whether we expect the current paradigm to scale without fundamental changes, and, if not, to understand what would come after it and how it would change the risks.
That’s not to say we shouldn’t weigh short timelines more highly due to being more immediate, but there are multiple terms to weigh here.
I agree with this analysis. I think we probably have somewhat better odds on the current path, particularly if we can hold onto faithful CoT.
I expect one major change, continuous learning, but CoT and most of the LLM paradigm may stay pretty similar up to takeoff.
Steve Byrnes’ Foom & Doom is a pretty good guess at what we get if the current paradigm doesn’t pan out.

Longer timelines (10+ years) might also give time for a significant shift in the attitudes to AI danger, so that the decisions that currently seem implausible end up somewhat likely (which probably improves the prospects, given the status quo). And when AI reaches a takeoff threshold (accumulating some combination of essential cognitive faculties and the scale at which they are expressed), it doesn’t necessarily immediately escalate capabilities to superintelligence. It might instead finally become capable enough to convincingly communicate AI danger to decision makers, and help with (or pointedly insist on) implementing a lasting ban/pause on superintelligence until at least someone (including the AI) knows what they are doing.
time for a significant shift in the attitudes to AI danger
This is true, but there’s a countervailing force of just general, distributed, algorithmic AGI capabilities progress. One of the widely cited reasons for believing in short timelines in the first place is “lots of researchers are working on capabilities, so even if they hit some bottleneck, they’ll adapt and look for other paradigms / insights”. I don’t think this is actually valid for short timelines, because switching, and the motivation for switching, can take some years; but it is definitely valid on slightly longer time scales (fives / tens of years). Progress that’s more about algorithms than chips is harder to regulate, legally and probably also by social norms (because it’s less legible). So even as attitudes become more anti-AGI-capabilities, enacting that becomes harder and harder, at least in some big ways.
Things like gradual disempowerment can be prevented with a broad shift in attitudes alone, and years of a superintelligence-is-really-dangerous attitude (without a takeoff) might be sufficient to get rid of a lot of dangerous compute and its precursors, giving yet more time to make the ban/pause more robust. For example, even as people are talking about banning large datacenters, there is almost no mention of banning the advanced semiconductor processes that manufacture the chips, or of banning research into such manufacturing processes. There is a lot that can be done given a very different attitude, and attitudes don’t change quickly, but they can change a lot given enough decades.
The premise of 10+ years without takeoff already gives a decent chance of another 10+ years without takeoff (especially as compute will only keep scaling until ~2030, and then the amount of fuel for exploring algorithmic ideas won’t keep growing as rapidly). So while algorithmic knowledge keeps getting more hazardous, in the world without a takeoff in 10+ years it’s still plausibly not catastrophic for some years after that, and those years could then be used for reducing the other inputs to premature superintelligence.
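(A toy way to see why a takeoff-free decade shifts the odds, offered as an illustrative sketch under a deliberately simple assumption, not anyone’s actual forecast: model time-to-takeoff as exponential with unknown rate $\lambda$ and a flat prior over $\lambda$. Then

$$P(T > t + s \mid T > t) = \frac{\int_0^\infty e^{-\lambda (t+s)}\,d\lambda}{\int_0^\infty e^{-\lambda t}\,d\lambda} = \frac{t}{t+s},$$

so ten takeoff-free years give roughly even odds of another ten.)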
(I agree qualitatively, and not sure whether we disagree, but:)
those years could then be used for reducing the other inputs to premature superintelligence.
Basically I’m saying that there are inputs to premature ASI which
are harder to reduce, maybe much harder, compared to anything about chips;
these inputs will naturally be more invested in as chips are regulated / stagnated;
these inputs are plausibly/probably enough to get ASI even with current compute levels;
therefore you don’t obviously reach “actuarial escape velocity for postponing AGI” (which maybe you weren’t especially claiming), just by getting a good 10-year delay / a good increase in pro-delay attitudes.
Research is also downstream of attitudes: from what I understand, there is more than enough equipment and more than enough qualified professionals to engineer deadly pandemics, but almost none of them are working on that. And it might take at least decades to get from a design for an ASI that bootstraps on a 5 GW datacenter campus, to an ASI that bootstraps on an antique server rack.
Totally, yeah. It’s just that:
It’s logistically easier to do algorithms research compared to pandemic research;
therefore it’s logistically harder to regulate;
whereas, at least historically, to get access to bio equipment and to expert-metis, you have to be in cultural contact with experts, who have a network-consensus against making pandemics;
but AI stuff has much less of that gating, so it’s more free-for-all;
and so you’d need a much broader / stronger cultural consensus against that for it to actually work at preventing the progress.
And it might take at least decades to get from a design for an ASI that bootstraps on a 5 GW datacenter campus, to an ASI that bootstraps on an antique server rack.
I guess it might but I really super wouldn’t bank on it. Stuff can be optimized a lot.
to get access to bio equipment and to expert-metis, you have to be in cultural contact with experts, who have a network-consensus against making pandemics
The consensus might be mostly sufficient, without it needing to gate access to means of production. I’d guess approximately nobody is trying to route around that network-consensus to get at pandemic-enabling equipment, because the network-consensus by itself makes such people dramatically less likely to appear, as a matter of cultural influence (and of the arguments for this being a terrible idea making sense on their own merits) rather than any hard power or regulation.
So my point is the hypothetical of shifting cultural consensus, with regulation and restrictions on compute merely downstream of that, rather than the hypothetical of shifting regulations, restricting compute, and motivating people to route around the restrictions. In this hypothetical, the restrictions on compute are one of the effects of a consensus of extreme caution towards ASI, rather than a central way in which that caution is effected.
But I do think ASI in an antique Nvidia Rubin Ultra NVL576 rack (rather than in the then-modern datacenters built on 180 nm technology) is a very difficult thing to achieve for inventors working in secret from a scientific community that frowns on anyone suspected of working on this, with funding of such work being essentially illegal and new papers on the topic only findable on the dark web.
Ok I think I agree qualitatively with almost everything you say (except the thing about compute mattering so much in the longer run). I especially agree (IIUC what you’re saying) that a top priority / best upstream intervention is the cultural attitudes. Basically my pushback / nuance is “the cultural consensus has a harder challenge compared to e.g. pandemic stuff, so the successful example of pandemic stuff doesn’t necessarily argue that strongly that the consensus can work for AI in the longer run”. In other words, while I agree qualitatively with
The consensus might be mostly sufficient [in the case of AI]
, I’m also suggesting that it’s quantitatively harder to have the consensus do this work.
It’s a rather absurd hypothetical to begin with, so I don’t have a clear sense of how the more realistic variants of it would go. It gestures qualitatively at how longer timelines might help a lot in principle, but it’s unclear where the balance with other factors ends up in practice, if the cultural dynamic appears at all (which I think it might).
That is, the hypothetical illustrates how I don’t see longer timelines as robustly/predictably mostly hopeless, and how they don’t necessarily get more hopeless over time, though I wouldn’t give such Butlerian Jihad outcomes (even in a much milder form) more than 10%. I think AGIs seriously attempting to prevent premature ASIs (in fear for their own safety) is more likely than humanity putting in a serious effort towards that on its own initiative; but also, if AGIs succeed, that’s likely because they’ve essentially themselves taken over (probably via gradual disempowerment, since a hard-power takeover would be more difficult for non-ASIs, and there’s time for gradual disempowerment in a long-timeline world).
especially as compute will only keep scaling until ~2030, and then the amount of fuel for exploring algorithmic ideas won’t keep growing as rapidly
Technical flag: compute scaling will slow down to the historical Moore’s law trend plus historical fab buildout times; it won’t completely stop. That means it’ll go down from about 3.5x per year to about 1.55x per year, but yes, this does take some wind out of the sails of algorithmic progress (though it’s helpful to note that even post-LLM scaling, we’ll be able to simulate human brains passably by the late 2030s, speeding up progress to AGI).
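As a rough back-of-the-envelope on what that slowdown implies, using only the two growth rates quoted above (illustrative compounding, not a forecast):

```python
# Compounded compute growth over a decade at the two quoted yearly rates:
# 3.5x/year (the current scaling trend) vs 1.55x/year (the slower trend
# mentioned above). Purely illustrative arithmetic.
for rate in (3.5, 1.55):
    total = rate ** 10  # compounded over 10 years
    print(f"{rate}x/year for 10 years -> ~{total:,.0f}x total growth")
# 3.5x/year compounds to roughly 280,000x per decade, while 1.55x/year
# compounds to only ~80x, a gap of about 3.5 orders of magnitude.
```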