Christiano’s model of progress, AFAIU, can be summarized as: “When only a few people work in a field, big jumps in progress are possible. When many people work in a field, the low hanging fruits are picked quickly and then progress is smooth.”
The problem with this model is that its predictions depend a lot on how you draw the boundary around "field". Take Yudkowsky's example of startups. How do we explain small startups succeeding where large companies failed? And it's not a lack of economic incentives, since successful startups sometimes make huge profits (often by getting acquired by the large companies); so much so that there's an entire industry of investing in startups.
I’m guessing that a proponent of Christiano’s theory would say: sure, such-and-such startup succeeded but it was because they were the only ones working on problem P, so problem P was an uncrowded field at the time. Okay, but why do we draw the boundary around P rather than around “software” or around something in between which was crowded? To take another example, heavier-than-air flight was arguably uncrowded when the Wright brothers came along, but flight-in-general (including balloons and airships) was somewhat crowded.
In a hypothetical Yudkowsky-style fast takeoff scenario, an imaginary post-singularity proponent of Christiano’s theory could argue: Sure, AI was a crowded field, but this is consistent with the big jump because TAI was achieved using Technology X, which was an innovation created by that small group, and the field of “Technology X” was not crowded. A possible counterargument is: TAI will have to be good at science (or maybe some other word instead of “science”), and people are already trying to do science using AI (e.g. AlphaFold, the knot theory thing, the quantum chemistry thing). However, this is still begging the question: why draw the line around “science AI” rather than around something else?
Perhaps proponents of Christiano’s theory just have independent, object-level reasons to believe that there will be no “Technology X”, that “more of the same” is already sufficient to get to TAI. In this case there might be no natural place to draw the boundary which will be uncrowded. Here I have my own object-level reasons for skepticism, but maybe it’s unwise to take the discussion there: as Yudkowsky pointed out, this can be net negative. However, I still ask: whence the (seemingly high) confidence? Maybe more-of-the-same is enough for TAI, but maybe not? (And even if it’s theoretically enough, maybe it won’t get there first.)
EDIT: I anticipate an objection to the "Technology X" scenario along the lines of: a small group cannot create something with rapid global impact, because, presumably, the level of impact of any innovation is bounded in some predictable way by the economic investment behind it. To this I would have two counter-objections:
First, imagine that Technology X didn’t create TAI immediately but did lead to a kink comparable to the start of the deep learning revolution (i.e. TAI was eventually created as a result of much investment in Technology X, on a timeline significantly faster than the extrapolation of the pre-X trend). On the one hand, Christiano appears to believe this is still very unlikely. On the other hand, it can still be excused in hindsight by the reference class tennis I described.
Second, I am skeptical of the methodology. All of us here agree that AI poses unique risks compared to past technologies, so why can we extrapolate from the past in that way? Imagine that we lived in a universe in which it was plausible that the LHC creates a black hole or causes false vacuum collapse. It seems to me that such a universe could still have a techno-economic trajectory broadly similar to our own, for the same reasons. So, in that universe, would it make sense to argue "the LHC cannot destroy the world because its cost is an insufficient fraction of world GDP[1]"? It seems to me that it would be strange there in a similar way to how the economic argument about AI is strange here.
[1] The LHC is an expensive project, but is it expensive enough to destroy the world? How can we tell? Is this really a sensible way to analyze this compared to thinking about the actual physics?
I’m guessing that a proponent of Christiano’s theory would say: sure, such-and-such startup succeeded but it was because they were the only ones working on problem P, so problem P was an uncrowded field at the time. Okay, but why do we draw the boundary around P rather than around “software” or around something in between which was crowded?
I’d make a different reply: you need to not just look at the winning startup, but all startups. If it’s the case that the ‘startup ecosystem’ is earning 100% returns and the rest of the economy is earning 5% returns, then something weird is up and the model is falsified, but if the startup ecosystem is earning 10% returns once you average together the few successes and many failures, then this looks more like a normal risk-return story.
Furthermore, there’s something interesting where the modern startup economy feels much more like the Paulian ‘concentration of power’ story than the Yudkowskian ‘wisdom of USENET’ story; teams that make video games might be able to turn a handful of people into tens of millions of dollars in revenue (or billions in an extreme case), but teams that make self-driving cars mostly have to be able to tell a story about being able to turn billions of investor dollars into teams of engineers and mountains of hardware that will then be able to produce the self-driving cars, with the race between companies not being “who has a better product already” but “who can better acquire the means to create a better product.”
I’m pretty sympathetic to the view that the first transformative intelligence will look more like a breakout indie game than a AAA title because there’s some new ‘gimmick’ that can be made by a small team and that has an outsized impact on usefulness. But it seems important to note that a lot of the economy doesn’t work that way, even lots of the ‘build speculative future tech’ part!
I don’t see what it has to do with risk-return. Sure, many startups fail. And, plausibly many people tried to build an airplane and failed before the Wright brothers. And, many people keep trying building AGI and failing. This doesn’t mean there won’t be kinks in AI progress or even a TAI created by a small group.
Saying that “the subjective expected value of AI progress over time is a smooth curve” is a very different proposition from “the actual AI progress over time will be a smooth curve”.
My line of argument here is not trying to prove a particular story about AI progress (e.g. “TAI will be similar to a startup”) but to push back against (/ voice my confusions about) the confidence level of predictions made by Christiano’s model.
My line of argument here is not trying to prove a particular story about AI progress (e.g. “TAI will be similar to a startup”) but to push back against (/ voice my confusions about) the confidence level of predictions made by Christiano’s model.
What is the confidence level of predictions you are pushing back against? I’m at like 30% on fast takeoff in the sense of “1 year doubling without preceding 4 year doubling” (a threshold roughly set to break any plausible quantitative historical precedent; EDIT: rather, a threshold intended to be faster than historical precedent but that’s probably similar to the agricultural revolution sped up 10,000x). I’m at maybe 10-20% on the kind of crazier world Eliezer imagines.
Is that a high level of confidence? I’m not sure I would be able to spread my probability in a way that felt unconfident (to me) without giving probabilities that low to lots of particular ways the future could be crazy. E.g. 10-20% is similar to the probability I put on other crazy-feeling possibilities like no singularity at all, rapid GDP acceleration with only moderate cognitive automation, or singleton that arrests economic growth before we get to 4 year doubling times...
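For concreteness, here's a minimal checker for the "1 year doubling without preceding 4 year doubling" criterion; the trajectory below is hypothetical, not a real GWP series:

```python
# Sketch of the takeoff criterion: does output double within 1 year before
# it has ever doubled within any 4-year window? Toy numbers throughout.

def first_doubling_year(gwp, window):
    """First year y with gwp[y] >= 2 * gwp[y - window], else None."""
    for y in sorted(gwp):
        if y - window in gwp and gwp[y] >= 2 * gwp[y - window]:
            return y
    return None

def fast_takeoff(gwp):
    """True iff a 1-year doubling occurs and no 4-year doubling precedes it."""
    fast = first_doubling_year(gwp, 1)
    slow = first_doubling_year(gwp, 4)
    return fast is not None and (slow is None or fast <= slow)

# Hypothetical trajectory: 3%/yr growth for four decades, then a sudden jump.
gwp = {y: 100 * 1.03 ** (y - 2000) for y in range(2000, 2041)}
gwp[2041] = 2.1 * gwp[2040]
print(fast_takeoff(gwp))  # True: no 4-year doubling precedes the 1-year one
```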
I’m at like 30% on fast takeoff in the sense of “1 year doubling without preceding 4 year doubling” (a threshold roughly set to break any plausible quantitative historical precedent).
Huh, AI Impacts looked at one dataset of GWP (taken from Wikipedia, in turn taken from here) and found 2 precedents for “x year doubling without preceding 4x year doubling”, roughly during the agricultural revolution. The dataset seems to be a combination of lots of different papers’ estimates of human population, plus an assumption of ~constant GWP/capita early in history.
Yeah, I think this was wrong. I’m somewhat skeptical of the numbers and suspect future revisions will systematically soften those accelerations, but 4x still won’t look that crazy.
(I don’t remember exactly how I chose that number but it probably involved looking at the same time series so wasn’t designed to be much more abrupt.)
My view is:

1. If X is obviously very valuable, many people will work on achieving X (potentially including lots of different approaches they see to achieving X).
2. In fields with lots of spending and impact, most technological progress is made gradually rather than abruptly, for most ways of measuring.
3. We haven’t said very much at all about why AI should be one of the exceptions (compare to the situation when discussing nuclear weapons, where you can make fantastic arguments about why it would be different). Eliezer’s argument about criticality seems to me to just not work unless one rejects 1+2 altogether (unlike nuclear weapons).
4. Your arguments about technology X apply to any other technological goal—better fusion reactors or solar panels, more generally cheaper energy, rockets, semiconductors, whatever. So it seems like they should be visible in the base rate for 2. Do you think that a significant fraction of technological progress is abrupt and unpredictable in the sense that you are saying TAI will probably be?
First, imagine that Technology X didn’t create TAI immediately but did lead to a kink comparable to the start of the deep learning revolution (i.e. TAI was eventually created as a result of much investment in Technology X, on a timeline significantly faster than the extrapolation of the pre-X trend). On the one hand, Christiano appears to believe this is still very unlikely. On the other hand, it can still be excused in hindsight by the reference class tennis I described.
I don’t know exactly what you are responding to here. I have some best guesses about what progress will look like, but they are pretty separate from the broader heuristic. And I’m not sure this is a fair representation of my actual view, within 20 years I think it’s reasonably likely that AI will look fairly different, on a scale of 5 years that seems kind of unlikely.
Second, I am skeptical of the methodology. All of us here agree that AI poses unique risks compared to past technologies, so why can we extrapolate from the past in that way? Imagine that we lived in a universe in which it was plausible that the LHC creates a black hole or causes false vacuum collapse. It seems to me that such a universe could still have a techno-economic trajectory broadly similar to our own, for the same reasons. So, in that universe, would it make sense to argue "the LHC cannot destroy the world because its cost is an insufficient fraction of world GDP[1]"? It seems to me that it would be strange there in a similar way to how the economic argument about AI is strange here.
I’m predicting that the performance of AI systems will grow relatively continuously and predictably, not that AI isn’t risky or even that risk will emerge gradually.
Take Yudkowsky’s example of startups. How do we explain small startups succeed where large companies failed?
I think it’s pretty unclear how this bears on the general schema above. But I’m happy to consider particular examples of startups that made rapid/unpredictable progress towards particular technological goals (perhaps by pursuing a new approach), since those are the kind of thing I’m predicting are rare. They sure look rare to me (i.e. are responsible for a very small share of total technological progress regardless of how you measure).
In a hypothetical Yudkowsky-style fast takeoff scenario, an imaginary post-singularity proponent of Christiano’s theory could argue: Sure, AI was a crowded field, but this is consistent with the big jump because TAI was achieved using Technology X, which was an innovation created by that small group, and the field of “Technology X” was not crowded.
I’m surprised by an uncrowded technology surprisingly and rapidly overtaking the state of the art for a task people care a lot about, even at a super abstract or broad level like “make money.” Happens sometimes but it’s a minority of progress.
In fields with lots of spending and impact, most technological progress is made gradually rather than abruptly, for most ways of measuring...
Your arguments about technology X apply to any other technological goal—better fusion reactors or solar panels, more generally cheaper energy, rockets, semiconductors, whatever. So it seems like they should be visible in the base rate for 2. Do you think that a significant fraction of technological progress is abrupt and unpredictable in the sense that you are saying TAI will probably be?
I think that you can roughly divide progress into “qualitatively new ideas” (QNI) and “incremental improvement of existing technology” (ofc in reality it’s a spectrum). The first kind is much less predictable than the second kind. Now, when a QNI comes along, it doesn’t necessarily look like a discontinuity, because there might be a lot of work to bridge the distance between idea and implementation. And, this work involves a lot of small details. Because of this, the first version is probably often only a slight improvement on SOTA. So, I’m guessing that QNIs produce something more like a discontinuity in the derivative than a discontinuity in the SOTA itself.
Under this model, most progress is “gradual” in the sense that at most points the graph is differentiable. But (i) it doesn’t work too well to extrapolate trends across QNI points and (ii) the counterfactual impact of QNIs is a large fraction of progress.
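As a toy illustration of this (made-up parameters, not a forecast): the SOTA curve below never jumps, but a QNI at t=10 changes its slope, so extrapolating the pre-QNI trend badly underestimates later progress.

```python
import math

# Toy model: a QNI arrives at T0. The new approach starts at parity with the
# old one (SOTA stays continuous) but grows faster afterwards, producing a
# discontinuity in the derivative rather than in SOTA itself.

OLD_RATE, NEW_RATE, T0 = 0.05, 0.25, 10  # illustrative growth rates

def sota(t):
    if t <= T0:
        return math.exp(OLD_RATE * t)
    return math.exp(OLD_RATE * T0) * math.exp(NEW_RATE * (t - T0))

def pre_qni_trend(t):
    return math.exp(OLD_RATE * t)

for t in (5, 10, 15, 20):
    print(t, round(sota(t), 2), round(pre_qni_trend(t), 2))
# At t=20 actual SOTA is e^3 ≈ 20.1 while the pre-QNI extrapolation gives
# e^1 ≈ 2.7, a ~7.4x shortfall, even though the SOTA curve never jumped.
```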
Certainly I don’t see fusion reactors, solar panels or (use in electronics of) semiconductors as counterexamples, since each of these was invented at some point, and didn’t gradually evolve from some completely different technology.
Another factor is that in software the distance between idea and implementation tends to be smaller, because processors and operating systems abstract away much of the detail for you. I think this is partly responsible for “startups” being more of a thing in software than in other fields (ofc another part of it is that software is just newer, with more low-hanging fruit). And, this probably makes progress in software less smooth.
For unaligned TAI, the effective distance might be shorter still. Because, for most software there’s a fair amount of work going towards UI and/or integration with other software. But, with TAI this can be unnecessary. Moreover, the alignment problem itself is part of the would-be idea-to-profitable-implementation gap. On the other hand, optimizing performance can also be a large part of idea-to-implementation, and if the first AGI is e.g. a drastically slow superintelligence, this can be compatible with a slow takeoff.
Certainly I don’t see fusion reactors, solar panels or (use in electronics of) semiconductors as counterexamples, since each of these was invented at some point, and didn’t gradually evolve from some completely different technology.
Your definition of “discontinuity” seems broadly compatible with my view of the future then. Definitely there are different technologies that are not all outgrowths of one another.
My main point of divergence is:
Now, when a QNI comes along, it doesn’t necessarily look like a discontinuity, because there might be a lot of work to bridge the distance between idea and implementation. And, this work involves a lot of small details. Because of this, the first version is probably often only a slight improvement on SOTA.
I think that most of the time when a QNI comes along it is worse than the previous thing and takes work to bring up to the level of the previous thing. In small areas no one pays attention until it overtakes SOTA, but in big areas people usually start paying attention (and investing a significant fraction of the prior SOTA’s size) well before the cross-over point. This seems true for solar or fusion, or digital computers, or deep learning for that matter, or self-driving cars or early automobiles.
If that’s right, then you are looking at two continuous curves and you can think about when they cross and you usually start to get a lot of data before the crossover point. And indeed this is obviously how I’m thinking about technologies like deep learning, which are currently useless for virtually all tasks but which I expect to relatively soon overtake alternatives (like humans and other software) in a huge range of very important domains.
And if some other AI technology replaces deep learning, I generally expect the same story. There is a scale at which new things can burst onto the scene, but over time that scale becomes smaller and smaller relative to the scale of the field. At this point the appearance of “bursting onto the scene” is primarily driven by big private projects that don’t talk publicly about what they are doing for a while (e.g. putting in 20 person-years of effort before a public announcement, so that they get data internally but an outsider just sees a discontinuity), but even that seems to be drying up fairly quickly.
I’m not sure what’s the difference between what you’re saying here and what I said about QNIs. Is it that you expect being able to see the emergent technology before the singular (crossover) point? Actually, the fact you describe DL as “currently useless” makes me think we should be talking about progress as a function of two variables: time and “maturity”, where maturity inhabits, roughly speaking, a scale from “theoretical idea” to “proof of concept” to “beats SOTA in lab conditions” to “commercial product”. In this sense, the “lab progress” curve is already past the DL singularity but the “commercial progress” curve maybe isn’t.
On this model, if post-DL AI technology X appears tomorrow, it will take it some time to span the distance from “theoretical idea” to “commercial product”, in which time we would notice it and update our predictions accordingly. But, two things to note here:
First, it’s not clear which level of maturity is the relevant reference point for AI risk. In particular, I don’t think you need commercial levels of maturity for AI technology to become risky, for the reasons I discussed in my previous comment (and, we can also add regulatory barriers to that list, although I am not convinced they are as important as Yudkowsky seems to believe).
Second, all this doesn’t sound to me like “AI systems will grow relatively continuously and predictably”, although maybe I just interpreted this statement differently from its intent. For instance, I agree that it’s unlikely technology X will emerge specifically in the next year, so progress over the next year should be fairly predictable. On the other hand, I don’t think it would be very surprising if technology X emerges in the next decade.
IIUC, part of what you’re saying can be rephrased as: TAI is unlikely to be created by a small team, since once a small team shows something promising, tonnes of resources will be thrown at them (and at other teams that might be able to copy the technology) and they won’t be a small team anymore. Which sounds plausible, I suppose, but doesn’t make TAI predictable that long in advance.
Now, when a QNI comes along, it doesn’t necessarily look like a discontinuity, because there might be a lot of work to bridge the distance between idea and implementation. And, this work involves a lot of small details. Because of this, the first version is probably often only a slight improvement on SOTA. So, I’m guessing that QNIs produce something more like a discontinuity in the derivative than a discontinuity in the SOTA itself.
Don’t have a great source for this at hand, but my impression is that seemingly-QNIs surprisingly often just power existing exponential trends, meaning no change in derivative (on a log graph).
(A random comment in support of this — I remember chip design expert Jim Keller saying on Lex Fridman’s podcast that Moore’s Law is just a bunch of separate s-curves, as they have to come up with new ideas to work through challenges to shrinking transistors, and the new techniques work for a range of scales and then have to be replaced with new new ideas.)
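That picture is easy to reproduce numerically. Here's a toy construction (not actual transistor data) where staggered, saturating s-curves sum to a near-straight line on a log scale:

```python
import math

# 'Moore's Law as stacked s-curves': each technique is a logistic curve that
# saturates, but techniques arrive on a schedule and each is bigger than the
# last, so their sum tracks a smooth exponential. Parameters are made up.

def logistic(t, midpoint, scale):
    return scale / (1 + math.exp(-(t - midpoint)))

def capability(t):
    # technique k arrives ~5 time units after its predecessor and is 4x bigger
    return sum(logistic(t, midpoint=5 * k, scale=4 ** k) for k in range(10))

for t in range(0, 45, 5):
    print(t, round(math.log10(capability(t)), 2))
# The log10 column climbs by ~0.6 per step, i.e. the envelope of saturating
# s-curves looks like one steady exponential trend.
```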
Not sure if this question is easily settled, but it might be a crux for various views — how often do QNIs actually change the slope of the curve?
The problem with this model is that its predictions depend a lot on how you draw the boundary around “field”. Take Yudkowsky’s example of startups. How do we explain small startups succeeding where large companies failed?
I don’t quite see how this is a problem for the model. The narrower you draw the boundary, the more jumpy progress will be, right?
Successful startups are big relative to individuals, but not that big relative to the world as a whole. If we’re talking about a project / technology / company that can rival the rest of the world in its output, then the relevant scale is trillions of dollars (prob deca-trillions), not billions.
And while the most fantastically successful startups can become billion dollar companies within a few years, nobody has yet made it to a trillion in less than a decade.
EDIT: To clarify, not trying to say that something couldn’t grow faster than any previous startup. There could certainly be a ‘kink’ in the rate of progress, like you describe. I just want to emphasize that:
- startups are not that jumpy, on the world scale
- the actual scale of the world matters
A simple model for the discontinuousness of a field might have two parameters — one for the intrinsic lumpiness of available discoveries, and one for total effort going into discovery. And,
- all else equal, more people means smoother progress — if we lived in a trillion person world, AI progress would be more continuous
- it’s an open empirical question whether the actual values for these parameters will result in smooth or jumpy takeoff (see the sketch after this list):
  - even if investment in AI is in the deca-trillions and a meaningful fraction of all world output, it could still be that the actual territory of available discoveries is so lumpy that progress is discontinuous
  - but, remember that reality has a surprising amount of detail, which I think tends to push things in a smoother direction — it means there are more fiddly details to work through, even when you have a unique insight or technological advantage
  - or, in other words, even if you have a random draw from a distribution that ends up being an outlier, actual progress in the real world will be the result of many different draws, which will tend to push things more toward the regime of normals
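Here's a minimal simulation of that two-parameter model; the heavy-tailed discovery distribution and all constants are illustrative assumptions, not estimates:

```python
import random

# Two-parameter toy: progress per period is the average of per-researcher
# draws from a heavy-tailed 'discovery' distribution. More effort (more
# draws) concentrates the sum, smoothing progress, even though individual
# draws stay lumpy. All parameters are illustrative.

random.seed(0)

def discovery():
    """Heavy-tailed draw: Pareto, shape 1.5 (finite mean, infinite variance)."""
    return random.paretovariate(1.5)

def period_progress(n_researchers):
    return sum(discovery() for _ in range(n_researchers)) / n_researchers

for n in (1, 10, 100, 10_000):
    samples = [period_progress(n) for _ in range(1000)]
    spread = max(samples) / min(samples)
    print(f"n={n:>6}: best/worst period ratio {spread:.1f}")
# The ratio shrinks as n grows (crowded fields are smoother), but with an
# infinite-variance tail it shrinks slowly: a lumpy enough discovery
# distribution can keep progress jumpy even at large scale.
```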
I don’t quite see how this is a problem for the model. The narrower you draw the boundary, the more jumpy progress will be, right?
So, you’re saying: if we draw the boundary around a narrow field, we get jumpy/noisy progress. If we the draw the boundary around a broad field, all the narrow subfields average out and the result is less noise. This makes a lot of sense, thank you!
The question is, what metric do we use to average the subfields. For example, on some metrics the Manhattan project might be a rather small jump in military-technology-averaged-over-subfields. But, its particular subfield had a rather outsized impact! In general, I think that “impactfulness” has a heavy-tailed distribution and therefore the “correct” averaging still leaves a fair amount of jumpiness.
And while the most fantastically successful startups can become billion dollar companies within a few years, nobody has yet made it to a trillion in less than a decade.
Yeaaah, but like I said before, I am skeptical of giving so much weight to data from economics. Economics reflects a lot about people and about the world, but there are facts about physics/math it cannot possibly know about, so evidence from such facts cannot be meaningfully overturned with economic data.
Moreover, from certain angles singleton takeoff can look sort of like a “normal” type of economic story. In one case, person has an idea, does a lot of work, gets investments etc etc and after a decade there’s a trillion dollars. In the other case, person builds AI, the AI has some ideas, [stuff happens], after a decade nanobots kill everyone. As Daniel Kokotajlo argued, what’s actually important is when the point-of-no-return (PONR) is. And, the PONR might be substantially earlier than the analogue-of-trillion-dollars.
all else equal, more people means smoother progress — if we lived in a trillion person world, AI progress would be more continuous
Would it? It’s clear that progress in AI would be faster, but why more continuous?
I think that the causation actually goes in the opposite direction. If a field has a lot of small potential improvements with substantial economic value, then a lot of people will work in the field because (i) you don’t need extremely intelligent people to make progress and (ii) it pays off. If a field has a small number of large improvements, then only a small number of people are able to contribute to it. So, a lot of people working on AI is evidence about the kind of progress happening today, but not strong evidence about the absence of significant kinks in the future.
On my version of the “continuous view”, the Technology X story seems plausible, but it starts with a shitty version of Technology X that doesn’t immediately produce billions of dollars of impact (or something similar, e.g. killing all humans), that then improves faster than the existing technology, such that an outside observer looking at both technologies could use trend extrapolation to predict that Technology X would be the one to reach TAI.
(And you can make this prediction at least, say, 3 years in advance of TAI, i.e. Technology X isn’t going to be accelerating so fast that you have zero time to react.)
Yes, this is something I discuss in the edit (you probably started typing your reply before I posted it).

Imagine that we lived in a universe in which it was plausible that the LHC creates a black hole or causes false vacuum collapse. It seems to me that such a universe could still have a techno-economic trajectory broadly similar to our own, for the same reasons. So, in that universe, would it make sense to argue "the LHC cannot destroy the world because its cost is an insufficient fraction of world GDP[1]"? It seems to me that it would be strange there in a similar way to how the economic argument about AI is strange here.
The “continuous view” argument is about takeoff speeds, not about AI risk?
If AI risk arose from narrow systems that couldn’t produce a billion dollars of value then I’d expect that risk could arise more discontinuously from a new paradigm. But AI risk arises from systems that are sufficiently intelligent that they could produce billions of dollars of value.
Christiano’s model of progress, AFAIU, can be summarized as: “When only a few people work in a field, big jumps in progress are possible. When many people work in a field, the low hanging fruits are picked quickly and then progress is smooth.”
The problem with this model is, its predictions depend a lot on how you draw the boundary around “field”. Take Yudkowsky’s example of startups. How do we explain small startups succeed where large companies failed? And, it’s not lack of economic incentives since successful startups sometimes make huge profits? (Often resulting from getting acquired by the large companies.) So much that there’s an entire industry of investing in startups?
I’m guessing that a proponent of Christiano’s theory would say: sure, such-and-such startup succeeded but it was because they were the only ones working on problem P, so problem P was an uncrowded field at the time. Okay, but why do we draw the boundary around P rather than around “software” or around something in between which was crowded? To take another example, heavier-than-air flight was arguably uncrowded when the Wright brothers came along, but flight-in-general (including balloons and airships) was somewhat crowded.
In a hypothetical Yudkowsky-style fast takeoff scenario, an imaginary post-singularity proponent of Christiano’s theory could argue: Sure, AI was a crowded field, but this is consistent with the big jump because TAI was achieved using Technology X, which was an innovation created by that small group, and the field of “Technology X” was not crowded. A possible counterargument is: TAI will have to be good at science (or maybe some other word instead of “science”), and people are already trying to do science using AI (e.g. AlphaFold, the knot theory thing, the quantum chemistry thing). However, this is still begging the question: why draw the line around “science AI” rather than around something else?
Perhaps proponents of Christiano’s theory just have independent, object-level reasons to believe that there will be no “Technology X”, that “more of the same” is already sufficient to get to TAI. In this case there might be no natural place to draw the boundary which will be uncrowded. Here I have my own object-level reasons for skepticism, but maybe it’s unwise to take the discussion there: as Yudkowsky pointed out, this can be net negative. However, I still ask: whence the (seemingly high) confidence? Maybe more-of-the-same is enough for TAI, but maybe not? (And even if it’s theoretically enough, maybe it won’t get there first.)
EDIT: I anticipate an objection to the “Technology X” scenario along the lines of, a small group cannot create something with rapid global impact. Because, presumably, the level of impact of any innovation is bounded in some predictable way by the economic investment. To this I would have two counter-objections:
First, imagine that Technology X didn’t create TAI immediately but did lead to a kink comparable to the start of the deep learning revolution (i.e. TAI was eventually created as a result of much investment in Technology X, on a timeline significantly faster than the extrapolation of the pre-X trend). On the one hand, Christiano appears to believe this is still very unlikely. On the other hand, it can still be excused in hindsight by the reference class tennis I described.
Second, I am skeptical of the methodology. All of us here agree that AI poses unique risks compared to past technologies, so why can we extrapolate from the past in that way? Imagine that we lived in a universe in which it was plausible that the LHC creates a black hole or causes false vacuum collapse. It seems to me that such a universe could still have a techno-economic trajectory broadly similar to our own, for the same reasons. So, in that universe, would it make sense to argue “the LHC cannot destroy the world because its cost is an insufficient fraction of world GDP[1]”? It seems to me it would be strange there in a similar way how the economic argument about AI is strange here.
The LHC is an expensive project, but is it expensive enough to destroy the world? How can we tell? Is this really a sensible way to analyze this compared to thinking about the actual physics?
I’d make a different reply: you need to not just look at the winning startup, but all startups. If it’s the case that the ‘startup ecosystem’ is earning 100% returns and the rest of the economy is earning 5% returns, then something weird is up and the model is falsified, but if the startup ecosystem is earning 10% returns once you average together the few successes and many failures, then this looks more like a normal risk-return story.
Furthermore, there’s something interesting where the modern startup economy feels much more like the Paulian ‘concentration of power’ story than the Yudkowskian ‘wisdom of USENET’ story; teams that make video games might be able to turn a handful of people into tens of millions of dollars in revenue (or billions in an extreme case), but teams that make self-driving cars mostly have to be able to tell a story about being able to turn billions of investor dollars into teams of engineers and mountains of hardware that will then be able to produce the self-driving cars, with the race between companies not being “who has a better product already” but “who can better acquire the means to create a better product.”
I’m pretty sympathetic to the view that the first transformative intelligence will look more like a breakout indie game than a AAA title because there’s some new ‘gimmick’ that can be made by a small team and that has an outsized impact on usefulness. But it seems important to note that a lot of the economy doesn’t work that way, even lots of the ‘build speculative future tech’ part!
I don’t see what it has to do with risk-return. Sure, many startups fail. And, plausibly many people tried to build an airplane and failed before the Wright brothers. And, many people keep trying building AGI and failing. This doesn’t mean there won’t be kinks in AI progress or even a TAI created by a small group.
Saying that “the subjective expected value of AI progress over time is a smooth curve” is a very different proposition from “the actual AI progress over time will be a smooth curve”.
My line of argument here is not trying to prove a particular story about AI progress (e.g. “TAI will be similar to a startup”) but push pack against (/ voice my confusions about) the confidence level of predictions made by Christiano’s model.
What is the confidence level of predictions you are pushing back against? I’m at like 30% on fast takeoff in the sense of “1 year doubling without preceding 4 year doubling” (
a threshold roughly set to break any plausible quantitative historical precedenta threshold intended to be faster than historical precedent but that’s probably similar to the agricultural revolution sped up 10,000x). I’m at maybe 10-20% on the kind of crazier world Eliezer imagines.Is that a high level of confidence? I’m not sure I would be able to spread my probability in a way that felt unconfident (to me) without giving probabilities that low to lots of particular ways the future could be crazy. E.g. 10-20% is similar to the probability I put on other crazy-feeling possibilities like no singularity at all, rapid GDP acceleration with only moderate cognitive automation, or singleton that arrests economic growth before we get to 4 year doubling times...
Huh, AI impacts looked at one dataset of GWP (taken from wikipedia, in turn taken from here) and found 2 precedents for “x year doubling without preceding 4x year doubling”, roughly during the agricultural evolution. The dataset seems to be a combination of lots of different papers’ estimates of human population, plus an assumption of ~constant GWP/capita early in history.
Yeah, I think this was wrong. I’m somewhat skeptical of the numbers and suspect future revisions systematically softening those accelerations, but 4x still won’t look that crazy.
(I don’t remember exactly how I chose that number but it probably involved looking at the same time series so wasn’t designed to be much more abrupt.)
My view is:
If X is obviously very valuable, many people will work on achieving X (potentially including lots of different approaches they see to achieving X).
In fields with lots of spending and impact, most technological progress is made gradually rather than abruptly, for most ways of measuring.
We haven’t said very much at all about why AI should be one of the exceptions (compare to the situation when discussing nuclear weapons, where you can make fantastic arguments about why it would be different). Eliezer’s argument about criticality seems to me to just not work unless one rejects 1+2 altogether (unlike nuclear weapons).
Your arguments about technology X apply to any other technological goal—better fusion reactors or solar panels, more generally cheaper energy, rockets, semiconductors, whatever. So it seems like they should be visible in the base rate for 2. Do you think that a significant fraction of technological progress is abrupt and unpredictable in the sense that you are saying TAI will probably be?
I don’t know exactly what you are responding to here. I have some best guesses about what progress will look like, but they are pretty separate from the broader heuristic. And I’m not sure this is a fair representation of my actual view, within 20 years I think it’s reasonably likely that AI will look fairly different, on a scale of 5 years that seems kind of unlikely.
I’m predicting that the performance of AI systems will grow relatively continuously and predictably, not that AI isn’t risky or even that risk will emerge gradually.
I think it’s pretty unclear how this bears on the general schema above. But I’m happy to consider particular examples of startups that made rapid/unpredictable progress towards particular technological goals (perhaps by pursuing a new approach), since those are the kind of thing I’m predicting are rare. They sure look rare to me (i.e. are responsible for a very small share of total technological progress regardless of how you measure).
I’m surprised by an uncrowded technology surprisingly and rapidly overtaking the state of the art for a task people care a lot about, even at a super abstract or broad level like “make money.” Happens sometimes but it’s a minority of progress.
I think that you can roughly divide progress into “qualitatively new ideas” (QNI) and “incremental improvement of existing technology” (ofc in reality it’s a spectrum). The first kind is much less predictable than the second kind. Now, when a QNI comes along, it doesn’t necessarily look like a discontinuity, because there might be a lot of work to bridge the distance between idea and implementation. And, this work involves a lot of small details. Because of this, the first version is probably often only a slight improvement on SOTA. So, I’m guessing that QNIs produce something more like a discontinuity in the derivative than a discontinuity in the SOTA itself.
Under this model, most progress is “gradual” in the sense that at most points the graph is differentiable. But (i) it doesn’t work too well to extrapolate trends across QNI points and (ii) the counterfactual impact of QNIs is a large fraction of progress.
Certainly I don’t see fusion reactors, solar panels or (use in electronics of) semiconductors as counterexamples, since each of these was invented at some point, and didn’t gradually evolve from some completely different technology.
Another factor is that in software the distance between idea and implementation tends to be smaller, because processors and operating systems abstract much of the details for you. I think this is partly responsible for “startups” being more of a thing in software than in other fields (ofc another part of it is that software is just more new with more low-hanging fruits). And, this probably makes progress in software less smooth.
For unaligned TAI, the effective distance might be shorter still. Because, for most software there’s a fair amount of work going towards UI and/or integration with other software. But, with TAI this can be unnecessary. Moreover, the alignment problem itself is part of the would-be idea-to-profitable-implementation gap. On the other hand, optimizing performance can also be a large part of idea-to-implementation, and if the first AGI is e.g. a drastically slow superintelligence, this can be compatible with a slow takeoff.
Your definition of “discontinuity” seems broadly compatible with my view of the future then. Definitely there are different technologies that are not all outgrowths of one another.
My main point of divergence is:
I think that most of the time when a QNI comes along it is worse than the previous thing and takes work to bring up to the level of the previous thing. In small areas no one pays attention until it overtakes SOTA, but in big areas people usually start paying attention (and investing a significant fraction of the prior SOTA’s size) well before the cross-over point. This seems true for solar or fusion, or digital computers, or deep learning for that matter, or self-driving cars or early automobiles.
If that’s right, then you are looking at two continuous curves and you can think about when they cross and you usually start to get a lot of data before the crossover point. And indeed this is obviously how I’m thinking about technologies like deep learning, which are currently useless for virtually all tasks but which I expect to relatively soon overtake alternatives (like humans and other software) in a huge range of very important domains.
And if some other AI technology replaces deep learning, I generally expect the same story. There is a scale at which new things can burst onto the scene, but over time that scale becomes smaller and smaller relative to the scale of the field. At this point the appearance of “bursting onto the scene” is primarily driven by big private projects that don’t talk publicly about what they are doing for a while (e.g. putting in 20 person-years of effort before a public announcement, so that they get data internally but an outsider just sees a discontinuity), but even that seems to be drying up fairly quickly.
I’m not sure what’s the difference between what you’re saying here and what I said about QNIs. Is it that you expect being able to see the emergent technology before the singular (crossover) point? Actually, the fact you describe DL as “currently useless” makes me think we should be talking about progress as a function of two variables: time and “maturity”, where maturity inhabits, roughly speaking, a scale from “theoretical idea” to “proof of concept” to “beats SOTA in lab conditions” to “commercial product”. In this sense, the “lab progress” curve is already past the DL singularity but the “commercial progress” curve maybe isn’t.
On this model, if post-DL AI technology X appears tomorrow, it will take it some time to span the distance from “theoretical idea” to “commercial product”, in which time we would notice it and update our predictions accordingly. But, two things to note here:
First, it’s not clear which level of maturity is the relevant reference point for AI risk. In particular, I don’t think you need commercial levels of maturity for AI technology to become risky, for the reasons I discussed in my previous comment (and, we can also add regulatory barriers to that list, although I am not convinced they are as important as Yudkowsky seems to believe).
Second, all this doesn’t sound to me like “AI systems will grow relatively continuously and predictably”, although maybe I just interpreted this statement differently from its intent. For instance, I agree that it’s unlikely technology X will emerge specifically in the next year, so progress over the next year should be fairly predictable. On the other hand, I don’t think it would be very surprising if technology X emerges in the next decade.
IIUC, part of what you’re saying can be rephrased as: TAI is unlikely to be created by a small team, since once a small team shows something promising, tonnes of resources will be thrown at them (and at other teams that might be able to copy the technology) and they won’t be a small team anymore. Which sounds plausible, I suppose, but doesn’t make TAI predictable that long in advance.
Don’t have a great source for this at hand, but my impression is that seemingly-QNIs surprisingly often just power existing exponential trends, meaning no change in derivative (on a log graph).
(A random comment in support of this — I remember chip design expert Jim Keller saying on Lex Fridman’s podcast that Moore’s Law is just a bunch of separate s-curves, as they have to come up with new ideas to work through challenges to shrinking transistors, and the new techniques work for a range of scales and then have to be replaced with new new ideas.)
Not sure if this question is easily settled, but it might be a crux for various views — how often do QNIs actually change the slope of the curve?
I don’t quite see how this is a problem for the model. The narrower you draw the boundary, the more jumpy progress will be, right?
Successful startups are big relative to individuals, but not that big relative to the world as a whole. If we’re talking about a project / technology / company that can rival the rest of the world in its output, then the relevant scale is trillions of dollars (prob deca-trillions), not billions.
And while the most fantastically successful startups can become billion dollar companies within a few years, nobody has yet made it to a trillion in less than a decade.
EDIT: To clarify, not trying to say that something couldn’t grow faster than any previous startup. There could certainly be a ‘kink’ in the rate of progress, like you describe. I just want to emphasize that:
startups are not that jumpy, on the world scale
the actual scale of the world matters
A simple model for the discontinuousness of a field might have two parameters — one for the intrinsic lumpiness of available discoveries, and one for total effort going into discovery. And,
all else equal, more people means smoother progress — if we lived in a trillion person world, AI progress would be more continuous
it’s an open empirical question whether the actual values for these parameters will result in smooth or jumpy takeoff:
even if investment in AI is in the deca-trillions and a meaningful fraction of all world output, it could still be that the actual territory of available discoveries is so lumpy that progress is discontinuous
but, remember that reality has a surprising amount of detail, which I think tends to push things in a smoother direction — it means there are more fiddly details to work through, even when you have a unique insight or technological advantage
or, in other words, even if you have a random draw from a distribution that ends up being an outlier, actual progress in the real world will be the result of many different draws, which will tend to push things more toward the regime of normals
So, you’re saying: if we draw the boundary around a narrow field, we get jumpy/noisy progress. If we the draw the boundary around a broad field, all the narrow subfields average out and the result is less noise. This makes a lot of sense, thank you!
The question is, what metric do we use to average the subfields. For example, on some metrics the Manhattan project might be a rather small jump in military-technology-averaged-over-subfields. But, its particular subfield had a rather outsized impact! In general, I think that “impactfulness” has a heavy-tailed distribution and therefore the “correct” averaging still leaves a fair amount of jumpiness.
Yeaaah, but like I said before, I am skeptical of giving so much weight to data from economics. Economics reflects a lot about people and about the world, but there are facts about physics/math it cannot possibly know about, so evidence from such facts cannot be meaningfully overturned with economic data.
Moreover, from certain angles singleton takeoff can look sort of like a “normal” type of economic story. In one case, person has an idea, does a lot of work, gets investments etc etc and after a decade there’s a trillion dollars. In the other case, person builds AI, the AI has some ideas, [stuff happens], after a decade nanobots kill everyone. As Daniel Kokotajlo argued, what’s actually important is when the point-of-no-return (PONR) is. And, the PONR might be substantially earlier than the analogue-of-trillion-dollars.
Would it? It’s clear that progress in AI would be faster, but why more continuous?
I think that the causation actually goes in the opposite direction. If a field has a lot of small potential improvements with substantial economic value, then a lot of people will work in the field because (i) you don’t need extremely intelligent people to make progress and (ii) it pays off. If a field has a small number of large improvements, then only a small number of people are able to contribute to it. So, a lot of people working on AI is evidence about the kind of progress happening today, but not strong evidence about the absence of significant kinks in the future.
On my version of the “continuous view”, the Technology X story seems plausible, but it starts with a shitty version of Technology X that doesn’t immediately produce billions of dollars of impact (or something similar, e.g. killing all humans), that then improves faster than the existing technology, such that an outside observer looking at both technologies could use trend extrapolation to predict that Technology X would be the one to reach TAI.
(And you can make this prediction at least, say, 3 years in advance of TAI, i.e. Technology X isn’t going to be accelerating so fast that you have zero time to react.)
Yes, this is something I discuss in the edit (you probably started typing your reply before I posted it).
The “continuous view” argument is about takeoff speeds, not about AI risk?
If AI risk arose from narrow systems that couldn’t produce a billion dollars of value then I’d expect that risk could arise more discontinuously from a new paradigm. But AI risk arises from systems that are sufficiently intelligent that they could produce billions of dollars of value.