At some point, I looked at the base rate for discontinuities in what I thought was a random enough sample of 50 technologies. You can get the actual csv here. The base rate I get for big discontinuities is just much higher than the 5% that keeps being mentioned throughout the post.
Here are some of the discontinuities that I think can contribute more to this discussion:
- One story on the printing press was that there was a hardware overhang from the Chinese having invented printing, but applying it to their much more difficult to print script. When applying similar methods to the Latin alphabet, printing suddenly became much more efficient. [note: probably wrong, see comment below]
- Examples of cheap physics hacks: The Bessemer process, activated sludge, de Laval nozzles, the Bayer + Hall–Héroult processes.
To overcome small inconveniences, I’m copying the whole csv from that post here:
| Technology | Is there plausibly a discontinuity | Size of the (plausible) discontinuity |
| --- | --- | --- |
| History of aviation | Yes. With the Wright brothers, who were more analytical and capable than any before them. | Big |
| History of ceramics | Probably not. | |
| History of cryptography | Yes. Plausibly with the invention of the one-time pad. | Medium |
| History of cycling | Yes. With its invention. The dandy horse (immediate predecessor of the bicycle) was invented in a period where there were few horses, but it could in principle have been invented much earlier, and it enabled humans to go much faster. | Small |
| History of film | Probably not. | |
| History of furniture | Maybe. Maybe with the invention of the chair. Maybe with the Industrial Revolution. Maybe in recent history with the invention of more and more comfy models of chairs (e.g., bean bags). | Small |
| History of glass | Yes. In cheapness and speed with the Industrial Revolution. | Medium |
| Nuclear history | Yes. Both with the explosion of the first nuclear weapon, and with the explosion of the (more powerful) hydrogen bomb. | Big |
| History of the petroleum industry | Yes. Petroleum had been used since ancient times, but it took off starting in ~1850. | Big |
| History of photography | Probably not. | |
| History of printing | Yes. With Gutenberg. Hardware overhang from having used printing for a more difficult problem: Chinese characters vs Latin alphabet. | Big |
| History of rail transport | Yes. With the introduction of iron, then (Bessemer process) steel over wood, and the introduction of steam engines over horses. Great expansion during the Industrial Revolution. | Medium |
| History of robotics | Maybe. But the 18th-21st centuries saw more progress than the rest combined. | Small |
| History of spaceflight | Yes. With the beginning of the space race. | Big |
| History of water supply and sanitation | Yes. With the Industrial Revolution and the push starting in the, say, 1850s to get sanitation in order (https://en.wikipedia.org/wiki/Great_Stink; https://en.wikipedia.org/wiki/Activated_sludge); the discovery/invention of activated sludge might also be another discontinuity. But I’d say it’s mostly the “let us, as a civilization, get our house in order” impulse that led to these inventions. | Medium |
| History of rockets | Yes. With Hale rockets, whose spinning made them more accurate. Then with de Laval nozzles (hypersonic rockets; went from 2% to 64% efficiency). Then plausibly with Germany’s V2 rocket (the German missile program cost levels comparable to the Manhattan project). | Big |
| History of artificial life | Probably not. | |
| History of calendars | Probably not. Maybe with the Khayyam calendar reform in 1079 in the Persian calendar, but it seems too precise to be true. “Because months were computed based on precise times of solar transit between zodiacal regions, seasonal drift never exceeded one day, and also there was no need for a leap year in the Jalali calendar. [...] However, the original Jalali calendar based on observations (or predictions) of solar transit would not have needed either leap years or seasonal adjustments.” | |
| History of candle making | Yes. With industrialization: “The manufacture of candles became an industrialized mass market in the mid 19th century. In 1834, Joseph Morgan, a pewterer from Manchester, England, patented a machine that revolutionized candle making. It allowed for continuous production of molded candles by using a cylinder with a moveable piston to eject candles as they solidified. This more efficient mechanized production produced about 1,500 candles per hour (according to his patent, “... with three men and five boys [the machine] will manufacture two tons of candle in twelve hours”). This allowed candles to become an easily affordable commodity for the masses.” | Small |
| History of chromatography | Probably not. Any of the new types could have been one, though. | |
| Chronology of bladed weapons | Probably not. Though the Spanish tercios were probably discontinuous as an organization method around it. | |
| History of condoms | Probably not. | |
| History of the diesel car | Yes. In terms of efficiency: the diesel engine is much more efficient than the gasoline engine. | Medium |
| History of hearing aids | Probably not. | |
| History of aluminium | Yes. With the Bayer + Hall–Héroult processes in terms of cheapness. | Big |
| History of automation | Maybe. If so, with controllers in the 1900s, or with the switch to digital in the 1960s. Kiva Systems, used by Amazon, also seems to be substantially better than the competition: https://en.wikipedia.org/wiki/Amazon_Robotics | Medium |
| History of radar | Yes. Development was extremely fast during the war. | Big |
| History of radio | Yes. The first possible discontinuity was with Marconi realizing the potential of electromagnetic waves for communication, and his superior commercialization. The second discontinuity was a discontinuity in price as vacuum tubes were replaced with transistors, making radios much more affordable. | Big |
| History of sound recording | Maybe. There were different eras, and any of them could have had a discontinuity. For example, magnetic tape recordings were much better than previous technologies. | Small |
| History of submarines | Yes. Drebbel’s submarine “seemed beyond conventional expectations of what science was thought to have been capable of at the time.” It also seems likely that development was sped up during major conflicts (American Civil War, WW1, WW2, Cold War). | Small |
| History of television | Maybe. Work on television was banned during WW2 and picked up faster afterwards. Perhaps with the super-Emitron in the 1930s (“The super-Emitron was between ten and fifteen times more sensitive than the original Emitron and iconoscope tubes and, in some cases, this ratio was considerably greater”). | Medium |
| History of the automobile | Yes. In speed of production with Ford. Afterwards maybe with the Japanese (i.e., Toyota). | Big |
| History of the battery | Maybe. There have been many types of batteries throughout history, each with different tradeoffs. For example, higher voltage and more consistent current at the expense of greater fragility, like the Poggendorff cell. Or the Grove cell, which offered higher current and voltage, at the expense of being more expensive and giving off poisonous nitric oxide fumes. Or the lithium-ion cell, which seems to just have been better, got its inventor a Nobel Prize, and shows a pretty big jump in terms of, say, voltage. | Small |
| History of the telephone | Probably not. If so, maybe with the invention of the automatic switchboard. | |
| History of the transistor | Maybe. Probably with the invention of the MOSFET, the first transistor which could be used to create integrated circuits, and which started Moore’s law. | Big |
| History of the internal combustion engine | Probably not. If so, jet engines. | |
| History of manufactured fuel gases | Probably not. | |
| History of perpetual motion machines | No. | |
| History of the motorcycle | Probably not. If there is, perhaps in price for the first Vespa in 1946. | |
| History of multitrack recording | Maybe. It is possible that Les Paul’s experimenting was sufficiently radical to be a discontinuity. | Small |
| History of nanotechnology | Probably not. | |
| Oscilloscope history | Probably not. However, there were many advances in the last century, and any of them could have been one. | |
| History of paper | Maybe. Maybe with Cai Lun at the beginning. Probably with the Industrial Revolution and the introduction of wood pulp with respect to cheapness. | Small |
| History of polymerase chain reaction | Yes. Polymerase chain reaction *is* the discontinuity; a revolutionary new technology. It enabled many other new technologies, like DNA evidence in trials, HIV tests, analysis of ancient DNA, etc. | Big |
| History of the portable gas stove | Probably not. | |
| History of the roller coaster | Probably not. | |
| History of the steam engine | Maybe. The Newcomen engine put together various disparate already existing elements to create something new. Watt’s various improvements also seem dramatic. Unclear about the others. | Medium |
| History of the telescope | Maybe. If so, maybe after the serendipitous invention/discovery of radio telescopy. | Small |
| History of timekeeping devices | Maybe. Plausibly with the Industrial Revolution in terms of cheapness, then with quartz clocks, then with atomic clocks in terms of precision. | |
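For reference, the size labels in the table can be tallied into base rates. This is only a rough sketch: the counts are taken by hand from the rows above, and the denominator of 50 is the sample size stated at the top of the comment (the table as pasted here shows 49 rows, so that choice is an assumption).

```python
from fractions import Fraction

# Size labels counted by hand from the table above.
# Rows with no size label are not counted as discontinuities.
size_counts = {"Big": 12, "Medium": 8, "Small": 10}

# Assumption: use the stated sample size of 50, even though
# the pasted table lists 49 rows.
sample_size = 50


def base_rate(count, total):
    """Share of sampled technologies with the given label."""
    return Fraction(count, total)


big_rate = base_rate(size_counts["Big"], sample_size)
any_rate = base_rate(sum(size_counts.values()), sample_size)

print(f"Big discontinuity: {float(big_rate):.0%}")
print(f"Any labelled discontinuity: {float(any_rate):.0%}")
```

Under these assumptions the "Big" rows alone give the 24% figure, which is the number to compare against the 5% from the post.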
I think 24% for “there will be a big discontinuity at some point in the history of a field” is pretty reasonable, though I have some quibbles with your estimates (detailed below). I think there are a bunch of additional facts that make me go a lot lower than that on the specific question we have with AI:
1. We’re talking about a discontinuity at a specific moment along the curve: not just “there will be a discontinuity in AI progress at some point”, but specifically “there will be a discontinuity around the point where AI systems first reach approximately human-level intelligence”. Assigning 5% to a discontinuity at a specific region of the curve can easily be compatible with 24% of a discontinuity overall.
2. We also know that the field of AI has been around for 60 years and that there is a lot of effort being put into building powerful AI systems. I expect that the more effort is being put into something, the more the low-hanging fruit / “secrets” are already plucked, and the less likely discontinuities are.
3. We can’t currently point to anything that seems like it should cause a discontinuity in the future. It seems to me like for many of the “physics hack” style of discontinuity, the discontinuity would have been predictable in advance (I’m thinking especially of nukes and spaceflight here). Though possibly this is just hindsight bias on my part.
4. We were talking about a huge discontinuity: that the first time we destroy cities, we will also destroy the Earth. (And I think we’re talking about a similarly large discontinuity in the AI case, though I’m not actually sure.) These intuitively feel way larger than your big discontinuities. Though as a counterpoint, I also think AI will be a way bigger deal than most of the other technologies, so a similar discontinuity in some underlying trend could lead to a much bigger discontinuity in terms of impact. (Still, if we talk about “smaller” discontinuities like 1-year doubling of GDP before 4-year doubling of GDP, I put more probability on it, relative to something like “the world looks pretty similar to today’s world, and then everyone drops dead”.)
(All of these together would push me way lower than 5%, if ignoring model uncertainty / “maybe I’m wrong” + noting that the future is hard to predict.)
It seems to me that the biggest point of disagreement is on (3), and this is why in the conversation I keep coming back to
My impression is that Eliezer thinks that “general intelligence” is a qualitatively different sort of thing than that-which-neural-nets-are-doing, and maybe that’s what’s analogous to “entirely new physics”. I’m pretty unconvinced of this, but something in this genre feels quite crux-y for me.
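To make the first point above concrete, here is a toy calculation. It assumes, purely for illustration, that a field has at most one big discontinuity and that it is equally likely to fall in any of several equal-sized stretches of the development curve. Neither assumption comes from the thread, and `window_probability` and the window counts are hypothetical.

```python
def window_probability(p_overall, n_windows):
    """Chance that the discontinuity lands in one particular window.

    Assumes at most one discontinuity, equally likely to fall in any
    of n_windows equal stretches of the curve (illustrative only).
    """
    return p_overall / n_windows


p_overall = 0.24  # base rate for a big discontinuity at some point

for n_windows in (2, 5, 10):
    p = window_probability(p_overall, n_windows)
    print(f"{n_windows} windows: {p:.1%} per window")
```

With five windows this comes out just under 5%, so a 24% overall rate and a 5% rate for one specific region are not in tension under this (very crude) model.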
I do think “look at historical examples” is a good thing to do, so I’ll go through each of your discontinuities in turn. Note that I know very little about most of these areas and haven’t even read the Wikipedia page for most of them, so lots of things I say could be completely wrong:
Aviation: I assume you’re talking about the zero-to-one discontinuity from “no flight” to “flight”? I do agree that we’ll see zero-to-one discontinuities on particular AI capabilities, e.g. language models learn to do arithmetic quite discontinuously. This seems pretty irrelevant to the case with AI. (Notably, the Wright flyer didn’t have much of an impact on things people cared about, and not that many people were working on flight to my knowledge.)
Nukes: Agree that this is a zero-to-one discontinuity from “physics hack” (but note that it did involve a huge amount of effort). Unlike the Wright flyer, it did have a huge impact on things people cared about.
Petroleum: Not sure what the discontinuity is—is it that “amount of petroleum used” increased discontinuously? If so, I very much expect such discontinuities to happen; they’ll happen any time a better technology replaces a worse technology. (Put another way, the reason to expect continuous progress is that there is optimization pressure on the metric and so the low-hanging fruit / “secrets” have already been taken; there wouldn’t have been much optimization pressure on “amount of petroleum used”.) I also expect that there has been a discontinuity in “use of neural nets for machine learning”, and similarly I expect AI coding assistants will become hugely more popular in the nearish future. The relevant question to me is whether we saw a discontinuity in something like “ability to heat your home” or “ability to travel long distances” or something like that.
Printing: Going off of AI Impacts’ investigation, I’d count this one. I think partly this was because there wasn’t much effort going into this. (It looks like when the printing press was invented we were producing ~50,000 manuscripts per year, using about 25,000 person-years of labor. Presumably much much less than that was going into optimizing the process, similarly to how R&D in machine translation is way way lower than the size of the translation market.)
Spaceflight: Agree that this is a zero-to-one discontinuity from “physics hack” (but note that it did involve a huge amount of effort). Although if you’re saying that more resources were spent on it, same comment as petroleum.
Rockets: I’d love to see numbers, but this does sound like a discontinuity that’s relevant to the case with AI. I’d also want to know how much people cared about it (plausibly quite a lot).
Aluminium: Looking at AI Impacts, I think I’m at “probably a discontinuity relevant to the case with AI, but not a certainty”.
Radar: I’d need more details about what happened here, but it seems like this is totally consistent with the “continuous view” (since “with more effort you got more progress” seems like a pretty central conclusion of the model).
Radio: Looking at AI Impacts, I think this one looks more like “lots of crazy fast progress that is fueled by frequent innovations”, which seems pretty compatible with the “continuous view” on AI. (Though I’m sympathetic to the critique that the double exponential curve chosen by AI Impacts is an instance of finding a line by which things look smooth; I definitely wouldn’t have chosen that functional form in advance of seeing the data.)
Automobile: I’d assume that there was very little optimization on “speed of production of cars” at the time, given that cars had only just become commercially viable, so a discontinuity seems unsurprising.
Transistors: Wikipedia claims “the MOSFET was also initially slower and less reliable than the BJT”, and further discussion seems to suggest that its benefits were captured with further work and effort (e.g. it was a twentieth the size of a BJT by the 1990s, decades after invention). This sounds like it wasn’t a discontinuity to me. What metric did you think it was a discontinuity for?
PCR: I don’t know enough about the field—sounds like a zero-to-one discontinuity (or something very close, where ~no one was previously trying to do the things PCR does). See aviation.
Transistors: Wikipedia claims “the MOSFET was also initially slower and less reliable than the BJT”, and further discussion seems to suggest that its benefits were captured with further work and effort (e.g. it was a twentieth the size of a BJT by the 1990s, decades after invention). This sounds like it wasn’t a discontinuity to me.
I am also skeptical that the MOSFET produced a discontinuity. Plausibly, what we care about is the number of computations we can do per dollar. Nordhaus (2007) provides data showing that the rate of progress on this metric was practically unchanged at the time the MOSFET was invented, in 1959.
When I think about the history of cryptography, I consider there to be a discontinuity at the 1976 invention of Diffie–Hellman key exchange. Diffie–Hellman key exchange was the first asymmetric cryptosystem. Asymmetric cryptography is important because it allows secure communication between two strangers over an externally-monitored channel. Before Diffie–Hellman, it wasn’t even clear that asymmetric cryptography was possible.
That’s an interesting example because of how it’s a multiple discovery. You know about the secret GCHQ invention, of course, but “Merkle’s Puzzles” was invented before D-H: Merkle’s Puzzles aren’t quite like what you think of by public-key crypto (it’s proof of work, in a sense, not trapdoor function, I guess you might say), but they do give a constructive proof that it is possible to achieve the goal of bootstrapping a shared secret. It’s not great and can’t be improved, but it does it.
It’s also another example of ‘manufacturing an academic pedigree’. You’ll see a lot of people say “Merkle’s Puzzles inspired D-H”, tracing out a Whig intellectual history of citations only mildly vexed by the GCHQ multiple. But, if you ask Merkle or Diffie themselves, about how Diffie knew about Merkle’s then-unpublished work, they’ll tell you that Diffie didn’t know about the Puzzles when he invented D-H—because nobody understood or liked Merkle’s Puzzles (Diffie amusingly explains it as ‘tl;dr’). What Diffie did know was that there was a dude called Merkle somewhere who was convinced it could be done, while Diffie was certain it was impossible, and the fact that this dude disagreed with him nagged at him and kept him thinking about the problem until he solved it (making asymmetric/public-key at least a triple: GCHQ, Merkle, and D-H). Reason is adversarial.
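For readers who haven’t seen it, the core of the Diffie–Hellman exchange discussed above fits in a few lines. This is a toy sketch with textbook-sized numbers, not a usable implementation: the function names, the prime `p = 23`, the generator `g = 5`, and the two secrets are all illustrative assumptions.

```python
def dh_public(g, p, secret):
    """Value a party sends over the monitored channel: g^secret mod p."""
    return pow(g, secret, p)


def dh_shared(their_public, p, secret):
    """Shared key: the other party's public value raised to our secret, mod p."""
    return pow(their_public, secret, p)


# Toy public parameters; real deployments use very large primes.
p, g = 23, 5

# Each party picks a private number and never transmits it.
alice_secret, bob_secret = 6, 15

# Only these two values cross the channel, visible to any eavesdropper.
alice_public = dh_public(g, p, alice_secret)
bob_public = dh_public(g, p, bob_secret)

# Both sides arrive at the same key without ever having sent it.
alice_key = dh_shared(bob_public, p, alice_secret)
bob_key = dh_shared(alice_public, p, bob_secret)

print(alice_public, bob_public, alice_key, bob_key)
```

The two strangers end up with the same number even though an observer only ever sees `p`, `g`, and the two public values, which is the “bootstrapping a shared secret” goal mentioned above.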
Your prior is for discontinuities throughout the entire development of a technology, so shouldn’t your prior be for discontinuity at any point during the development of AI, rather than discontinuity at or around the specific point when AI becomes AGI? It seems this would be much lower, though we could then adjust upward based on the particulars of why we think a discontinuity is more likely at AGI.
Yep.
The idea of “hardware overhang” from Chinese printing tech seems extremely unlikely. There was almost certainly no contact between Chinese and European printers at the time. European printing tech was independently derived, and differed from its Chinese precursors in many many important details. Gutenberg’s most important innovation, the system of mass-producing types from a matrix (and the development of specialized lead alloys to make this possible), has no Chinese precedent. The economic conditions were also very different; most notably, the Europeans had cheap paper from the water-powered paper mill (a 13th-century invention), which made printing a much bigger industry even before Gutenberg.
You’re probably right given that I didn’t look all that much into it, changed.
(I made edits to your table in width and color, for readability.)
Thanks!