Philosophy and Physics BSc, AI MSc at Edinburgh, starting a PhD at King’s College London. Interested in metaethics, anthropics/general philosophy and technical AI Safety.
Some good news for the claim that public awareness of X-risk in general should go up after coronavirus is the Economist cover story: https://www.economist.com/node/21788546?frsc=dg|e, https://www.economist.com/node/21788589?frsc=dg|e
In writing this answer I somehow completely forgot to mention Garrett Jones’ new book 10% Less Democracy, which essentially goes over every idea listed above along with many others!
The way I see it (I live in the UK) is that most western European governments are able to respond to a sufficiently unambiguous warning from experts that disaster is coming within a month if we carry on as normal, and react strongly to that warning, but not much more. That’s why we get blunt instruments like local lockdowns and a slow-to-scale contact tracing system that could easily be made to work with, say, 20 times its current budget.
That’s because the Morituri Nolumus Mori effect applies everywhere, but has to combine with some basic collective ability to perceive physical facts to work properly, as you say and as I argued in that post.
The MNM effect, rather than clever planning or reasoning, is what we credit for things not being as bad as they could be; the differences between e.g. America and Germany come down to whether there was any level of planning at all.
I think the extreme version of your ‘no ability to perceive physical facts’ claim applies to some US states, the US federal government and maybe Brazil and the various developing countries that just don’t have good enough information flow for people to stay informed, but doesn’t apply to Europe, let alone East Asia.
But I strongly suspect that when things do get New York-bad in those other states, we will see individual and state responses trying to keep it under control that will bring the R back to near 1, even if it seems hopeless right now.
After reading your summary of the difference (maybe just a difference in emphasis) between ‘Paul slow’ and ‘continuous’ takeoff, I did some further simulations. A low setting of d (highly continuous progress) doesn’t give you a Paul-slow condition on its own, but it is relatively easy to replicate a situation like this:
There will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles. (Similarly, we’ll see an 8 year doubling before a 2 year doubling, etc.)
What we want is a scenario where you don’t get intermediate doubling intervals at all in the discontinuous case, but you get at least one in the continuous case. Setting s relatively high appears to do the trick.
Here is a scenario where we have very fast post-RSI growth, with s=5, c=1, I0=1 and I_AGI=3. I wrote some more code to produce plots of how long each complete interval of doubling took in each scenario. The ‘default’ doubling time, with no contribution from RSI, was 0.7. All the continuous scenarios had two complete doubling intervals of intermediate length before the doubling time collapsed to under 0.05 on the third doubling. The discontinuous model simply kept the original doubling interval until it collapsed to under 0.05 on the third doubling interval. It’s all in this graph.
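For concreteness, here is a minimal sketch of the kind of simulation I’m describing. The functional form is an illustration rather than the exact equations from my post: capability I grows exponentially at base rate c (giving the default doubling time of ln(2)/c ≈ 0.7 for c=1), plus an RSI term s·I² that is switched on by a logistic gate of sharpness d around I_AGI (a very high d approximates the discontinuous case).

```python
# Illustrative sketch only; assumed functional form, not the post's exact equations.
# dI/dt = c*I + s*I^2 * gate(I), with a logistic gate of sharpness d around I_AGI.
import numpy as np

def simulate(s=5.0, c=1.0, d=5.0, I0=1.0, I_AGI=3.0, dt=1e-4, t_max=5.0):
    """Euler-integrate capability I(t), stopping once I is very large."""
    ts, Is = [0.0], [I0]
    t, I = 0.0, I0
    while t < t_max and I < 1e6:
        z = np.clip(d * (I - I_AGI), -700.0, 700.0)  # keep exp() in range for large d
        gate = 1.0 / (1.0 + np.exp(-z))              # smooth switch-on of RSI
        I += (c * I + s * gate * I**2) * dt
        t += dt
        ts.append(t)
        Is.append(I)
    return np.array(ts), np.array(Is)

def doubling_intervals(ts, Is, n=3):
    """Length of each of the first n complete doubling intervals of I."""
    crossings = [np.interp(Is[0] * 2**k, Is, ts) for k in range(n + 1)]
    return np.diff(crossings)

print(doubling_intervals(*simulate(d=5.0)))    # fairly continuous switch-on
print(doubling_intervals(*simulate(d=100.0)))  # near-discontinuous switch-on
```

In this toy version the qualitative picture is the same: the first doubling takes roughly the default ln(2)/c ≈ 0.7 in both cases, but the continuous run starts shortening its doublings earlier, while the near-discontinuous run holds closer to the default interval until the RSI term switches on and the doubling time collapses.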
Let’s make the irresponsible assumption that this actually applies to the real economy, with the current growth mode (no contribution from RSI) given by the ‘slow/no takeoff’ s=0 condition.
The current doubling time is a bit over 23 years. In the shallow continuous progress scenario (red line), we get a 9-year doubling, a 4-year doubling and then a ~1-year doubling. In the discontinuous scenario (purple line) we get two 23-year doublings and then a ~1-year doubling out of nowhere. In other words, this fairly arbitrary setting of the parameters (it was the second set I tried) gives us a Paul-slow takeoff if you assume that all of this should be scaled to years of economic doubling. You can see that graph here.
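The rescaling itself is just arithmetic. A quick check, using the numbers quoted above (the model-time doubling intervals here are back-derived from the 9/4/1-year figures, so this is purely illustrative):

```python
# Map the model's default doubling time (0.7) onto the current ~23-year
# world-output doubling time, then convert model doubling intervals to years.
years_per_model_unit = 23 / 0.7                # ~33 calendar years per model time unit
for interval in (0.28, 0.12, 0.03):            # illustrative intervals from a continuous run
    print(f"{interval * years_per_model_unit:.1f} years")
# -> roughly 9.2, 3.9 and 1.0 years: the 9-year, 4-year and ~1-year doublings above
```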
We will eventually hit some sort of limit on growth, even with “just” exponential growth, but this limit could be quite far beyond what we have achieved so far. See also this related post.
One major intuitive finding that came out of that post was that most of the adjustments I made to the speed and continuity of the takeoff made a fairly marginal difference: I think that if you presented any one of those trajectories in isolation you would call it exceptionally fast.
I strongly suspect that as well as disagreements about discontinuities, there are very strong disagreements about ‘post-RSI speed’ - maybe over orders of magnitude.
This is what the curves look like if s (the effective ‘power’ of RSI) is set to 0.1 - the takeoff is much slower even if RSI comes about fairly abruptly.
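In terms of the sketch above, that corresponds to something like the following (again with illustrative parameters):

```python
# Reuses simulate() and doubling_intervals() from the earlier sketch:
# weak RSI (s=0.1) switched on abruptly (high d).
print(doubling_intervals(*simulate(s=0.1, d=100.0)))
# doubling times shrink gradually rather than collapsing outright
```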
Rohin’s opinion: I enjoyed this post; it gave me a visceral sense for what hyperbolic models with noise look like (see the blog post for this, the summary doesn’t capture it). Overall, I think my takeaway is that the picture used in AI risk of explosive growth is in fact plausible, despite how crazy it initially sounds.
One thing this post led me to consider is that when we bring together various fields, the evidence for ‘things will go insane in the next century’ is stronger than the evidence for any specific claim about (for example) AI takeoff. What is the other evidence?
We’re probably alone in the universe, and anthropic arguments tend to imply we’re living at an incredibly unusual time in history. Isn’t that what you’d expect to see in the same world where there is a totally plausible mechanism that could carry us a long way up this line, in the form of AGI and eternity in six hours? All the pieces are already there, and they only need to be approximately right for our lifetimes to be far weirder than those of people who were e.g. born in 1896 and lived to 1947 - which was weird enough, but that should be your minimum expectation.
In general, there are three categories of evidence that things are likely to become very weird over the next century, or that we live at the hinge of history:
1) Specific mechanisms around AGI—possibility of rapid capability gain, and arguments from exploratory engineering
2) Economic and technological trend-fitting predicting explosive growth in the next century
3) Anthropic and Fermi arguments suggesting that we live at some extremely unusual time
All of these are evidence for such a claim. 1) counts because a superintelligent AGI takeoff is just a specific example of how the hinge could occur. 3) argues for it directly. But how does 2) fit in with 1) and 3)?
There is something a little strange about calling a fast takeoff from AGI and whatever was driving superexponential growth throughout all of history the same trend. It would require some huge cosmic coincidence that ensures there is always superexponential growth: just as population growth plus growth in wealth per capita (or whatever was driving the trend until now) runs out in the great stagnation (visible as a tiny blip on the right-hand side of the double-log plot), AGI takes over and pushes us up the same trend line. That sort of coincidence is clearly not possible, so if AGI is what takes us up the rest of that trend line, there would have to be some factor responsible for both: a factor that was at work in the founding of Jericho but predestined that AGI would be invented and cause explosive growth in the 21st century, rather than the 19th or the 23rd.
For AGI to be the driver of the rest of that growth curve, there has to be a single causal mechanism that keeps us on the same trend and includes AGI as its final step—if we say we are agnostic about what that mechanism is, we can still call 2) evidence for us living at the hinge point, though we have to note that there is a huge blank spot in need of explanation. Is there anything that can fill it to complete the picture?
The mechanism proposed in the article seems like it could plausibly include AGI.
If technology is responsible for the growth rate, then reinvesting production in technology will cause the growth rate to be faster. I’d be curious to see data on what fraction of GWP gets reinvested in improved technology and how that lines up with the other trends.
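As a toy illustration of why the reinvested fraction and the returns to technology matter (my own sketch, not the article’s model): suppose output is Y = A^phi for technology level A, and a fraction f of output is reinvested in improving A. Then phi = 1 gives plain exponential growth, while phi > 1 gives superexponential growth with a finite-time singularity.

```python
# Toy reinvestment model (an illustrative assumption, not the article's):
# output Y = A**phi; a fraction f of output is reinvested, so dA/dt = f * Y.
import numpy as np

def technology(phi, f=0.05, A0=1.0, dt=0.01, steps=9000):
    A = np.empty(steps)
    A[0] = A0
    for i in range(1, steps):
        A[i] = A[i - 1] + f * A[i - 1]**phi * dt  # reinvest a fraction f of output
    return A

for phi in (0.9, 1.0, 1.2):
    print(phi, round(technology(phi)[-1], 1))
# phi < 1: subexponential; phi = 1: exponential; phi > 1: the true solution
# reaches a finite-time singularity (here at t = 100 for phi = 1.2, f = 0.05)
```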
But even though the drivers seem superficially similar (both are about technology), the claim is that one very specific technology will generate explosive growth, not that technology in general will. It seems strange that AGI would follow the same growth curve as reinvesting more GWP in improving ordinary technology, which doesn’t improve your own ability to think in the way that AGI would.
As for precise timings, the great stagnation (the last 30-ish years) seems like it would just stretch out the timeline a bit, so we shouldn’t take the 2050s too seriously: however well the last 70 years fit an exponential trend line, there’s really no way to make that fit the overall trend, as that post makes clear.
They do disagree about locality, yes, but as far as I can tell that is downstream of the assumption that there won’t be a very abrupt switch to a new growth mode. A single project pulling suddenly ahead of the rest of the world would happen if the growth curve is such that with a realistic amount (a few months) of lead time you can get ahead of everyone else.
So the obvious difference in predictions is that e.g. Paul/Robin think that takeoff will occur across many systems in the world while MIRI thinks it will occur in a single system. That is because MIRI thinks that RSI is much more of an all-or-nothing capability than the others, which in turn is because they think AGI is much more likely to depend on a few novel, key insights that produce sudden gains in capability. That was the conclusion of my post.
In the past I’ve called locality a practical discontinuity: from the outside world’s perspective, does a single project explode out of nowhere? Whether you get a practical discontinuity doesn’t just depend on whether progress is discontinuous. A discontinuity to RSI capability does give you a practical discontinuity, but that is a sufficient condition, not a necessary one. If the growth curve is steep enough you might get a practical discontinuity anyway.
Perhaps Eliezer-2008 believed that there would be a discontinuity in returns on optimization leading to a practical discontinuity/local explosion but Eliezer-2020 (since de-emphasising RSI) just thinks we will get a local explosion somehow, either from a discontinuity or sufficiently fast continuous progress.
My graphs above do seem to support that view—even most of the ‘continuous’ scenarios seem to have a fairly abrupt and steep growth curve. I strongly suspect that as well as disagreements about discontinuities, there are very strong disagreements about ‘post-RSI speed’ - maybe over orders of magnitude.
This is what the curves look like if s is set to 0.1 - the takeoff is much slower even if RSI comes about fairly abruptly.
Further to this point: there is something a little strange about calling a fast takeoff from AGI and whatever was driving superexponential growth throughout all of history the same phenomenon. If true, some huge cosmic coincidence ensures there is always superexponential growth: just as population growth plus growth in wealth per capita (or whatever was driving the trend until now) runs out in the great stagnation (visible as a tiny blip on the right-hand side of the double-log plot), AGI takes over and pushes us up the same trend line. That’s clearly not possible, so there would have to be some factor responsible for both if AGI is what takes us up the rest of that trend line.
In general, there are three categories of evidence that things are likely to become very weird over the next century, or that we live at the hinge of history in some sense:
1) Specific mechanisms around AGI—possibility of rapid capability gain
2) Economic and technological trend-fitting predicting a singularity around 2050
3) Anthropic and Fermi arguments suggesting that we live at some extremely unusual time
All of these are also arguments for the more general claim that we live at the hinge of history. 1) counts because a superintelligent AGI takeoff is just a specific example of how the hinge occurs, and it is plausible for much more specific reasons. 3) argues for that directly. But how does 2) fit in with 1) and 3)? For AGI to be the driver of the rest of that growth curve, there has to be a single causal mechanism that keeps us on the same trend and includes AGI as its final step. If we say we are agnostic about what that mechanism is, we can still call 2) evidence for us living at the hinge point, though we have to note that there is a huge blank spot in need of explanation: what phenomenon causes the right technologies to appear, on schedule, to continue the superexponential trend all the way from 10,000 BCE to the arrival of AGI?
On the point about ‘Deterioration of collective epistemology’, and how it might interact with an impending risk, we have some recent evidence in the form of the Coronavirus response.
It’s important to note the potential role of sleepwalk bias and the Morituri Nolumus Mori effect here. The way I conceptualised it, sufficiently terrible collective epistemology can vitiate any advantage you might expect from the MNM effect (or from discounting sleepwalk bias), but it has to be so bad that current danger is somehow rendered invisible. In other words, the MNM effect says that the quality of our collective epistemology and how bad the danger is are not independent: we get slightly smarter in some relevant ways as the stakes go up. There do appear to be some levels of impaired collective epistemology that are hard to recover from even when the stakes are high; if the information about risk is effectively or actually inaccessible, we don’t respond to it.
On the other hand, the MNM effect requires leaders and individuals to have access to information about the state of the world right now (i.e. how dangerous things are at the moment). Even in countries with reasonably free flow of information this is not a given. If you accept Eliezer Yudkowsky’s thesis that clickbait has impaired our ability to understand a persistent, objective external world, then you might be more pessimistic about the MNM effect going forward. Perhaps for this reason, we should expect countries with higher social trust, and therefore more ability for individuals to agree on a consensus reality and understand the level of danger posed, to perform better. Japan and Northern European countries like Denmark and Sweden come to mind, and all of them have performed better than the mitigation measures employed by their governments would suggest.
It might seem the MNM hypothesis doesn’t fit terribly well with people voluntarily choosing to go to that rally in Oklahoma (or, to take an example from the other aisle, the uncritical support certain other outdoor gatherings received from many people who ought to know better), but I actually did say something at the end of my post that seems to explain both:
In that same interview, Stuart Russell gave his list of roadblocks, which is relevant as he may have just made a claim that was falsified by GPT3 -
The first thing is that the Go board is fully observable. You can see the entire state of the world that matters. And of course in the real world there’s lots of stuff you don’t see and don’t know. Some of it you can infer by accumulating information over time, what we call state estimation, but that turns out to be quite a difficult problem. Another thing is that we know all the rules of Go, and of course in the real world, you don’t know all the rules, you have to learn a lot as you go along. Another thing about the Go board is that despite the fact that we think of it as really complicated, it’s incredibly simple compared to the real world. At any given time on the Go board there’s a couple of hundred legal moves, and the game lasts for a couple hundred moves.
And if you said, well, what are the analogous primitive actions in the real world for a human being? Well, we have 600 muscles and we can actuate them maybe about 10 times per second each. Your brain probably isn’t able to do that, but physically that’s what could be your action space. And so you actually have then a far greater action space. And you’re also talking about… We often make plans that last for many years, which is literally trillions of primitive actions in terms of muscle actuations. Now we don’t plan those all out in detail, but we function on those kinds of timescales. Those are some of the ways that Go and the real world differ. And what we do in AI is we don’t say, okay, I’ve done Go, now I’m going to work on suicide Go, and now I’m going to work on chess with three queens.
What we try to do is extract the general lessons. Okay, we now understand fairly well how to handle that whole class of problems. Can we relax the assumptions, these basic qualitative assumptions about the nature of the problem? And if you relax all the ones that I listed, and probably a couple more that I’ve got
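As an aside, the ‘trillions’ figure checks out, assuming the 600 muscles and ~10 actuations per second from the quote and taking ‘many years’ to mean about a decade (the decade is my assumption):

```python
# Rough arithmetic check of the quoted claim.
muscles = 600
actuations_per_second = 10           # per muscle, from the quote
seconds_per_year = 60 * 60 * 24 * 365
years = 10                           # assumed horizon for "plans that last many years"
print(muscles * actuations_per_second * seconds_per_year * years)  # ~1.9e12
```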
So dealing with partial observability, discovering new action sets, managing mental activity (?) and some others. This seems close to the list in an older post I wrote:
Stuart Russell’s List
human-like language comprehension
cumulative learning
discovering new action sets
managing its own mental activity
For reference, I’ve included two capabilities we already have that I imagine would have been on a similar list in 1960:
perception and object recognition
efficient search over known facts
If AlphaStar is evidence that partial observability isn’t going to be a problem, is GPT3 similarly evidence that language comprehension isn’t going to be a problem, since GPT3 can do things like simple arithmetic? That leaves cumulative learning, discovering action sets and managing mental activity on Stuart’s list.
See also this post.
The part of my post that is relevant to AI alignment is right at the end, but I make a point similar to Rohin’s: we have actually significantly mitigated the effects of Coronavirus, but have still failed in a certain specific way -
The lesson to be learned is that there may be a phase shift in the level of danger posed by certain X-risks—if the amount of advance warning or the speed of the unfolding disaster is above some minimal threshold, even if that threshold would seem like far too little time to do anything given our previous inadequacy, then there is still a chance for the MNM effect to take over and avert the worst outcome. In other words, AI takeoff with a small amount of forewarning might go a lot better than a scenario where there is no forewarning, even if past performance suggests we would do nothing useful with that forewarning.
More speculatively, I think we can see the MNM effect’s influence in other settings where we have consistently avoided the very worst outcomes despite systematic inadequacy. Anders Sandberg referenced something like it when discussing the probability of nuclear war: there have been many near misses when nuclear war could have started, implying that we can’t just have been lucky over and over. Instead, there has been a stronger skew towards interventions that halt disaster at the last moment, compared to not-the-last-moment:
The increase is so monotonic that either the data’s wrong, we’re going to experience a major break with the past in the mid-2040s, or it’s galactic time when I’m in early middle age. One thing this post led me to consider is that when we bring everything together, the evidence for ‘things will go insane in the next century’ is stronger than the evidence for any specific scenario as to how. This isn’t the only evidence for the broad thesis of ‘things are going to go crazy over the next decades’, where crazy is defined as more rapid change than we saw over the previous century.
Treat this like a detective story: bring in disparate clues. We’re probably alone in the universe, and anthropic arguments tend to imply we’re living at an incredibly unusual time in history. Isn’t that what you’d expect to see in the same world where there is a totally plausible mechanism that could carry us a long way up this line, in the form of AGI and eternity in six hours? All the pieces are already there, and they only need to be approximately right for our lifetimes to be far weirder than those of people who were e.g. born in 1896 and lived to 1947, which was weird enough, but that should be your minimum expectation.
EDIT: on the point about AI, I just checked to see if there were any recent updates and now we have Image GPT. Heck.
A possible example of the Ernest Rutherford effect (a respected scientist says a thing isn’t going to happen, and then the next day it does): Stuart Russell, speaking in a recent podcast -
Deep learning systems are needing, even for these relatively simple concepts, thousands, tens of thousands, millions of examples, and the idea within deep learning seems to be that well, the way we’re going to scale up to more complicated things like learning how to write an email to ask for a job, is that we’ll just have billions or trillions of examples, and then we’ll be able to learn really, really complicated concepts. But of course the universe just doesn’t contain enough data for the machine to learn direct mappings from perceptual inputs or really actually perceptual input history. So imagine your entire video record of your life, and that feeds into the decision about what to do next, and you have to learn that mapping as a supervised learning problem. It’s not even funny how unfeasible that is. The longer the deep learning community persists in this, the worse the pain is going to be when their heads bang into the wall.
I could be wrong, but GPT3 could probably write a passable job application letter.
My first thought was that they put some convolutional layers in to preprocess the images and then used the GPT architecture, but no, it’s literally just GPT again...
Does this maybe give us evidence that the brain isn’t anywhere near a peak of generality, since we use specialised circuits for processing image data (which convolutional layers were based on)?
Some factors that seem important for whether or not you get the MNM effect:
rate of increase of the danger (sudden, not gradual)
intuitive understanding of the danger
level of social trust and agreement over facts
historical memory of the disaster
how certain the threat is
coordination problems
how dangerous the threat is
how tractable the problem seems
From reading your post—the sleepwalk bias does seem to be the mirror-image of the Morituri Nolumus Mori effect; that we tend to systematically underweight strong, late reactions. One difference is that I was thinking of both individual and policy responses whilst your post focusses on policy, but that’s in large part because most of the low-frequency high-damage risks we commonly talk[ed] about are X-risks that can be dealt with only at the level of policy. I also note that I got at a few of the same factors as you that might affect the strength of such a reaction:
The catastrophe is arriving too fast for actors to react.
It is unclear whether the catastrophe will in fact occur, or it is at least not very observable for the relevant actors (the financial crisis, possibly AGI).
The possible disaster, though observable in some sense, is not sufficiently salient (especially to voters) to override more immediate concerns (climate change).
There are conflicts (World War I) and/or free-riding problems (climate change) which are hard to overcome.
The problem is technically harder than initially thought.
I discussed the speed issue in my conclusions, and I obliquely referred to the salience issue in talking about ‘ability to understand consensus reality’ and the pre-existing instincts around purity and disgust that would help a response to something like a pandemic. The presence of free-rider problems I didn’t discuss. How the speed and difficulty of the problem interact with the response I did mention, in talking about the hypotheticals where R0 was 2 or 8, for example.
Those differences aside, it seems like we got at the same phenomenon independently.
I’m curious about whether you made any advance predictions about likely outcomes based on your understanding of the ‘sleepwalk bias’. I made a light suggestion that things might go better than expected in mid-March, but I can’t really call it a prediction. The first time I explicitly said ‘we were wrong’ was when a lot of evidence had already come in—in April.
This is a tricky problem. The first-order answer seems to be ‘have the right people in power’, but that’s not an actionable strategy. However, it’s amazing what a difference just one or two people can make: apparently a major reason the UK didn’t delay its lockdown even further and risk ending up like the US is Dominic Cummings.
The two main angles are either making the marketplace of ideas / electoral system select for foresight and sanity more effectively, or building institutions with specific remits that can stand aside from such pressures and make the right choices anyway. The first is really hard and the second is really dangerous. However, neither is impossible.
For the first, there’s ordinary electoral reform. An interesting alternative was given in Against Democracy by Jason Brennan, who proposes a new form of epistocracy to reach higher-quality decisions; you can judge his scheme for yourself.
For the second, building competent independent institutions and then handing off power to them, the track record is pretty mixed: independent central banks come to mind as a good example, and the recent horrible Coronavirus debacle with the CDC, FDA and Public Health England as an especially bad one. For how to do that sort of thing correctly, you might also want to look at what Dominic Cummings has proposed, starting with e.g. this, or this article on Westminster dysfunction. He likes prediction markets, but not exclusively; he talks about building decentralised institutions that can operate with a large degree of independence.
On the specific angle of being more sane with respect to X-risks, I tend to favour the second approach (independent institutions) because I think it likely has a bigger effect and is easier to pull off than raising the society-wide sanity waterline. Toby Ord spoke a lot about this in ‘The Precipice’. As for why, here’s Scott Alexander:
Average national IQ correlates well with GDP per capita and other measures of development. But is average national IQ really the right number to look at? “Smart fraction theory” suggests we should instead look at the range of top IQs, since the smartest people are most likely to drive national growth by inventing things or starting businesses or governing well. Now Heiner Rindermann and James Thompson (names you may recognize!) have given the hypothesis its most complete test so far, and found that yes, IQ at the 95th percentile correlates better with national development than at the 50th percentile. But I am a little skeptical of their results...
Having elite opinion be non-crazy matters a lot in situations like the one we’re in right now, so don’t make ‘we need to improve public discourse’ your plan A for avoiding this level of chaos. As suggested here, we should hand off more and more to expert boards with limited remits, following the example of independent central banks, which didn’t turn into a French-Revolution-style rationalist tyranny over the masses, starting with everything to do with catastrophic risks. Someone in the UK government apparently took that suggestion seriously. Just don’t get Steven Pinker involved.