I’m interested in getting predictions for whether such an event would get all (known) labs to stop research for even one month (not counting things like “the internet is down so we literally can’t continue”).
I expect it won’t. You?
It might, given some luck and all the pro-safety actors playing their cards right. I’m assuming that by “all labs” you mean “all labs developing AIs at or near the then-current limit of computational power”, or something along those lines, and by “research” you mean “practical research”, i.e. training and running models. The model I have in mind is not that everyone involved will intellectually agree that such research should be stopped, but that a large enough fraction of the public and governments will get scared and exert pressure on the labs. Consider how most of the world was able to (imperfectly) coordinate to slow Covid spread, how nobody has prototyped a supersonic passenger jet in decades, or, again, nuclear energy: we as a species can do such things in principle, even if often for the wrong reasons.
I’m not informed enough to give meaningful probabilities here, but to honor the tradition: given a catastrophe with an immediate, graphic death toll of ≥1 million happening in or near the developed world, I’d estimate >75% probability that ~all seriously dangerous activity is stopped for at least a month, and >50% that it’s stopped for at least a year. With the caveat that the catastrophe is unambiguously attributed to the AI; think “Fukushima was a nuclear explosion”, not “Covid maybe sorta kinda plausibly escaped from the lab, but who knows”.
I’d be pretty happy to bet on this and then keep discussing it, wdyt? :)
Here are my suggested terms:
All major AI research labs that we know about (DeepMind, OpenAI, Facebook AI Research, the Chinese labs, perhaps a few more*)
Stop “research that would advance AGI” for 1 month, defined not as “practical research” but as “research that makes AGI come sooner”. So, for example, if they stopped only half of their “useful to AGI” research but did it for 3 months, you win. If they stopped training models but kept doing the stuff that is the 90% bottleneck (which some might call “theoretical”), I win
*You judge all these parameters yourself however you feel like
I’m just assuming you agree that the labs mentioned above are currently heading towards AGI, at least for the purposes of this bet. If you believe something like “OpenAI (and the other labs) didn’t change anything about their research, but hey, they weren’t doing any relevant research in the first place”, then say so now
I might try to convince you to change your mind, or ask others to comment here, but you have the final say
Regarding “the catastrophe was unambiguously attributed to the AI”: I ask that you judge yourself whether it was unambiguously because of AI, and that you don’t rely on public discourse, since the public can’t seem to unambiguously agree on anything (not even on vaccines being useful).
I suggest we bet $20 or so mainly “for fun”
What do you think?
To start off, I don’t see much point in formally betting $20 on an event conditioned on something I assign <<50% probability of happening within the next 30 years (a powerful AI is launched, it fails catastrophically, we’re both still alive to settle the bet, and the failure is unambiguously attributed to the AI). I mean, sure, I can accept the bet, but largely because I don’t believe it matters one way or another, so I don’t think it counts from the standpoint of epistemological virtue.
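To make the “<<50%” point concrete, here is a toy sketch of why the conjunction matters. All four numbers are made up purely for illustration (they are my hypothetical placeholders, not estimates from either of us); the point is only that several independent-ish preconditions multiply down quickly:

```python
# Illustrative only: hypothetical, made-up probabilities for each
# precondition that must hold before the bet can even be settled.
p_powerful_ai_within_30y = 0.5   # a powerful AI is launched (assumption)
p_catastrophic_failure = 0.5     # ...and it fails catastrophically (assumption)
p_both_alive_to_settle = 0.5     # ...and we both survive to settle (assumption)
p_unambiguous_attribution = 0.5  # ...and the failure is clearly blamed on AI (assumption)

# The bet only settles if every precondition holds, so multiply them.
p_bet_settles = (p_powerful_ai_within_30y
                 * p_catastrophic_failure
                 * p_both_alive_to_settle
                 * p_unambiguous_attribution)

print(p_bet_settles)  # 0.0625 -- the conjunction shrinks fast
```

Even with each individual step at a coin flip, the bet would only come due about 6% of the time, which is why the $20 stake carries little epistemic weight.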
But I can state what I’d disagree with in your terms if I were to take it seriously, just to clarify my argument:
Sounds good.
Mostly sounds good, but I’d push back: “not actually running anything close to the dangerous limit” sounds like a win to me, even if theoretical research continues. One pretty straightforward Schelling point for a ban/moratorium on AGI research is “never train or run anything > X parameters”, with X << the dangerous level under the then-current paradigm. It may be easier to explain to the public and politicians than many other potential limits, and this is important. It’s also much easier to enforce: checking that nobody collects and uses a gigashitton of GPUs [without supervision] is easier than checking every researcher’s laptop. Additionally, we’ll have the monitoring of nuclear weapons tests as a precedent.
That’s the core of my argument, really. If a consortium of 200 world experts says “this happened because your AI wasn’t aligned, let’s stop all AI research”, then Facebook AI or China can tell the consortium to go fuck themselves, and I share your skepticism that it would make all labs pause for even a month (see: gain-of-function research, Covid). But if it becomes public knowledge that a catastrophe with 1 million casualties happened because of AI, it can trigger a panic that will make both world leaders and the public really, honestly want to restrict this AI stuff, and it will both justify and enable the draconian measures required to make every lab actually stop the research. Similar to how the panics about nuclear energy, terrorism, and Covid worked. I propose defining “public agreement” as “the leaders of the relevant countries (defined as the countries housing the labs from p.1, so the US, China, maybe the UK and a couple of others) each issue a clear public statement saying that the catastrophe happened because of an unaligned AI”. This is not an unreasonable ask; they have been this unanimous about quite a few things, including vaccines.