I mean, you saw people make fun of it when Eliezer said it, and my guess is people then conservatively assumed that this would generalize to the future. I’ve had conversations with people where they tried to convince me that Eliezer mentioning kinetic escalation was one of the worst things anyone has ever done for AI policy, and they kept pointing to twitter threads and conversations where opponents made fun of it as evidence. I think there clearly was something real there, but I also think people really fail to understand the communication dynamics involved.
You’re missing some ways Eliezer could have predictably done better with the Time article, if he were framing it for national security folks (rather than as an attempt at brutal honesty, or perhaps most accurately as a cri de coeur).
@davekasten—Eliezer wasn’t arguing for bombing as retaliation for a cyberattack. Rather, he proposed it as a preemptive measure against noncompliant AI development:
If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
If you zoom out several layers of abstraction, that’s not too different from the escalation ladder concept described in this paper. A crucial difference, though, is that Eliezer doesn’t mention escalation ladders at all—or other concepts that would help neutral readers be like “OK, this guy gets how big a lift all this would be, and he has some ideas for building blocks to enact it”. Examples include “how do you get an international agreement on this stuff”, “how do you track all the chips”, “how do you prevent people building super powerful AI despite the compute threshold lowering”, “what about all the benefits of AI that we’d be passing up” (besides a brief mention that narrow-AI-for-bio might be worth it), “how confident can we be that we’d know if someone was contravening the deal”.
Second, there was a huge inferential gap to this idea of AGI as a key national security threat—there’s still a large one today, despite rhetoric around AGI. And Eliezer doesn’t do enough meeting in the middle here.
He gives the high-level argument that is sufficient for him but is/was not convincing to most people—that AI is growing fast by some metrics, can in principle be superhuman, etc. Unfortunately most people in government don’t have the combination of capacity, inclination, and time to assess these kinds of first-principles arguments for themselves, and they really need concreteness in the form of evidence or expert opinion.
Also, frankly, I just think Eliezer is wrong to be as confident in his view of “doom by default” as he is, and the strategic picture looks very very different if you place, say, 20% or even 50% probability on this.
If I had Eliezer’s views I’d probably focus on evals and red-teaming type research to provide fire alarms, convince technical folks p(doom) was really high, and then use that technical consensus or quasi-consensus to shift policy. This isn’t totally distinct from what Eliezer did in the past with more abstract arguments, and it kinda worked (there are a lot of people with >10% p(doom) in policy world, and there was that 2023 moment where everyone was talking about it). I think in worlds where Eliezer’s right, but timelines are, say, more like 2030 than 2027, there’s real scope for people to be convinced of high p(doom) as AI advances, and that could motivate some real policy change.
I think it’s fine that Eliezer wrote it, though. Not maximally strategic by any means, but the man’s done a lot and he’s allowed his hail mary outreach plans.
I think at the time I and others were worried this would look bad for “safety as a whole”, but at this point concerns about AI risk are common and varied enough, and people with those concerns often have strong local reputations w/ different groups. So this is no longer as big of an issue, which I think is really healthy for AI risk folks—it means we can have Pause AI and Eliezer and Hendrycks and whoever all doing their own things, able to say “no I’m not like those folks, here’s my POV”, and not feeling like they should get a veto over each other. And in retrospect I think we should have anticipated and embraced this vision earlier on.
tbh, this is part of what I think went wrong with EA—a shared sense that community reputation was a resource everyone benefitted from and everyone wanted to protect and polish, that people should get vetoes over what each other do and say. I think it’s healthy that there’s much less of a centralized and burnished “EA brand” these days, and much more of a bunch of people following their visions of the good. Though there’s still the problem of Open Phil as a central node in the network, through which reputation effects flow.