I don’t think the weeds/local turf wars really cause the problems here; why do you think that?
The hypothesized effect is: people who have been engaged in the weeds/turf wars think of themselves as “uncertain” (between e.g. the 25%ers and the 90+%ers) and forget that they’re actually quite confident about some proposition like “this whole situation is reckless and crazy and Earth would be way better off if we stopped”. And then there’s a disconnect where (e.g.) an elected official asks a local how bad things look, and they answer while mentally inhabiting the uncertain position (“well I’m not sure whether it’s 25%ish or 90%ish risk”), and all they manage to communicate is a bunch of wishy-washy uncertainty. And (on this theory) they’d do a bunch better if they set aside all the local disagreements and went back to the prima-facie “obvious” recklessness/insanity of the situation and tried to communicate about that first. (It is, I think, usually the most significant part to communicate!)
Yeah I am pretty skeptical that this is a big effect—I don’t know anyone who I think speaks without the courage of their convictions when talking to audiences like elected officials for this kind of reason—but idk.
Whoa, this seems very implausible to me. Speaking with the courage of one’s convictions in situations which feel high-stakes is an extremely high bar, and I know of few people who I’d describe as consistently doing this.
If you don’t know anyone who isn’t in this category, consider whether your standards for this are far too low.
I read Buck’s comment as consistent with him knowing people who speak without the courage of their convictions for other reasons than stuff like “being uncertain between 25% doom and 90% doom”.
Huh! I’ve been in various conversations with elected officials and have had the sense that most people speak without the courage of their convictions (which is not quite the same thing as “confidence”, but which is more what the post is about, and which is the property I’m more interested in discussing in this comment section, and one factor of the lack of courage is broadcasting uncertainty about things like “25% vs 90+%” when they could instead be broadcasting confidence about “this is ridiculous and should stop”). In my experience, it’s common to the point of others expressing explicit surprise when someone does and it works (as per the anecdote in the post).
I am uncertain to what degree we’re seeing very different conversations, versus to what degree I just haven’t communicated the phenomena I’m talking about, versus to what degree we’re making different inferences from similar observations.
I don’t think your anecdote supports that it’s important to have the courage of your convictions when talking. I think the people I know who worked on SB-1047 are totally happy to say “it’s ridiculous that these companies don’t have any of the types of constraints that might help mitigate extreme risks from their work” without wavering because of the 25%-vs-90% thing. I interpret your anecdote as being evidence about which AI-concerned-beliefs go over well, not about how you should say them. (Idk how important this is, np if you don’t want to engage further.)
A few claims from the post (made at varying levels of explicitness) are:
1. Often people are themselves motivated by concern X (ex: “the race to superintelligence is reckless and highly dangerous”) and decide to talk about concern Y instead (ex: “AI-enabled biorisks”), perhaps because they think it is more palatable.
2. Focusing on the “palatable” concerns is a pretty grave mistake.
2a. The claims Y are often not in fact more palatable; people are often pretty willing to talk about the concerns that actually motivate you.
2b. When people try talking about the concerns that actually motivate them while loudly signalling that they think their ideas are shameful and weird, this is not a terribly good test of claim (2a).
2c. Talking about claims other than the key ones comes at a steep opportunity cost.
2d. Talking about claims other than the key ones risks confusing people who are trying to make sense of the situation.
2e. Talking about claims other than the key ones risks making enemies of allies (when those would-be allies agree about the high-stakes issues and disagree about how to treat the mild stuff).
2f. Talking about claims other than the key ones triggers people’s bullshit detectors.
3. Nate suspects that many people are confusing “I’d be uncomfortable saying something radically different from the social consensus” with “if I said something radically different from the social consensus then it would go over poorly”, and that this conflation is hindering their ability to update on the evidence.
3a. Nate is hopeful that evidence of many people’s receptiveness to key concerns will help address this failure.
3b. Nate suspects that various tropes and mental stances associated with the word “courage” are perhaps a remedy to this particular error, and hopes that advice like “speak with the courage of your convictions” is helpful for remembering the evidence in (3a) and overcoming the error of (3).
I think the people I know who worked on SB-1047 are totally happy to say “it’s ridiculous that these companies don’t have any of the types of constraints that might help mitigate extreme risks from their work” without wavering
I don’t think this is in much tension with my model.
For one thing, that whole sentence has a bunch of the property I’d call “cowardice”. “Risks” is how one describes tail possibilities; if one believes that a car is hurtling towards a cliff-edge, it’s a bit cowardly to say “I think we should perhaps talk about gravity risks” rather than “STOP”. And the clause “help mitigate extreme risks from their work” lets the speaker pretend the risks are tiny on Monday and large on Tuesday; it doesn’t extend the speaker’s own neck.
For another thing, willingness to say that sort of sentence when someone else brings up the risks (or to say that sort of sentence in private) is very different from putting the property I call “courage” into the draft legislation itself.
I observe that SB-1047 itself doesn’t say anything about a big looming extinction threat that requires narrowly targeted legislation. It maybe gives the faintest of allusions to it, and treads no closer than that. The bill lacks the courage of the conviction “AI is on track to ruin everything.” Perhaps you believe this simply reflects the will of Scott Wiener. (And for the record: I think it’s commendable that Senator Wiener put forth a bill that was also trying to get a handle on sub-extinction threats, though it’s not what I would have done.) But my guess is that the bill would be written very differently if the authors believed that the whole world knew how insane and reckless the race to superintelligence is. And “write as you would if your ideas were already in the Overton window” is not exactly what I mean by “have the courage of your convictions”, but it’s close.
(This is also roughly my answer to the protest “a lot of the people in D.C. really do care about AI-enabled biorisk a bunch!”. If the whole world was like “this race to superintelligence is insane and suicidal; let’s start addressing that”, would the same people be saying “well our first priority should be AI-enabled biorisk; we can get to stopping the suicide race later”? Because my bet is that they’re implicitly focusing on issues that they think will fly, and I think that this “focus on stuff you think will fly” calculation is gravely erroneous and harmful.)
As for how the DC anecdote relates: it gives an example of people committing error (1), and it provides fairly direct evidence for claims (2a) and (2c). (It also provided evidence for (3a) and (3b), in that the people at the dinner all expressed surprise to me post-facto, and conceptualized this pretty well in terms of ‘courage’, and have been much more Nate!courageous at future meetings I’ve attended, to what seem to me like good effects. Though I didn’t spell those bits out in the original post.)
I agree that one could see this evidence and say “well it only shows that courage works for that exact argument in that exact time period” (as is mentioned in a footnote, and as is a running theme throughout the post). Various other parts of the post provide evidence for other claims (e.g. the Vance, Cruz, and Sacks references provide evidence for (2d), (2e), and (2f)). I don’t expect this post to be wholly persuasive, and indeed a variant of it has been sitting in my drafts folder for years. I’m putting it out now in part because I am trying to post more in the lead-up to the book, and in part because folks have started saying things like “this is slightly breaking my model of where things are in the Overton window”, which causes me to think that maybe the evidence has finally piled up high enough that people can start to internalize hypothesis (3), even despite how bad and wrong it might feel for them to (e.g.) draft legislation in accordance with beliefs of theirs that radically disagree with perceived social consensus.
Ok. I don’t think your original post is clear about which of these many different theses it’s advancing, which points are meant as evidence for which others, or how strongly you hold any of them.
I don’t know how to understand your thesis other than “in politics you should always pitch people by saying how the issue looks to you, Overton window or personalized persuasion style be damned”. I think the strong version of this claim is obviously false. Though maybe it’s good advice for you (because it matches your personality profile) and perhaps it’s good advice for many/most of the people we know.
I think that making SB-1047 more restrictive would have made it less likely to pass, because it would have made it easier to attack and fewer people would have agreed that it was a step in the right direction. I don’t understand who you think would have flipped from negative to positive on the bill based on it being stronger—surely not the AI companies and VCs who lobbied against it and probably eventually persuaded Newsom to veto?
I feel like the core thing that we’ve seen in DC is that the Overton window has shifted, almost entirely as a result of AI capabilities getting better, and now people are both more receptive to some of these arguments and more willing to acknowledge their sympathy.
To be clear, my recommendation for SB-1047 was not “be basically the same bill but talk about extinction risks and levy a few more restrictions on the labs”, but rather “focus very explicitly on the extinction threat; say ‘this bill is trying to address a looming danger described by a variety of scientists and industry leaders’ or suchlike, shape the bill differently to actually address the extinction threat straightforwardly”.
I don’t have a strong take on whether SB-1047 would have been more likely to pass in that world. My recollection is that, back when I attempted to give this advice, I said I thought it would make the bill less likely to pass but more likely to have good effects on the conversation (in addition to it being much more likely to matter in cases where it did pass). But that could easily be hindsight bias; it’s been a few years. And post facto, the modern question of what is “more likely” depends a bunch on things like how stochastic you think Newsom is (we already observed that he vetoed the real bill, so I think there’s a decent argument that a bill with different content has a better chance even if it’s lower than our a priori odds on SB-1047), though that’s a digression.
I do think that SB-1047 would have had a substantially better effect on the conversation if it had been targeted at the “superintelligence is on track to kill us all” stuff. I think this is a pretty low bar because I think that SB-1047 had an effect that was somewhere between neutral and quite bad, depending on which follow-on effects you attribute to it. Big visible bad effects that I think you can maybe attribute to it are Cruz and Vance polarizing against (what they perceived as) attempts to regulate a budding normal tech industry, and some big Dems also solidifying a position against doing much (e.g. Newsom and Pelosi). More insidiously and less clearly, I suspect that SB-1047 was a force holding the Overton window together. It was implicitly saying “you can’t talk about the danger that AI kills everyone and be taken seriously” to all who would listen. It was implicitly saying “this is a sort of problem that could be pretty well addressed by requiring labs to file annual safety reports” to all who would listen. I think these are some pretty false and harmful memes.
With regards to the Overton window shifting: I think this effect is somewhat real, but I doubt it has as much importance as you imply.
For one thing, I started meeting with various staffers in the summer of 2023, and the reception I got is a big part of why I started pitching Eliezer on the world being ready for a book (a project that we started in early 2024). Also, the anecdote in the post is dated to late 2024 but before o3 or DeepSeek. Tbc, it did seem to me like the conversation changed markedly in the wake of DeepSeek, but it changed from a baseline of elected officials being receptive in ways that shocked onlookers.
For another thing, in my experience, anecdotes like “the AI cheats and then hides it” or experimental results like “the AI avoids shutdown sometimes” are doing as much of the lifting as capabilities advances, if not more. (Though I think that’s somewhat of a digression.)
For a third thing, I suspect that one piece of the puzzle you’re missing is how much the Overton window has been shifting because courageous people have been putting in the legwork for the last couple years. My guess is that the folks putting these demos and arguments in front of members of congress are a big part of why we’re seeing the shift, and my guess is that the ones who are blunt and courageous are causing the shift to happen moreso (and are causing it to happen in a better direction).
I’m worried about the people who go in and talk only about (e.g.) AI-enabled biorisk while avoiding saying a word about superintelligence or loss-of-control. I think this happens pretty often and that it comes with a big opportunity cost in the best cases, and that it’s actively harmful in the worst cases—when (e.g.) it reinforces a silly Overton window, or when it shuts down some congress member’s budding thoughts about the key problems, or when it orients them towards silly issues. I also think it spends down future credibility; I think it risks exasperating them when you try to come back next year and say that we’re on track to all die. I also think that the lack of earnestness is fishy in a noticeable way (per the link in the OP).
[edited for clarity and to fix typos, with apologies about breaking the emoji-reaction highlights]
Ok. I agree with many particular points here, and there are others that I think are wrong, and others where I’m unsure.
For what it’s worth, I think SB-1047 would have been good for AI takeover risk on the merits, even though (as you note) it isn’t close to all we’d want from AI regulation.
broadcasting uncertainty about things like “25% vs 90+%” when they could instead be broadcasting confidence about “this is ridiculous and should stop”
My guess is that the main reason they broadcast uncertainty is because they’re worried that their position is unpalatable, rather than because of their internal sense of uncertainty.
FWIW I broadcast the former rather than the latter because from the 25% perspective there are many possible worlds which the “stop” coalition ends up making much worse, and therefore I can’t honestly broadcast “this is ridiculous and should stop” without being more specific about what I’d want from the stop coalition.
A (loose) analogy: leftists in Iran who confidently argued “the Shah’s regime is ridiculous and should stop”. It turned out that there was so much variance in how it stopped that this argument wasn’t actually a good one to confidently broadcast, despite in some sense being correct.
Maybe it’s hard to communicate nuance, but it seems like there’s a crazy thing going on where many people in the AI x-risk community think something like “Well obviously I wish it would stop, and the current situation does seem crazy and unacceptable by any normal standards of risk management. But there’s a lot of nuance in what I actually think we should do, and I don’t want to advocate for a harmful stop.”
And these people end up communicating to external people something like “Stopping is a naive strategy, and continuing (maybe with some safeguards etc) is my preferred strategy for now.”
This seems to miss out the really important part: that they would actually want to stop if we could, but that it seems hard and nuanced to get right.
Yeah, I agree that it’s easy to err in that direction, and I’ve sometimes done so. Going forward I’m trying to more consistently say the “obviously I wish people just wouldn’t do this” part.
Though note that even claims like “unacceptable by any normal standards of risk management” feel off to me. We’re talking about the future of humanity, there is no normal standard of risk management. This should feel as silly as the US or UK invoking “normal standards of risk management” in debates over whether to join WW2.
To check, do you have particular people in mind for this hypothesis? Seems kinda rude to name them here, but could you maybe send me some guesses privately? I currently don’t find this hypothesis as stated very plausible; or, sure, maybe it’s real, but I think it’s a relatively small fraction of the effect.