Both Soares and I get mixed reviews for our social presentation, so you might want to take all this with a grain of salt. But here’s my two cents in response.
I agree it’s good to not seem ashamed of something you’re saying. I think this is mostly a special case of how it’s good to be personally confident. (See Soares here for some good advice on what it means to be confident in a confusing situation.) One reason is that it’s really helpful to convey to your conversational partner that, despite the fact you’re interested in what they think, you’ll be fine and chill regardless of how they respond to you; this lets them relax and e.g. say their true objections if they want to.
But I think it’s generally a mistake to act as if beliefs are “obvious” if in fact they aren’t obvious to your audience. That is, I think that you should talk differently when saying the following different types of statements:
Statements that your audience already agrees with
Statements that your audience already disagrees with
Statements that your audience hasn’t thought about yet, but that they’ll immediately agree with upon consideration
Statements that your audience hasn’t thought about yet, but that they’ll not really have an opinion about when you say them
Statements that your audience has heard about and feels unsure about.
This is mostly because people are more comfortable if they think you understand their state. If you act as if a statement will be obvious to them, they’ll implicitly wonder whether you think they’re an idiot, whether you’re very confused about how many people agree with you, whether you’re an ideologue. They’ll feel like you’re not following the social script that gives them affordance to say “actually that’s not obvious, why do you think that?”
I think that when someone asks you about AI doom, you should say your concerns plainly, but while making it clear that a lot of the things you’re saying might not be obvious to the person you’re talking to, and preferably signposting which you think are more obvious. E.g. for me I generally try to note in my words and tone of voice that I think “AI is a huge deal” is a simpler thing to be persuaded of than “the AI might want to take our stuff”.
I personally find it really annoying when people act as if something they’re saying is obvious to me, when it is in fact contentious.
(See here for a blog post criticizing me for acting as if weird beliefs are obvious in a talk :P I gave that talk intending more to explain than to persuade, hoping that I could invite pushback from the audience that we could talk through. But it didn’t come across that way, unfortunately, probably because I didn’t signal well enough that I was expecting and hoping for pushback.)
I agree that it’s usually helpful and kind to model your conversation-partner’s belief-state (and act accordingly).
And for the avoidance of doubt: I am not advocating that anyone pretend they think something is obvious when they in fact do not.
By “share your concerns as if they’re obvious and sensible”, I was primarily attempting to communicate something more like: I think it’s easy for LessWrong locals to get lost in arguments like whether AI might go fine because we’re all in a simulation anyway, or confused by turf wars about whether AI has a 90+% chance of killing us or “only” a ~25% chance. If someone leans towards the 90+% model and gets asked their thoughts on AI, I think it’s worse for them to answer in a fashion that’s all wobbly and uncertain because they don’t want to be seen as overconfident against the 25%ers, and better for them to connect back to the part where this whole situation (where companies are trying to build machine superintelligence with very little idea of what they’re doing) is wacky and insane and reckless, and speak from there.
I don’t think one should condescend about the obviousness of it. I do think that this community is generally dramatically failing to make the argument “humanity is building machine superintelligence while having very little idea of what it’s doing, and that’s just pretty crazy on its face” because it keeps getting lost in the weeds (or in local turf wars).
And I was secondarily attempting to communicate something like: I think our friends in the policy world tend to cede far too much social ground. A bunch of folks in DC seem to think that the views of (say) Yann LeCun and similar folks are the scientific consensus with only a few radicals protesting, whereas the actual fact is that “there’s actually a pretty big problem here” is much closer to consensus, and that a lack of scientific consensus is a negative sign rather than a positive sign in a situation like this one (because it’s an indication that the scientific field has been able to get this far without really knowing what the heck it’s doing, which doesn’t bode well if it goes farther). I think loads of folks are mentally framing themselves as fighting for an unpopular fringe wacky view when that’s not the case, and they’re accidentally signaling “my view is wacky and fringe” in cases where that’s both false and harmful.
(I was mixing both these meanings into one sentence because I was trying to merely name my old spiels rather than actually giving them, because presenting the old spiels was not the point of the post. Perhaps I’ll edit the OP to make this point clearer, with apologies to future people for confusion caused if the lines that Buck and I are quoting have disappeared.)
I do think that this community is generally dramatically failing to make the argument “humanity is building machine superintelligence while having very little idea of what it’s doing, and that’s just pretty crazy on its face” because it keeps getting lost in the weeds (or in local turf wars).
I don’t think the weed/local turf wars really cause the problems here, why do you think that?
The weeds/local turf wars seem like way smaller problems for AI-safety-concerned people communicating that the situation seems crazy than e.g. the fact that a bunch of the AI safety people work at AI companies.
And I was secondarily attempting to communicate something like: I think our friends in the policy world tend to cede far too much social ground.
Idk, seems plausible.
I don’t think the weed/local turf wars really cause the problems here, why do you think that?
The hypothesized effect is: people who have been engaged in the weeds/turf wars think of themselves as “uncertain” (between e.g. the 25%ers and the 90+%ers) and forget that they’re actually quite confident about some proposition like “this whole situation is reckless and crazy and Earth would be way better off if we stopped”. And then there’s a disconnect where (e.g.) an elected official asks a local how bad things look, and they answer while mentally inhabiting the uncertain position (“well I’m not sure whether it’s 25%ish or 90%ish risk”), and all they manage to communicate is a bunch of wishy-washy uncertainty. And (on this theory) they’d do a bunch better if they set aside all the local disagreements and went back to the prima-facie “obvious” recklessness/insanity of the situation and tried to communicate about that first. (It is, I think, usually the most significant part to communicate!)
Yeah I am pretty skeptical that this is a big effect—I don’t know anyone who I think speaks without the courage of their convictions when talking to audiences like elected officials for this kind of reason—but idk.
Whoa, this seems very implausible to me. Speaking with the courage of one’s convictions in situations which feel high-stakes is an extremely high bar, and I know of few people who I’d describe as consistently doing this.
If you don’t know anyone who isn’t in this category, consider whether your standards for this are far too low.
I read Buck’s comment as consistent with him knowing people who speak without the courage of their convictions for other reasons than stuff like “being uncertain between 25% doom and 90% doom”.
Huh! I’ve been in various conversations with elected officials and have had the sense that most people speak without the courage of their convictions (which is not quite the same thing as “confidence”, but is more what the post is about, and is the property I’m more interested in discussing in this comment section; one factor of the lack of courage is broadcasting uncertainty about things like “25% vs 90+%” when they could instead be broadcasting confidence about “this is ridiculous and should stop”). In my experience, the lack of courage is common to the point that others express explicit surprise when someone does speak courageously and it works (as per the anecdote in the post).
I am uncertain to what degree we’re seeing very different conversations, versus to what degree I just haven’t communicated the phenomena I’m talking about, versus to what degree we’re making different inferences from similar observations.
I don’t think your anecdote supports that it’s important to have the courage of your convictions when talking. I think the people I know who worked on SB-1047 are totally happy to say “it’s ridiculous that these companies don’t have any of the types of constraints that might help mitigate extreme risks from their work” without wavering because of the 25%-vs-90% thing. I interpret your anecdote as being evidence about which AI-concerned-beliefs go over well, not about how you should say them. (Idk how important this is, np if you don’t want to engage further.)
A few claims from the post (made at varying levels of explicitness) are:
1. Often people are themselves motivated by concern X (ex: “the race to superintelligence is reckless and highly dangerous”) and decide to talk about concern Y instead (ex: “AI-enabled biorisks”), perhaps because they think it is more palatable.
2. Focusing on the “palatable” concerns is a pretty grave mistake.
2a. The claims Y are often not in fact more palatable; people are often pretty willing to talk about the concerns that actually motivate you.
2b. When people try talking about the concerns that actually motivate them while loudly signalling that they think their ideas are shameful and weird, this is not a terribly good test of claim (2a).
2c. Talking about claims other than the key ones comes at a steep opportunity cost.
2d. Talking about claims other than the key ones risks confusing people who are trying to make sense of the situation.
2e. Talking about claims other than the key ones risks making enemies of allies (when those would-be allies agree about the high-stakes issues and disagree about how to treat the mild stuff).
2f. Talking about claims other than the key ones triggers people’s bullshit detectors.
3. Nate suspects that many people are confusing “I’d be uncomfortable saying something radically different from the social consensus” with “if I said something radically different from the social consensus then it would go over poorly”, and that this conflation is hindering their ability to update on the evidence.
3a. Nate is hopeful that evidence of many people’s receptiveness to key concerns will help address this failure.
3b. Nate suspects that various tropes and mental stances associated with the word “courage” are perhaps a remedy to this particular error, and hopes that advice like “speak with the courage of your convictions” is helpful for remembering the evidence in (3a) and overcoming the error of (3).
I think the people I know who worked on SB-1047 are totally happy to say “it’s ridiculous that these companies don’t have any of the types of constraints that might help mitigate extreme risks from their work” without wavering
I don’t think this is in much tension with my model.
For one thing, that whole sentence has a bunch of the property I’d call “cowardice”. “Risks” is how one describes tail possibilities; if one believes that a car is hurtling towards a cliff-edge, it’s a bit cowardly to say “I think we should perhaps talk about gravity risks” rather than “STOP”. And the clause “help mitigate extreme risks from their work” lets the speaker pretend the risks are tiny on Monday and large on Tuesday; it doesn’t extend the speaker’s own neck.
For another thing, willingness to say that sort of sentence when someone else brings up the risks (or to say that sort of sentence in private) is very different from putting the property I call “courage” into the draft legislation itself.
I observe that SB-1047 itself doesn’t say anything about a big looming extinction threat that requires narrowly targeted legislation. It maybe gives the faintest of allusions to it, and treads no closer than that. The bill lacks the courage of the conviction “AI is on track to ruin everything.” Perhaps you believe this simply reflects the will of Scott Wiener. (And for the record: I think it’s commendable that Senator Wiener put forth a bill that was also trying to get a handle on sub-extinction threats, though it’s not what I would have done.) But my guess is that the bill would be written very differently if the authors believed that the whole world knew how insane and reckless the race to superintelligence is. And “write as you would if your ideas were already in the Overton window” is not exactly what I mean by “have the courage of your convictions”, but it’s close.
(This is also roughly my answer to the protest “a lot of the people in D.C. really do care about AI-enabled biorisk a bunch!”. If the whole world was like “this race to superintelligence is insane and suicidal; let’s start addressing that”, would the same people be saying “well our first priority should be AI-enabled biorisk; we can get to stopping the suicide race later”? Because my bet is that they’re implicitly focusing on issues that they think will fly, and I think that this “focus on stuff you think will fly” calculation is gravely erroneous and harmful.)
As for how the DC anecdote relates: it gives an example of people committing error (1), and it provides fairly direct evidence for claims (2a) and (2c). (It also provided evidence for (3a) and (3b), in that the people at the dinner all expressed surprise to me post-facto, and conceptualized this pretty well in terms of ‘courage’, and have been much more Nate!courageous at future meetings I’ve attended, to what seem to me like good effects. Though I didn’t spell those bits out in the original post.)
I agree that one could see this evidence and say “well it only shows that courage works for that exact argument in that exact time period” (as is mentioned in a footnote, and as is a running theme throughout the post). Various other parts of the post provide evidence for other claims (e.g. the Vance, Cruz, and Sacks references provide evidence for (2d), (2e), and (2f)). I don’t expect this post to be wholly persuasive, and indeed a variant of it has been sitting in my drafts folder for years. I’m putting it out now in part because I am trying to post more in the lead-up to the book, and in part because folks have started saying things like “this is slightly breaking my model of where things are in the Overton window”, which causes me to think that maybe the evidence has finally piled up high enough that people can start to internalize hypothesis (3), even despite how bad and wrong it might feel for them to (e.g.) draft legislation in accordance with beliefs of theirs that radically disagree with perceived social consensus.
Ok. I don’t think your original post is clear about which of these many different theses it has, or which points it thinks are evidence for other points, or how strongly you think any of them.
I don’t know how to understand your thesis other than “in politics you should always pitch people by saying how the issue looks to you, Overton window or personalized persuasion style be damned”. I think the strong version of this claim is obviously false. Though maybe it’s good advice for you (because it matches your personality profile) and perhaps it’s good advice for many/most of the people we know.
I think that making SB-1047 more restrictive would have made it less likely to pass, because it would have made it easier to attack and fewer people would agree that it’s a step in the right direction. I don’t understand who you think would have flipped from negative to positive on the bill based on it being stronger—surely not the AI companies and VCs who lobbied against it and probably eventually persuaded Newsom to veto?
I feel like the core thing that we’ve seen in DC is that the Overton window has shifted, almost entirely as a result of AI capabilities getting better, and now people are both more receptive to some of these arguments and more willing to acknowledge their sympathy.
To be clear, my recommendation for SB-1047 was not “be basically the same bill but talk about extinction risks and levy a few more restrictions on the labs”, but rather “focus very explicitly on the extinction threat; say ‘this bill is trying to address a looming danger described by a variety of scientists and industry leaders’ or suchlike, and shape the bill differently to actually address the extinction threat straightforwardly”.
I don’t have a strong take on whether SB-1047 would have been more likely to pass in that world. My recollection is that, back when I attempted to give this advice, I said I thought it would make the bill less likely to pass but more likely to have good effects on the conversation (in addition to it being much more likely to matter in cases where it did pass). But that could easily be hindsight bias; it’s been a few years. And post facto, the modern question of what is “more likely” depends a bunch on things like how stochastic you think Newsom is (we already observed that he vetoed the real bill, so I think there’s a decent argument that a bill with different content has a better chance even if it’s lower than our a-priori odds on SB-1047), though that’s a digression.
I do think that SB-1047 would have had a substantially better effect on the conversation if it had been targeted at the “superintelligence is on track to kill us all” stuff. I think this is a pretty low bar, because I think that SB-1047 had an effect that was somewhere between neutral and quite bad, depending on which follow-on effects you attribute to it. Big visible bad effects that I think you can maybe attribute to it are Cruz and Vance polarizing against (what they perceived as) attempts to regulate a budding normal tech industry, and some big Dems also solidifying a position against doing much (e.g. Newsom and Pelosi). More insidiously and less clearly, I suspect that SB-1047 was a force holding the Overton window together. It was implicitly saying “you can’t talk about the danger that AI kills everyone and be taken seriously” to all who would listen. It was implicitly saying “this is a sort of problem that could be pretty well addressed by requiring labs to file annual safety reports” to all who would listen. I think these are some pretty false and harmful memes.
With regards to the Overton window shifting: I think this effect is somewhat real, but I doubt it has as much importance as you imply.
For one thing, I started meeting with various staffers in the summer of 2023, and the reception I got is a big part of why I started pitching Eliezer on the world being ready for a book (a project that we started in early 2024). Also, the anecdote in the post is dated to late 2024 but before o3 or DeepSeek. Tbc, it did seem to me like the conversation changed markedly in the wake of DeepSeek, but it changed from a baseline of elected officials being receptive in ways that shocked onlookers.
For another thing, in my experience, anecdotes like “the AI cheats and then hides it” or experimental results like “the AI avoids shutdown sometimes” are doing as much of the lifting as capabilities advances, if not more. (Though I think that’s somewhat of a digression.)
For a third thing, I suspect that one piece of the puzzle you’re missing is how much the Overton window has been shifting because courageous people have been putting in the legwork for the last couple of years. My guess is that the folks putting these demos and arguments in front of members of Congress are a big part of why we’re seeing the shift, and my guess is that the ones who are blunt and courageous are causing the shift to happen more so (and are causing it to happen in a better direction).
I’m worried about the people who go in and talk only about (e.g.) AI-enabled biorisk while avoiding saying a word about superintelligence or loss-of-control. I think this happens pretty often and that it comes with a big opportunity cost in the best cases, and that it’s actively harmful in the worst cases—when (e.g.) it reinforces a silly Overton window, or when it shuts down some congress member’s budding thoughts about the key problems, or when it orients them towards silly issues. I also think it spends down future credibility; I think it risks exasperating them when you try to come back next year and say that we’re on track to all die. I also think that the lack of earnestness is fishy in a noticeable way (per the link in the OP).
[edited for clarity and to fix typos, with apologies about breaking the emoji-reaction highlights]
Ok. I agree with many particular points here, and there are others that I think are wrong, and others where I’m unsure.
For what it’s worth, I think SB-1047 would have been good for AI takeover risk on the merits, even though (as you note) it isn’t close to all we’d want from AI regulation.
Yeah to reiterate, idk why you think:
broadcasting uncertainty about things like “25% vs 90+%” when they could instead be broadcasting confidence about “this is ridiculous and should stop”
My guess is that the main reason they broadcast uncertainty is because they’re worried that their position is unpalatable, rather than because of their internal sense of uncertainty.
FWIW I broadcast the former rather than the latter because from the 25% perspective there are many possible worlds which the “stop” coalition ends up making much worse, and therefore I can’t honestly broadcast “this is ridiculous and should stop” without being more specific about what I’d want from the stop coalition.
A (loose) analogy: leftists in Iran who confidently argued “the Shah’s regime is ridiculous and should stop”. It turned out that there was so much variance in how it stopped that this argument wasn’t actually a good one to confidently broadcast, despite in some sense being correct.
Maybe it’s hard to communicate nuance, but it seems like there’s a crazy thing going on where many people in the AI x-risk community think something like “Well obviously I wish it would stop, and the current situation does seem crazy and unacceptable by any normal standards of risk management. But there’s a lot of nuance in what I actually think we should do, and I don’t want to advocate for a harmful stop.”
And these people end up communicating to external people something like “Stopping is a naive strategy, and continuing (maybe with some safeguards etc) is my preferred strategy for now.”
This seems to miss out the really important part where they would actually want to stop if we could, but it seems hard and difficult/nuanced to get right.
Yeah, I agree that it’s easy to err in that direction, and I’ve sometimes done so. Going forward I’m trying to more consistently say the “obviously I wish people just wouldn’t do this” part.
Though note that even claims like “unacceptable by any normal standards of risk management” feel off to me. We’re talking about the future of humanity, there is no normal standard of risk management. This should feel as silly as the US or UK invoking “normal standards of risk management” in debates over whether to join WW2.
To check, do you have particular people in mind for this hypothesis? Seems kinda rude to name them here, but could you maybe send me some guesses privately? I currently don’t find this hypothesis as stated very plausible, or like sure maybe, but I think it’s a relatively small fraction of the effect.
Sorry for butting in, but I can’t help noticing that I agree with both what you and So8res are saying, yet I think you aren’t arguing about the same thing.
You seem to be talking about the dimension of “confidence,” “obviousness,” etc. and arguing that most proponents of AI concern seem to have enough of it, and shouldn’t increase it too much.
So8res seems to be talking about another dimension which is harder to name. “Frank futuristicness” maybe? Though not really.
If you adjust your “frank futuristicness” to an absurdly high setting, you’ll sound a bit crazy. You’ll tell lawmakers “I’m not confident, and this isn’t obvious, but I think that unless we pause AI right now, we risk a 50% chance of building a misaligned superintelligence. It might use nanobots to convert all the matter in the universe into paperclips, and the stars and galaxies will fade one by one.”
But if you adjust your “frank futuristicness” to an absurdly low setting, you’ll end up being ineffective. You’ll say “I am confident, and this is obvious: we should regulate AI companies more because they are less regulated than other companies. For example, the companies which research vaccines have to jump through so many clinical trial hoops, meanwhile AI models are just as untested as vaccines and really, they have just as much potential to harm people. And on top of that we can’t even prove that humanity is safe from AI, so we should be careful. I don’t want to give an example of how exactly humanity isn’t safe from AI because it might sound like sci-fi, so I’ll only talk about it abstractly. My point is, we should follow the precautionary principle and be slow because it doesn’t hurt.”
There is an optimum level of “frank futuristicness” between the absurdly high setting and the absurdly low setting. But most people are far below this optimum level.
Ok; what do you think of Soares’s claim that SB-1047 should have been made stronger and that the connections to existential risk should have been made clearer? That seems probably false to me.
I think that if it were to go ahead, it should have been made stronger and clearer. But this wouldn’t have been politically feasible, and therefore if that were the standard being aimed for it wouldn’t have gone ahead.
This I think would have been better than the outcome that actually happened.
To be honest, I don’t know. All I know is that a lot of organizations seem to be shy about talking about the AI takeover risk, and the endorsements the book got surprised me regarding how receptive government officials are (considering how little cherry-picking they did).
My very uneducated guess is that Newsom vetoed the bill because he was more of a consequentialist/longtermist than the cause-driven lawmakers who passed the bill, so one can argue the failure mode was a “lack of appeal to consequentialist interests.” One might argue “it was passed by cause-driven lawmakers by a wide margin, but got blocked by the consequentialist.” But the cause-driven vs. consequentialist motives are pure speculation; I know nothing about these people aside from Newsom’s explanation...
From my perspective, FWIW, the endorsements we got would have been surprising even if they had been maximally cherry-picked. You usually just can’t find cherries like those.