Strong disagree voted. To me this is analogous to saying that, given that Leonardo da Vinci tried to design a flying machine and believed this to be possible, despite not really understanding aerodynamics, that the Wright brothers believing the aeroplane they designed would fly “can’t really be based on those technical details in any deep or meaningful way.”
“Maybe a thing smarter than humans will eventually displace us” is really not a very complicated argument, and no one is claiming it is. So it should be part of our hypothesis class, and various people like Turing thought of it well before modern ML. The “rationally grounded in a technical understanding of today’s deep learning systems” part is about how we update our probabilities of the hypotheses in our hypothesis class, and how we can comfortably say “yes, terrible outcomes still seem plausible”, as they did on priors, without needing to look at AI systems at all (my probability is moderately lower than it would have been without looking at AIs at all, but with massive uncertainty).
Intuition and rigour agreeing is not some kind of highly suspicious gotcha.
“Maybe a thing smarter than humans will eventually displace us” is really not a very complicated argument, and no one is claiming it is. So it should be part of our hypothesis class, and various people like Turing thought of it well before modern ML.
This is a claim about what is possible, but I am talking about what people claim is probable. If the core idea of “AI doomerism” is that AI doom is merely possible, then I agree: little evidence is required to believe the claim. In this case, it would be correct to say that someone from the 19th century could indeed have anticipated the arguments for AI doom being possible, as such a claim would be modest and hard to argue against.
Yet a critical component of modern AI doomerism is not merely about what’s possible, but what is likely to occur: many people explicitly assert that AI doom is probable, not merely possible. My point is that if the core reasons supporting this stronger claim could have been anticipated in the 19th century, then it is a mistake to think that the key cruxes generating disagreement about AI doom hinge on technical arguments specific to contemporary deep learning.
The way I think about it, you should have a prior distribution over doom vs. no doom, and then getting a bunch of info about current ML should update that. In my opinion, it is highly unreasonable to have a very low prior on “thing smarter than humans successfully acts significantly against our interests”; you should generally be highly uncertain and view this as high variance.
So I guess the question is how many people who think doom is very unlikely just start from a really low prior but agree with me on the empirical updates, versus start from some more uncertain prior but update a bunch downwards on empirical evidence, or at least on reasoning about the world. Like: companies are rational enough that they just wouldn’t build something dangerous, it’ll be easy to test for, and they’ll do this testing. Or: historically, we’ve solved issues with technology before they arose, so this will be fine. Or: we can just turn it off if something goes wrong. I would consider even the notion that there exists the ability to turn it off as using information that someone would not have had in the 19th century.
My guess is that most reasonable people with low p(doom), who are willing to actually engage with probabilities here, start at 5% or higher but just update down a bunch, for reasons I tend to disagree with or consider wildly overconfident. But maybe you’re arguing that the disagreement now stems from priors?
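To make the prior-plus-update framing concrete, here is a minimal sketch using the odds form of Bayes’ rule. All of the numbers (the priors and the likelihood ratio assigned to evidence from modern ML) are made-up illustrations rather than anyone’s actual estimates; the point is only that the same evidence moves different priors to quite different posteriors, which is why both the prior and the empirical update matter.

```python
# Minimal sketch of the "prior plus empirical update" framing, using the odds
# form of Bayes' rule. Every number below is a made-up illustration.

def update(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability by a likelihood ratio (odds form of Bayes' rule)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical priors on "a thing smarter than humans acts against our interests".
priors = [0.05, 0.30, 0.50]

# Hypothetical likelihood ratio for the evidence from observing modern ML
# (values below 1 treat the evidence as mildly reassuring, above 1 as worrying).
evidence_lr = 0.5

for p in priors:
    print(f"prior {p:.0%} -> posterior {update(p, evidence_lr):.0%}")
```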
You strong disagree downvoted my comment, but it’s still not clear to me that you actually disagree with my core claim. I’m not making a claim about priors, or whether it’s reasonable to think that p(doom) might be non-negligible a priori.
My point is instead about whether the specific technical details of deep learning today are ultimately what’s driving some people’s high probability estimates of AI doom. If the intuition behind these high estimates could’ve been provided in the 19th century (without modern ML insights), then modern technical arguments don’t seem to be the real crux.
Therefore, while you might be correct about priors regarding p(doom), or whether existing evidence reinforces high concern for AI doom, these points seem separate from my core claim about the primary motivating intuitions behind a strong belief in AI doom.
(To clarify, I strong disagree voted, I haven’t downvoted at all—I still strongly disagree)
I am confused and feel like I must be misunderstanding your point. It feels like you’re attempting a “gotcha” argument, but I don’t understand your point or who you’re trying to criticize. It seems like bizarre rhetorical practice. It is not a valid argument to say that “people can hold position A for bad reason X, therefore all people who hold position A also hold it for bad reason X even if they claim it is for good reason Y”. But that seems to be your argument? For A=high doom, X=weird 19th century intuition, Y=actually good technical reasons grounded in modern ML. What am I missing? If you want to argue that someone else really believes bad reason X, you need to engage with specific details of that person and why you believe they are saying false things about their beliefs.
I could easily flip this argument. In the 19th century, I’m sure people said machines could never possibly be dangerous—“God will protect us”, or “They are tools, and tools are always subservient to man”, or “They will never have a soul, and so can never be truly dangerous”. This is a raw, intuition-backed argument. People today who claim to believe that AI will be safe for sophisticated technical reasons could have held these same beliefs in the 19th century, which suggests they are being dishonest. Why does your argument hold, but mine break?
I also don’t actually know which people you want to criticize. My sense is that many community members with high p(doom), like Yudkowsky, developed these views 10-20 years ago and haven’t substantially updated since, so those views obviously can’t come from nuanced takes on modern ML. As far as I am aware, they don’t claim their beliefs are heavily driven by sophisticated technical reasons about current ML systems—they simply maintain their existing views. It still seems a strawman to call views formed without specific technical grounding “raw intuition-backed reactions to the idea of mechanical minds”. Like, regardless of how much you agree, “Superintelligence” clearly makes a much more sophisticated case than you imply, while predating deep learning.
I’m not actually aware of anyone who claims to be afraid of just current ML systems due to specific technical reasons. The reasons for being afraid are pretty obvious, but there are very specific facts about these systems that can adjust how strong those reasons are. Now that modern deep learning exists, some of these concerns seem validated, while others seem less significant, and new issues have arisen. This seems completely normal and exactly what you would expect? My personal view is that we should be moderately but not extremely concerned about doom. I understand modern machine learning well, and it hasn’t substantially shifted my position in either direction. The large language model paradigm somewhat increased my optimism about safety, while the shift toward long-horizon RL somewhat increased my concern about doom, though this development was expected eventually.
Can you give some concrete examples of specific people/public statements that you are trying to criticise here? That might help ground out this disagreement.
I am confused and feel like I must be misunderstanding your point. It feels like you’re attempting a “gotcha” argument, but I don’t understand your point or who you’re trying to criticize. It seems like bizarre rhetorical practice. It is not a valid argument to say that “people can hold position A for bad reason X, therefore all people who hold position A also hold it for bad reason X even if they claim it is for good reason Y”. But that seems to be your argument?
I think you’re overinterpreting my comment and attributing to me the least charitable plausible interpretation of what I wrote (along with most other people commenting and voting in this thread). As a general rule I’ve learned from my time in online communities: whenever someone makes a claim on an online forum that rejects a belief central to that forum’s philosophy, people tend to reply by ruthlessly assuming the most foolish plausible interpretation of their remarks. LessWrong is no exception.
My actual position is simply this: if the core arguments for AI doom could genuinely have been presented and anticipated in the 19th century, then the crucial factor that actually determines whether most “AI doomers” believe in AI doom is probably something relatively abstract or philosophical, rather than specific technical arguments grounded in the details of machine learning. This does not imply that technical arguments are irrelevant; it just means they’re probably not as cruxy to whether people actually believe that doom is probable or not.
(Also to be clear, unless otherwise indicated, in this thread I am using “belief in AI doom” as shorthand for “belief that AI doom is more likely than not” rather than “belief that AI doom is possible and at least a little bit plausible, so therefore worth worrying about.” I think these two views should generally be distinguished.)
(To clarify, I strong disagree voted, I haven’t downvoted at all—I still strongly disagree)
Oops, I recognize that, I just misstated it in my original comment.
Thanks for clarifying. I’m sorry you feel strawmanned, but I’m still fairly confused.
Possibly the confusion is that you’re using AI doom to mean >50%? I personally think that it is not very reasonable to get that high based on conceptual arguments someone in the 19th century could understand, and definitely not >90%. But getting to >5% seems totally reasonable to me. I didn’t read this post as arguing that you should be >50% back in the 19th century, though I could easily imagine a given author being overconfident. And the specific technical details of ML are totally enough of an update to bring you above or below 50%, so this matters. I personally do not think there’s a >50% chance of doom, but am still very concerned.