Addendum: Why Am I Writing This?
Because other people asked me to. I don’t particularly like getting into fights over the usefulness of other people’s research agendas; it’s stressful and distracting and a bunch of work, I never seem to learn anything actually useful from it, it gives me headaches, and nobody else ever seems to do anything useful as a result of such fights. But enough other people seemed to think this was important to write, so Maybe This Time Will Be Different.
I think it’s useful and would be surprised if discussions like this post weren’t causing people to improve their models and change their behaviors.
I would guess a large fraction of the potential value of debating these things comes from its impact on people who aren’t the main proponents of the research program, but are observers deciding on their own direction.
Is that priced in to the feeling that the debates don’t lead anywhere useful?
It’s usually the case that online conversations aren’t for persuading the person you’re talking to, they’re for affecting the beliefs of onlookers.
That’s where most of the uncertainty is; I’m not sure how best to price it in (though my gut has priced in some estimate).
It helps me decide which research to focus on.
Thank you for writing this, John.
It’s critical to pick good directions for research. But fighting about it is not only exhausting, it’s often counterproductive—it can make people tune out “the opposition.”
In this case, you’ve been kind enough about it, and the community here has good enough standards (amazing, I think, relative to the behavior of the average modern hominid) that one of the primary proponents of the approach you’re critiquing started his reply with “thank you”.
This gives me hope that we can work together and solve the several large outstanding problems.
I often think of writing critiques like this, but I don’t have the standing with the community for people to take them seriously. You do.
So I hope this one doesn’t cause you headaches, and thanks for doing it.
Object-level discussion in a separate comment.
I think criticisms from people without much of a reputation are often pretty well-received on LW, e.g. this one.
That’s a good example. LW is amazing that way. My previous field of computational cognitive neuroscience, and its surrounding fields, did not treat challenges with nearly that much grace or truth-seeking.
I’ll quit using that as an excuse to not say what I think is important—but I will try to say it politely.
Writing (good) critiques is, in fact, a way many people gain standing. I’d push back on the part of you that thinks all of your good ideas will be ignored (some of them probably will be, but not all of them; don’t know until you try, etc).
I’m not worried about my ideas being ignored so much as actively doing harm to the group epistemics by making people irritated with my pushback, and by association, irritated with the questions I raise and therefore resistant to thinking about them.
I am pretty sure that motivated reasoning does that, and it’s a huge problem for progress in existing fields. More here: Motivated reasoning/confirmation bias as the most important cognitive bias
LessWrong does seem way less prone to motivated reasoning. I think this is because rationalism demands actually being proud of changing your mind. This value provides resistance but not immunity to motivated reasoning. I want to write a post about this.
If you wrote this exact post, it would have been upvoted enough for the Redwood team to see it, and they would have engaged with you similarly to how they engaged with John here (modulo some familiarity, because these people all know each other at least somewhat, and in some pairs quite well).
If you wrote several posts like this that were of some quality, you would lose the ability to appeal to your own standing as a reason not to write a post.
This is all I’m trying to transmit.
[edit: I see you already made the update I was encouraging, an hour after leaving the above comment to me. Yay!]
Here is an additional reason why it might seem less useful than it actually is: maybe the people whose research direction is being criticized do process the criticism and change their views, but don’t publicly show that they have changed their mind because it seems embarrassing. It may also take them some time to change their mind, and by then there is a bigger hurdle to letting you know that you were responsible, so they keep it to themselves. Or maybe they themselves aren’t aware that you were responsible.
AI alignment research, like other types of research, reflects a potentially quite dysfunctional dynamic: researchers doing supposedly important work receive funding from convinced donors, which raises the status of those researchers, which makes their claims more convincing, and those claims in turn reinforce the idea that the researchers are doing important work. I don’t know a good way around this problem. But personally I am far more skeptical of this stuff than you are.