Buck comments on Tim Hua’s Shortform

Buck 22 Dec 2025 6:30 UTC
LW: 13 AF: 10
9
I do judge comments more harshly when they’re phrased confidently—your tone is effectively raising the stakes on your content being correct and worth engaging with.
If I agreed with your position, I’d probably have written something like:
I don’t think this is an important source of risk. I think that basically all the AI x-risk comes from AIs that are smart enough that they’d notice their own overconfidence (maybe after some small number of experiences being overconfident) and then work out how to correct for it.
There are other epistemic problems that I think might affect the smart AIs that pose x-risk, but I don’t think this is one of them.
In general, this seems to me like a minor capability problem that is very unlikely to affect dangerous AIs. I’m very skeptical that trying to address such problems is helpful for mitigating x-risk.
What changed? I think it’s only slightly more hedged. I personally like using “I think” everywhere for the reason I say here and the reason Ben says in response. To me, my version also more clearly describes the structures of my beliefs and how people might want to argue with me if they want to change my mind (e.g. by saying “basically all the AI x-risk comes from” instead of “The kind of intelligent agent that is scary”, I think I’m stating the claim in a way that you’d agree with, but that makes it slightly more obvious what I mean and how to dispute my claim—it’s a lot easier to argue about where x-risk comes from than whether something is “scary”).
I also think that the word “stupid” parses as harsh, even though you’re using it to describe something on the object level and it’s not directed at any humans. That feels like the kind of word you’d use if you were angry when writing your comment, and didn’t care about your interlocutors thinking you might be angry.
I think my comment reads as friendlier and less like I want the person I’m responding to to feel bad about themselves, or like I want onlookers to expect social punishment if they express opinions like that in the future. Commenting with my phrasing would cause me to feel less bad if it later turned out I was wrong, which communicates to the other person that I’m more open to discussing the topic.
(Tbc, sometimes I do want the person I’m responding to to feel bad about themselves, and I do want onlookers to expect social punishment if they behave like the person I was responding to; e.g. this is true in maybe half my interactions with Eliezer. Maybe that’s what you wanted here. But I think that would be a mistake in this case.)
- Jeremy Gillen 23 Dec 2025 17:27 UTC
  11 points
  1
  I am confident about this, so I’m okay with you judging accordingly.
  I appreciate your rewrite. I’ll treat it as something to aspire to, in future. I agree that it’s easier to engage with.
  I was annoyed when writing. Angry is too strong a word for it, though; it’s much more like “Someone is wrong on the internet!”. It’s a valuable fuel and I don’t want to give it up. I recognise that there are a lot of situations that call for hiding mild annoyance, and I’ll try to do it more habitually in future when it’s easy to do so.
  There’s a background assumption that maybe I’m wrong to have. If I write a comment with a tone of annoyance, and you disagree with it, it would surprise me if that made you feel bad about yourself. I don’t always assume this, but I often assume it on LessWrong because I’m among nerds for whom disagreement is normal.
  So overall, I think my current guess is that you’re trying to hold me to standards that are unnecessarily high. It seems supererogatory rather than obligatory.
  - Buck 24 Dec 2025 0:45 UTC
    5 points
    0
    If you wrote a rude comment in response to me, I wouldn’t feel bad about myself, but I would feel annoyed at you. (I feel bad about myself when I think my comments were foolish in retrospect or when I think they were unnecessarily rude in retrospect; the rudeness of replies to me doesn’t really affect how I feel about myself.) Other people are more likely to be hurt by rude comments, I think.
    I wouldn’t be surprised if Tim found your comment frustrating and it made him less likely to want to write things like this in future. I don’t super agree with Tim’s post, but I do think LW is better if it’s the kind of place where people like him write posts like that (and then get polite pushback).
    I have other thoughts here but they’re not very important.
    - Tim Hua 24 Dec 2025 12:42 UTC
      4 points
      0
      (fwiw I agree with Buck that the comment seemed unnecessarily rude and we should probably have less rudeness on LessWrong, but I don’t feel deterred from posting.)