No. The kind of intelligent agent that is scary is the kind that would notice its own overconfidence—after some small number of experiences being overconfident—and then work out how to correct for it.
I mean, the main source of current x-risk is that humans are agents that are capable enough to do dangerous things (like making AI) but too overconfident to notice that doing so is a bad idea, no?
“Overconfident” gets thrown around a lot by people who just mean “incorrect”. Rarely do they mean actual systematic overconfidence. If everyone involved in building AI shifted their confidence down across the board, I’d be surprised if this changed their safety-related decisions very much. The mistakes they are making are more complicated, e.g. some people seem “underconfident” about how to model future highly capable AGI, and are therefore adopting a wait-and-see strategy. This isn’t real systematic underconfidence, it’s just a mistake (from my perspective). And maybe some are “overconfident” that early AGI will be helpful for solving future problems, but again this is just a mistake, not systematic overconfidence.
I think that generally when people say “overconfident” they have a broader class of irrational beliefs in mind than “overly narrow confidence intervals around their beliefs”; things like a bias towards thinking well of yourself can be part of it too.
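To make the narrower sense concrete, here is a minimal toy sketch (the agents, numbers, and helper function are made up for illustration, not from anyone in this thread) of the difference between being systematically overconfident and simply being wrong about some particular question:

```python
# Hypothetical illustration: "systematically overconfident" means stated
# confidence reliably exceeds accuracy across many beliefs, whereas
# "incorrect" just means being wrong about some particular question.
import random

random.seed(0)

def empirical_accuracy(true_prob: float, n: int = 10_000) -> float:
    """Fraction of n independent claims that turn out true, when each claim
    is actually true with probability `true_prob`."""
    return sum(random.random() < true_prob for _ in range(n)) / n

stated_confidence = 0.90

# Agent A is calibrated: claims it asserts at 90% come true about 90% of the
# time. It can still be flatly wrong about one decisive question; that is a
# mistake, not systematic overconfidence.
print("A gap:", stated_confidence - empirical_accuracy(0.90))   # ~0.00

# Agent B is systematically overconfident: claims it asserts at 90% come true
# only ~60% of the time. Shifting its confidences down across the board is
# exactly the correction that would help this pattern.
print("B gap:", stated_confidence - empirical_accuracy(0.60))   # ~0.30
```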
And maybe some are “overconfident” that early AGI will be helpful for solving future problems, but again this is just a mistake, not systematic overconfidence
OK but whatever the exact pattern of irrationality is, it clearly exists simultaneously with humans being competent enough to possibly cause x-risk. It seems plausible that AIs might share similar (or novel!) patterns of irrationality that contribute to x-risk probability while being orthogonal to alignment per se.
Yes, I agree with that.