Wei Dai comments on Relitigating the Race to Build Friendly AI

Wei Dai 6 Dec 2025 2:02 UTC
LW: 3 AF: 3
1
AF

I think no one should build AGI. If someone is going to build AGI anyway, then it might be correct to make AGI yourself first, if you have a way to make actually aligned (hopefully task-ish or something).

If Eliezer or MIRI as a whole had said something like this, especially the first part “I think no one should build AGI.” while pursuing their plans, I would be more tempted to give them a pass. But I don’t recall them saying this, and a couple of AIs I asked couldn’t find any such statements (until after their latest pivot).

Also I wouldn’t actually endorse this statement, because it doesn’t take into account the human tendency/bias to think of oneself as good/careful and others as evil/reckless.

I’m still not sure where you’re getting this?

Eliezer claiming to have solved metaethics. Saying that he wouldn’t “flinch from” trying to solve all philosophical problems related to FAI by himself. (man, it took me 30-60 minutes to find this link) Being overconfident on other philosophical positions like altruism and identity.

If there was (by great surprise) some amazing pile of insights that made a safe Task-AGI seem feasible, and that stood up to comprehensive scrutiny (somehow), then it would plausibly be a good plan to actually do.

I would be more ok with this (but still worried about unknown unknowns) if “comprehensive scrutiny” meant scrutiny by thousands of world-class researchers over years/decades with appropriate institutional design to help mitigate human biases (e.g., something like academic cryptography research + NIST’s open/public standardization process for crypto algorithms). But nothing like this was part of MIRI’s plans, and couldn’t be because of the need for speed and secrecy.
- TsviBT 6 Dec 2025 5:06 UTC
  LW: 2 AF: 1
  0
  AF Parent
  Ok. I think I might bow out for now unless there’s something especially salient that I should look at, but by way of a bit of summary: I think we agree that Yudkowsky was somewhat overconfident about solving FAI, and that there’s a super high bar that should be met before making an AGI, and no one knows how to meet that bar; my guess would be that we disagree about
  1. the degree to which he was overconfident,
  2. how high a bar would have to be met before making an AGI, in desperate straits.
  - Wei Dai 6 Dec 2025 5:42 UTC
    LW: 3 AF: 3
    1
    AF Parent
    Definitely read the second link if you haven’t already (it’s very short and salient), but otherwise, sure.
    - TsviBT 6 Dec 2025 5:45 UTC
      LW: 2 AF: 1
      0
      AF Parent
      (I did read that one; it’s interesting but basically in line with how I think he’s overconfident; it’s possible one or both of us is incorrectly reading in / not reading in to what he wrote there, about his absolute level of confidence in solving the philosophical problems involved.)
      - Wei Dai 6 Dec 2025 6:02 UTC
        LW: 3 AF: 3
        1
        AF Parent
        Hmm, did you also read my immediate reply to him, where I made the point “if you’re the only philosopher in the team, how will others catch your mistakes?” How to understand his (then) plan except that he would have been willing to push the “launch” button even if there were zero other similarly capable philosophers available to scrutinize his philosophical ideas?
        TsviBT 6 Dec 2025 6:56 UTC
        LW: 5 AF: 4
        3
        AF Parent
        (Also just recording that I appreciate the OP and these threads, and people finding historical info. I think the topic of how “we” have been going wrong on strategy is important. I’m participating because I’m interested, though my contributions may not be very helpful because
        1. I was a relative latecomer, in that much of the strategic direction (insofar as that existed) had already been fixed and followed;
        2. I didn’t especially think about strategy that much initially, so I didn’t have many mental hooks for tracking what was happening in the social milieu in terms of strategic thinking and actions.)
        TsviBT 6 Dec 2025 6:17 UTC
        LW: 3 AF: 1
        −1
        AF Parent
        (Oh I hadn’t read the full thread, now I have; still no big update? Like, I continue to see him being seemingly overconfident in his ability to get those solutions, but I’m not seeing “oh he would have mistakenly come to think he had a solution when he didn’t”, if that’s what you’re trying to say.)