FYI I am generally good at tracking inside baseball but I understand neither what specific failures[1] you would have wanted to see discussed in an open postmortem nor what things you’d consider to be “improvements” (and why the changes since 2022/04/01 don’t qualify).
I’m sure there were many, but I have no idea what you consider to have been failures, and it seems like you must have an opinion because otherwise you wouldn’t be confident that the changes over the last three years don’t qualify as improvements.
Not sure what Benquo would say, but I think a natural question, when a community of people fails at its goal after ~10 years, is why it failed and what could’ve been done differently. It’s a good opportunity to learn, and expecting to do a postmortem is a good incentive to make sensible choices during the initial period of the work (since you anticipate having to justify them if you fail).
I think that the CFAR one is more natural, because it seems to me that MIRI set itself a great scientific challenge with an externally imposed deadline, whereas CFAR did not have an external deadline on developing an art of rationality (which is also a very difficult problem). So CFAR is more naturally a case where the locus of control was internal, and a postmortem can be accurate.
(I’m interested in this topic because I have myself considered trying to put together a public retro on the past ten years of save-the-world efforts from this scene.)
The main effects of the sort of “AI Safety/Alignment” movement Eliezer was crucial in popularizing have been OpenAI, which Eliezer says was catastrophic, and funding for “AI Safety/Alignment” professionals, whom Eliezer believes to predominantly be dishonest grifters. This doesn’t seem at all like what he or his sincere supporters thought they were trying to do.
I’ve written extensively on this sort of perverse optimization, but I don’t see either serious public engagement with my ideas here, or a serious alternative agenda.
For instance, Ben Pace is giving me approving vibes but it didn’t occur to him to respond to your message by talking about the obvious well-publicized catastrophic failures I mentioned in the OP. And it seems like you forgot about them too by the time you wrote your comment.
I am sympathetic to your takes here, but I am not that sympathetic to statements like this:
but I don’t see either serious public engagement with my ideas here, or a serious alternative agenda.
As it happens, I have also written many tens of thousands of words about this in many comments across LW and the EA Forum. I also haven’t seen you engage with those things! (And my guess, from the way you are phrasing this, is that you are not aware of them.)
Like, man, I do feel like I resonate with the things that you are saying, but it just feels particularly weird to have you show up and complain that no one has engaged with your content on this, while having that exact relationship to approximately the people you are talking to. I, the head admin of LessWrong, have actually spent on the order of many hundreds of hours, maybe 1000+, on doing postmortem-ish things in the space, or at least calling for them. I don’t know whether you think what I did/do makes any sense, but I think there is a real attempt at the kind of thing you are hoping for (to be clear, mostly ending in a kind of disappointment and a resulting distancing from much of the associated community, but it’s not like you can claim a better track record here).
And in contrast to your relationship with my content, I have read your content and have engaged with it a good amount. You can read through my EA Forum comments and LW comments on the topic if you want to get a sense of how I think about these things.
I’m aware that you’ve complained about these problems, but I’m specifically calling for the development and evaluation of explanatory models, which is a different activity. If you’ve done much of that in your public writing I missed it—anything you’d like to point me to?

I have tried to do that, though it’s definitely more dispersed. Most of it is still in comments and so a bit hard to extract, but one post I did write about this was My tentative best guess on how EAs and Rationalists sometimes turn crazy.

That does seem like it’s overtly concerned with developing an explanation, but it seems concerned with deviance rather than corruption, so it’s on a different topic than the ones I complain about in the OP. I was aware of that one already, as I replied with a comment at the time.
OK so the reason I linked AGAIN to multiple posts I’d written on the topic here is that you keep writing as though you had not read those posts and were unaware of their content. Repeatedly claiming that they’re helpful and that you’re contributing to the conversation is not the same thing as actually being informed by them and revealing that information, or actually contributing to the conversation. Rehearsing the basic observation that there’s adversariality and implying that it’s unexplained without engaging with the available explanations on offer is lame, boring, and dishonest.
Come on, what? Look, I like your posts, and found them helpful for thinking about this stuff, but I do not consider them anywhere close to the level where they “resolved” the relevant question, and where any new post I write would need to explicitly address them.
Like, if my thinking here was highly derivative of yours, sure, but it is not!
Look, you can have your own takes on whether I am contributing to the conversation, but what on earth is “dishonest” about me trying my best to write up my models here? I certainly do not owe you engagement with your models in order to think about this, and if doing so results in you trying to somehow prosecute me for dishonesty, then I would rather not read them! I like them, I respect you as a thinker, but they are not like foundational or crucial to my understanding of this space, and I do not consider them common-knowledge among the audience I talk to.
Rehearsing the basic observation that there’s adversariality
None of my posts are about this. Failing to actually read my posts and claiming that you are contributing to the conversation is not actually the same as being informed by them and revealing that information, or actually contributing to the conversation, as you would say.
Which to be clear, is totally fine. You don’t owe me reading my posts. I don’t think it’s particularly lame, boring, or dishonest.
I do wish you engaged in a way where we had a chance of having an actual real conversation, because I again do like your writing on this, but it’s fine by my lights if we can’t. If, by your lights, me reading your posts and finding them valuable somehow produces dishonesty in my writing whenever they don’t influence it in just the right way, let me know and I can stop.

Too many non sequiturs here for me to have any idea what an earnest object-level response would even be.

Sure, we don’t need to engage further here. I could restate or try to clarify, but my sense is you are not very hopeful or excited about investing effort to understand (which IDK, is fine though feels a bit lame from my perspective, but not majorly so).
And it seems like you forgot about them too by the time you wrote your comment.
It was not clear from your comment which particular catastrophic failures you meant (and in fact it’s still not clear to me which things from your post you consider to be in that particular class of “catastrophic failures”, which of them you attribute at least partial responsibility for to MIRI/CFAR, by what mechanisms/causal pathways, etc).
ETA: “OpenAI existing at all” is an obvious one, granted. I do not think EY considers SBF to be his responsibility (reasonable, given SBF’s intellectual inheritance from the parts of EA that were least downstream of EY’s thoughts). You don’t mention other grifters in your post.