Benquo comments on Three Missing Cakes, or One Turbulent Critic?

Benquo 14 Jul 2025 0:15 UTC
10 points
3
The former. There should have been a lot of open-minded postmortem discussion, not just a pivot.
- Vaniver 16 Jul 2025 4:42 UTC
  6 points
  0
  Parent
  Hmm. I HMCFed after that, I think, but I don’t remember why I didn’t talk much about it publicly. (Also I think there was a CFAR postmortem that I don’t recall getting written up and discussed online, tho there was lots of in-person discussion.)
  - Benquo 16 Jul 2025 16:48 UTC
    12 points
    1
    Parent
    I remember being told by an attendee that I hadn’t been invited to a CFAR postmortem weekend because I’m “not a STEM person.” Since I did statistical programming professionally for five years and have an MS in Mathematics and Statistics from an elite-ish university, I can only interpret that as meaning I’m unwilling to use technical math to dissociate from substantive humanities problems.
    I have to rate all the time spent that didn’t result in improvements visible from the outside as nothing but costs paid to sustain internal narcissistic supply; I can’t credit it as an attempt to solve or even discuss a problem unless I receive further evidence. The uniformly positive things I’ve heard about “Don’t Create the Torment Nexus II: If Anyone Builds It, Everyone Dies” implies not much in the way of new perspective or even consensus that one is needed.
    - Vaniver 18 Jul 2025 15:58 UTC
      9 points
      2
      Parent
      I can’t comment on why you weren’t invited [to the CFAR postmortem], because I was not involved with the decision-making for who would be invited; I just showed up to the event. Naively, I would’ve guessed it was because you didn’t work at CFAR (unless you did and I missed it?); I think only one attendee wasn’t in that category, for a broad definition of ‘work at’.
      I have to rate all the time spent that didn’t result in improvements visible from the outside as nothing but costs paid to sustain internal narcissistic supply
      This seems fair to me.
      The uniformly positive things I’ve heard about “Don’t Create the Torment Nexus II: If Anyone Builds It, Everyone Dies” implies not much in the way of new perspective or even consensus that one is needed.
      I think the main difference between MIRI pre-2022 and post-2022 is that pre-2022 had much more willingness to play along with AI companies and EAs, and post-2022 is much more willing to be openly critical.
      There are other differences, and also I think we might be focusing on totally different parts of MIRI. Would you care to say more about where you think there needs to be new perspective?
      - Benquo 18 Jul 2025 18:30 UTC
        8 points
        4
        Parent
        If the transition from less to more disagreeableness doesn’t come along with an investigation of why agreeableness seemed like a plausible strategy and what was learned, then we’re still stuck trying to treat an adversary as an environment.
        Vaniver 19 Jul 2025 0:56 UTC
        2 points
        0
        Parent
        I think I agree with your statement; I assume that this happened, though? Or, at least, in a mirror of the ‘improvements visible from the outside’ comment earlier, the question is whether MIRI is now operating in a way that leads to successfully opposing their adversaries, rather than whether they’ve exposed their reasoning about this to the public.
      - Benquo 18 Jul 2025 18:31 UTC
        4 points
        0
        Parent
        Naively, I would’ve guessed it was because you didn’t work at CFAR (unless you did and I missed it?)
        The attendee who told me about it never worked at CFAR, and neither did a couple other people I knew who went. Also I did guest-instruct at a CFAR workshop once.
    - RobertM 17 Jul 2025 5:00 UTC
      3 points
      1
      Parent
      FYI I am generally good at tracking inside baseball but I understand neither what specific failures^[1] you would have wanted to see discussed in an open postmortem nor what things you’d consider to be “improvements” (and why the changes since 2022/04/01 don’t qualify).
      ^
      I’m sure there were many, but I have no idea what you consider to have been failures, and it seems like you must have an opinion because otherwise you wouldn’t be confident that the changes over the last three years don’t qualify as improvements.
      - Ben Pace 17 Jul 2025 5:37 UTC
        8 points
        2
        Parent
        Not sure what Benquo would say, but I think a natural question when a community of people fails at its goal after ~10 years, is to ask why it failed and what could’ve been done differently. It’s a good opportunity to learn, and it’s a good incentive to expect to do it so that you make sensible choices during the initial period that you’re doing the work (to expect to have to justify them if you fail).
        I think that the CFAR one is more natural, because it seems to me that MIRI set itself a great scientific challenge with an externally imposed deadline, whereas CFAR did not have the external deadline on developing an art of rationality (which is also a very difficult problem). So it’s more naturally a place where the locus of control was within yourself, and a postmortem will be able to be accurate.
        (I’m interested in this topic because I have myself considered trying to put together a public retro on the past ten years of safe-the-world efforts from this scene.)
      - Benquo 17 Jul 2025 14:00 UTC
        3 points
        0
        Parent
        The main effects of the sort of “AI Safety/Alignment” movement Eliezer was crucial in popularizing have been OpenAI, which Eliezer says was catastrophic, and funding for “AI Safety/Alignment” professionals, whom Eliezer believes to predominantly be dishonest grifters. This doesn’t seem at all like what he or his sincere supporters thought they were trying to do.
        I’ve written extensively on this sort of perverse optimization, but I don’t see either serious public engagement with my ideas here, or a serious alternative agenda.
        For instance, Ben Pace is giving me approving vibes but it didn’t occur to him to respond to your message by talking about the obvious well-publicized catastrophic failures I mentioned in the OP. And it seems like you forgot about them too by the time you wrote your comment.
        habryka 18 Jul 2025 2:40 UTC
        8 points
        9
        Parent
        I am sympathetic to your takes here, but I am not that sympathetic to statements like this:
        but I don’t see either serious public engagement with my ideas here, or a serious alternative agenda.
        As it happens I also happen to have written many tens-of-thousands of words about this in many comments across LW and the EA Forum. I also haven’t seen you engage with those things! (and my guess is the way you are phrasing it suggests you are not aware of them)
        Like, man, I do feel like I resonate with the things that you are saying, but it just feels particularly weird to have you show up and complain that no one has engaged with your content on this, while having that exact relationship to approximately the people you are talking to. I, the head admin of LessWrong, have actually spent on the order of many hundreds of hours, maybe 1000+ hours on doing postmortem-ish things in the space, or at least calling for them. I don’t know whether you think what I did/do makes any sense, but I think there is a real attempt of the kind of thing you are hoping for (to be clear, mostly ending with a kind of disappointment and resulting distancing from much of the associated community’s, but it’s not like you can claim a better track record here).
        And in contrast to your relationship with my content, I have read your content and have engaged with it a good amount. You can read through my EA Forum comments and LW comments on the topic if you want to get a sense of how I think about these things.
        Benquo 18 Jul 2025 13:55 UTC
        2 points
        0
        Parent
        I’m aware that you’ve complained about these problems, but I’m specifically calling for the development and evaluation of explanatory models, which is a different activity. If you’ve done much of that in your public writing I missed it—anything you’d like to point me to?
        habryka 18 Jul 2025 15:32 UTC
        2 points
        0
        Parent
        I have tried to do that, though it’s definitely more dispersed.
        Most of it is still in comments and so a bit hard to extract, but one post I did write about this was My tentative best guess on how EAs and Rationalists sometimes turn crazy.
        Benquo 18 Jul 2025 18:43 UTC
        7 points
        4
        Parent
        That does seem like it’s overtly concerned with developing an explanation, but it seems concerned with deviance rather than corruption, so it’s on a different topic than the ones I complain about in the OP. I was aware of that one already, as I replied with a comment at the time.
        Benquo 15 Nov 2025 13:16 UTC
        0 points
        −2
        Parent
        OK so the reason I linked AGAIN to multiple posts I’d written on the topic here is that you keep writing as though you had not read those posts and were unaware of their content. Repeatedly claiming that they’re helpful and that you’re contributing to the conversation is not the same thing as actually being informed by them and revealing that information, or actually contributing to the conversation. Rehearsing the basic observation that there’s adversariality and implying that it’s unexplained without engaging with the available explanations on offer is lame, boring, and dishonest.
        habryka 15 Nov 2025 18:54 UTC
        6 points
        2
        Parent
        Come on, what? Look, I like your posts, and found them helpful for thinking about this stuff, but I do not consider them anywhere close to the level where they “resolved” the relevant question, and where any new post I write would need to explicitly address them.
        Like, if my thinking here was highly derivative of yours, sure, but it is not!
        Look, you can have your own takes on whether I am contributing to the conversation, but what on earth is “dishonest” about me trying my best to write up my models here? I certainly do not owe you engagement with your models in order to think about this, and if doing so results in you trying to somehow prosecute me for dishonesty, then I would rather not read them! I like them, I respect you as a thinker, but they are not like foundational or crucial to my understanding of this space, and I do not consider them common-knowledge among the audience I talk to.
        Rehearsing the basic observation that there’s adversariality
        None of my posts are about this. Failing to actually read my posts and claiming that you are contributing to the conversation is not actually the same as being informed by them and revealing that information, or actually contributing to the conversation, as you would say.
        Which to be clear, is totally fine. You don’t owe me reading my posts. I don’t think it’s particularly lame, boring, or dishonest.
        I do wish you engaged in a way where we had a chance of having an actual real conversation, because I again do like your writing on this, but it’s fine by my lights if we can’t. Just let me know if for some reason me reading your posts and finding them valuable must apparently produce dishonesty in my writing by your lights if they don’t influence my writing in just the right way, and I can stop.
        Benquo 15 Nov 2025 21:31 UTC
        0 points
        0
        Parent
        Too many nonsequiturs here for me to have any idea what an earnest object level response would even be.
        Expand this thread
        habryka 15 Nov 2025 22:59 UTC
        4 points
        0
        Parent
        Sure, we don’t need to engage further here. I could restate or try to clarify, but my sense is you are not very hopeful or excited about investing effort to understand (which IDK, is fine though feels a bit lame from my perspective, but not majorly so).
        RobertM 18 Jul 2025 4:45 UTC
        4 points
        2
        Parent
        And it seems like you forgot about them too by the time you wrote your comment.
        It was not clear from your comment which particular catastrophic failures you meant (and in fact it’s still not clear to me which things from your post you consider to be in that particular class of “catastrophic failures”, which of them you attribute at least partial responsibility for to MIRI/CFAR, by what mechanisms/causal pathways, etc).
        ETA: “OpenAI existing at all” is an obvious one, granted. I do not think EY considers SBF to be his responsibility (reasonable, given SBF’s intellectual inheritance from the parts of EA that were least downstream of EY’s thoughts). You don’t mention other grifters in your post.
- lemonhope 22 Jul 2025 9:08 UTC
  4 points
  2
  Parent
  “Death with dignity” was clearly intended to trigger the audience to HMCF right? He was doing exactly what you are asking for
  - sunwillrise 22 Jul 2025 10:29 UTC
    5 points
    0
    Parent
    “Trigger the audience into figuring out what went wrong with MIRI’s collective past thinking and decision-making” would be a strange purpose from a post written by the founder of MIRI, its key decision-maker, and a long-time proponent of secrecy in how the organization should relate to outsiders (or even how members inside the organization should relate to other members of MIRI).
    - Ben Pace 22 Jul 2025 19:09 UTC
      8 points
      6
      Parent
      Not disagreeing with your point, just want to add the datapoint that for me it did lead me to something like “giving up faith in” MIRI. I no longer believed that they were working on a plan for getting the problem solved, and so I resigned myself to the world where I had to take responsibility for the problem getting solved.