If this is meant to be a characterization of my past actions (or those of any other CFAR team member, for that matter), I disagree with it. I did and do feel a duty of care. When I had particular agendas, about eg AI safety recruiting, that were relevant to my interactions with a particular participant, I generally shared them with that participant. The thing I tried to describe as a mistake, and to change, was about an orientation to “narrative syncing” and general community setup; it was not about the deontology owed to CFAR participants as individuals.
FWIW, this broadly matches my own experience of working with Anna and participating in CFAR workshops.
There were tensions in how to relate to participants at AIRCS workshops, in particular.
These were recruitment programs for MIRI, and this was extremely explicit: it was stated on the website, and I believe (though Buck could confirm) that all or most of the participants did a technical interview before they were invited to a workshop.
The workshops were part of an extended interview process. It was a combination of 1) the staff assessing the participants, 2) the participants assessing the staff, and 3) (to some extent) enculturating the participants into MIRI/rationalist culture.
However, the environment was dramatically less formal and more vulnerable than most job interviews: about a fourth of the content of the workshops was Circling, for instance.
This meant that workshop staff were both assessing the participants and assessing their fit-to-the-culture, while also aiming to be helpful to them and their personal development by their own lights, including helping them untangle philosophical confusions or internal conflicts.
These goals were not incompatible, but they were sometimes in tension. It could feel callous to spend a few days having deep personal conversations with someone, talking with them and trying to support them, and then later, in a staff meeting, to come relatively quickly to the judgement that they didn’t make the cut.
This was a tension that we were aware of and discussed at the time. I think we overall did a good job of navigating it.
This was a very weird environment, by normal professional standards. But to my knowledge, there was no incident in which we failed to do right by an AIRCS participant, exploited them, or treated them badly.
The majority of people who came had a good experience, regardless of whether they eventually got hired by MIRI. Of those who did not have a good experience, I believe these were predominantly (possibly entirely?) people who felt that the workshop was a waste of time, rather than that they had actively been harmed.
I would welcome any specific information to the contrary. I could totally believe that there was stuff that I was unaware of, or subtle dynamics that I wasn’t tracking at the time, but that I would conclude were fucked up on reflection, if it was pointed out to me. I can only speak from my personal perspective, not make a blanket claim about what happened.
But as it is, I don’t think we failed to discharge our deontological duty towards any participants.
People that disclose a conflict of interest usually aren’t any less biased in practice than people who don’t disclose the same conflict, even though they’re generally perceived as more trustworthy. :/
That may be. I made my comment in reply to a previous version of Duncan’s comment (he edited after) which IIRC said specifically that I didn’t disclose conflicts of interest, and [some phrase I don’t recall, that I interpreted as, that I had said I wasn’t even trying to treat participants according to a certain standard of care]. That is false, and is the kind of thing I can know, so I commented with it. I don’t mean to imply that disclosing a conflict makes a person neutral on the subject, or that first person judgments of intent are reliable.
I did spend a good deal of time since 2020 discussing errors from my past CFAR stuff, and ways I and we may have harmed things we care about, but it was more along the lines of [doing stuff that is kinda common, but that messes with the epistemics of groups, and of individuals who sync with groups, a la Tsvi’s comment], not [skipping basic attempts to be honest, kind, respectful of expectations about confidentiality, mindful of workshop participants’ well-being and of likely impacts on it, etc]. I agree with Tsvi that the epistemics-while-in-groups stuff is tricky for many groups, and is not much amenable to patching, which is more or less why this CFAR reboot took me five years, why I tested my guesses about how to do better in a couple of smaller/easier group workshop contexts first, and why I’m calling these new CFAR workshops “pilots” and reserving the option to quit if the puzzles seem too hard for us. Tsvi, I think, remains mostly pessimistic about my new plans, and I don’t, so there’s that.
Regardless:
Attending a workshop where groups of people think together about thinking, and practice new cognitive habits, while living together for five days and talking earnestly about a bunch of life stuff that one doesn’t normally discuss with near-strangers… is indeed a somewhat risky activity, compared to eg attending a tennis-playing weekend or something else less introspective. Many will come out at least a little changed, and in ways that aren’t all deliberate. It’s worth knowing this, and thinking on whether one wants this.
A large portion of past CFAR participants said they were quite glad they came, including months and years afterward; and I suspect it was probably good for people on net (particularly people who passed through briefly, and retained independent cultural influences, I think?); but I also suspect there were a bunch of people who experienced subtle damage to parts of their skills and outlook and aesthetics in ways that were hard for them and us to track. And I expect some of that going forward, too, although I hope to make this less frequent via:
- respecting individuals more deeply
- having a plurality of frameworks that includes eg [falsification/feedback loops] and [pride in craftsmanship] and other stuff discussed in my OP, rather than only [Bayesian updating + agentiness/goal-achievement]
- having a “hobbyist convention” vibe where our guests are fellow hobbyists and can bring their own articulate or inarticulate frameworks
- not myself being in “emergency mode” around AI risk (being instead in something closer to a “geek out with people and try to be helpful and see where things go” mode, rather than a super-goal-oriented mode), which I think should be better for not losing “peripheral vision” or “inarticulate-but-important bits of perception.”
One should not expect the CFAR alumni community to be only made of trustworthy carefully vetted people. We plan not to accept people who we suspect have unusually bad character; but I’m not that good at telling, and I don’t know that we as a team are, either. Also, there’s a question of false-negatives vs false-positives here, and I don’t plan to be maximally risk averse, although I do plan to try some; guests and alumni should keep in mind, when interacting with any future alumni community, that strangers vary in their trustworthiness.
I’m sometimes fairly skilled at seeing the gears inside people’s minds, especially when people try to open up to me; and when things are going well this can be useful to all parties. But I’ve harmed people by trying to dialog with bits of their mind that weren’t set up for navigating outside pressures. This was mostly not with mainline workshop participants; mostly in cases where I didn’t understand ways they were different from me, so that moves that would’ve been okay on me were worse for them; and mostly in contact with people’s untempered urge to be relevant to AI safety, which created a lot of fear/drive that might’ve been bumpy without me too (eg, folks who felt “I must continue to work at CFAR or MIRI or [wherever] or else my life won’t matter,” and so weren’t okay with the prospect of maybe-losing a job that most people would’ve quit because of the pain or difficulty of it). I do think I’m better at not causing harm in this way now (via chilling out in general, via somewhat better models of how some non-me people work, and via the slow accumulation of common sense), but whether other people have grounds to believe me about this varies.
Is the above bad enough that the world would be better off without CFAR re-opening and offering workshops? IMO, no. CFAR ran sixty-ish multi-day events from 2012 to 2020, with close to two thousand participants; some things went badly wrong; many things were meh compared to our hopes (our rationality curriculum is fairly cool, but didn’t feed back its way to superpowers); many things went gloriously right (new friendships; new businesses; more Bay Area rationality community, in which many found families or other things they wanted; many alumni who tell me they learned, at the workshop, how to actively notice and change personal habits or parts of their lives that weren’t working for them). 2025 is somehow a time when many organizations and community structures have shut down; I think there’s something sad about that, and I don’t want CFAR’s to be one of them.
It seems good to me that people, including Duncan, want to tell their friends and family their views. (Also including people in top-level comments below this one who want to share positives; those are naturally a bit easier for me personally to enjoy.) A cacophony of people trying to share takes and info seems healthy to me, and healthier than a context where people-with-knowledge are pressured never to share negatives (or where people-with-knowledge who have positives keep quiet about those in case CFAR is actually bad and they trick people).
I hope for relatively relaxed narrative structures, both about CFAR and broadly, where people’s friends can help them make sense of whatever they are seeing and experiencing, and can help them get info they might want, in public view where sensible, without much all-or-nothingness. (I don’t mean from Duncan, whose honest take here is extremely negative, and who deserves to have that tracked; but from the mixture of everyone.)
Just noting for the audience that the edits which Anna references in her reply to CronoDAS, as if they had substantively changed the meaning of my original comment, were to add:
- The phrase “directly observed”
- The parenthetical about having good epistemic hygiene with regards to people’s protestations to the contrary
- The bit about agendas often not being made explicit
It did not originally specify undisclosed conflicts of interest in any way that the new version doesn’t. Both versions contained the same core (true) claim: that multiple of the staff members common to both CFAR!2017 and CFAR!2025 often had various (i.e. not only the AI stuff) agendas which would bump participant best interests to second, third, or even lower on the priority ladder.
I’ve also added, just now, a clarifying edit to a higher comment: “Some of these staff members are completely blind to some centrally important axes of care.” This seemed important to add, given that Anna is below making claims of having seen, modeled, and addressed the problems (a refrain I have heard from her, directly, in multiple epochs, and taken damage from naively trusting more than once). More (abstract, philosophical) detail on my views about this sort of dynamic here.
> given that Anna is below making claims of having seen, modeled, and addressed the problems
I think I am mostly saying that I don’t agree that there were ever problems of the sort you are describing, w.r.t. standard of care etc. That is: I think I and other CFAR staff were following the basics of standard deontology w.r.t. participants the whole time, and I think the workshops were good enough that it was probably better to be running them the whole time.
I added detail to caveat that and to try to make the conversation less confusing for the few who’re trying to follow it in a high-detail way.