Sam Marks comments on ryan_greenblatt’s Shortform

Sam Marks 26 Apr 2026 21:04 UTC
61 points
8
Here I’ll reflect on things that make me (an Anthropic employee, though I’m speaking for myself only) engage less on LW than I otherwise would. (Some of these points aren’t LW-specific and also apply to other interactions, e.g. in-person conversations, I have with people in the AI safety community.)
1. Being very busy. Obviously this isn’t something that LW can fix, but it’s the single most important factor. It also compounds with the factors below; e.g. if an interaction will be emotionally taxing or if I’ll need to think carefully about how to navigate confidentiality, then it makes an already-expensive interaction seem less worth it.
  - LW being a public forum makes this somewhat worse, because when I comment on LW, the various people who are waiting on me for something can see I’m commenting on LW instead.
2. Blending of advocacy and object-level discussion. When I engage on LW, it’s typically because I think there’s an important object-level point worth discussing. But once I enter the conversation, it sometimes feels like people stop being curious about the object-level point and instead move into an advocacy mode where their goal is to get Anthropic to act differently in light of their point (which is assumed correct). That is, instead of continuing the object-level discussion, my interlocutors sometimes move to criticize Anthropic or the beliefs of Anthropic staff, or to ask Anthropic (via me) to do something differently.
  - (I especially find this frustrating when my interlocutors implicitly assume that I have the Anthropic “house belief” on some topic when I don’t or feel unsure.)
  - Possible mitigations:
    - LW posters could engage in object-level discussion longer before moving to advocacy.
    - When LW posters want to argue against what they understand to be the Anthropic “house belief,” they could write things like “My understanding is that many Anthropic staff believe something like ‘...’ I think this view wrong because …”
    - LW posters could more clearly flag and separate advocacy from object-level discussion. E.g. writing things like “To be clear about what I view as being the stakes here, I think that if I’m right about this point, it means that Anthropic is making a mistake by …”
3. Some people on LW being (IMO) unnecessarily rude. I sometimes feel like object-level discussions (which are what I’m mainly here for) become tinged with LWers venting frustration about AI developers or trying to apply social censure. This can distract from the object-level discussion and make it taxing to engage.
  - TBC, I think it’s valid/good to apply social censure to AI developers and their employees if you think AI developers are doing something harmful! But tactically:
    - It’s not clear LW is the best venue to do this. (E.g. maybe keep it on X.)
    - Even if it is reasonable to do it on LW, I think it should be kept as separate as possible from scientific/object-level discussions.
  - I really like the way Ryan put it in his post: I think it would be better if criticism of labs on LW were less rude, but not necessarily less hostile. For instance, Ryan often writes posts and comments that are critical of Anthropic but not rude (recent example). I always appreciate these and often find them useful.
(A more general category for points (2) and (3) is that I wish people would just interact with me like a normal person rather than like one of Anthropic’s many faces. But I acknowledge that this is too broad of a principle and that it’s reasonable for people to approach discussions with AI company employees differently than normal conversations in some ways.)
4. Avoiding misunderstandings. I wish that I could just write things that I think are true and not worry so much about being misunderstood. (If someone is confused about what I wrote, they can just ask me to clarify, right?) Unfortunately, this is not the case. Some people might be trying very hard to draw inferences from what I write that go far beyond what I intend and which IMO are not valid inferences. E.g. people might try to infer from the tone of my writing things like “How optimistic is Anthropic about alignment?” Worse, some people might be reading my writing adversarially, trying to pick out specific quotes that sound as bad as possible. This means that I need to write somewhat defensively, both anticipating and proactively clarifying accidental misunderstandings (e.g. adding the list of things I don’t believe to the end of this comment) and rewriting anything that could be intentionally misunderstood by a malicious actor.
  - One nightmare I have is writing something that is misunderstood to be reassuring about Anthropic which people end up feeling badly misled by. For instance, maybe I say something like “Yeah, I think that Anthropic should [do thing]” and people misinterpret this as a soft commitment to [do thing].
5. Confidentiality. Some topics are just tricky to talk about, especially topics around “What should/will Anthropic do?”
Overall, I strongly agree with Ryan that it would be nice for research discussions on LW to follow norms more similar to typical research norms (e.g. those followed on StackExchange). (Though I disagreed with the specific example of typically reaching out privately with critiques instead of just posting a comment.) I think LW can be a weird blend of social media and research discussion, and I’d rather those be kept separate as much as possible.
- Ben Pace 27 Apr 2026 0:50 UTC
  34 points
  17
  Parent
  2. Blending of advocacy and object-level discussion. When I engage on LW, it’s typically because I think there’s an important object-level point worth discussing. But once I enter the conversation, it sometimes feels like people stop being curious about the object-level point and instead move into an advocacy mode where their goal is to get Anthropic to act differently in light of their point (which is assumed correct). That is, instead of continuing the object-level discussion, my interlocutors sometimes move to criticize Anthropic or the beliefs of Anthropic staff, or to ask Anthropic (via me) to do something differently.
  (I especially find this frustrating when my interlocutors implicitly assume that I have the Anthropic “house belief” on some topic when I don’t or feel unsure.)
  Possible mitigations:
  LW posters could engage in object-level discussion longer before moving to advocacy.
  When LW posters want to argue against what they understand to be the Anthropic “house belief,” they could write things like “My understanding is that many Anthropic staff believe something like ‘...’ I think this view wrong because …”
  LW posters could more clearly flag and separate advocacy from object-level discussion. E.g. writing things like “To be clear about what I view as being the stakes here, I think that if I’m right about this point, it means that Anthropic is making a mistake by …”
  While I agree that many of these mitigations would help some on the margin, in my opinion the simplest and strongest mitigation would be for Anthropic to designate a place where the leadership will show up to engage with general criticism, and reliably do so, and then to direct such critics there (e.g. Dario Amodei could add a public comment section on his blog that he engages with in-depth, or Anthropic leadership could show up to a quarterly AMA on LessWrong). It would be fairly straightforward to redirect off-topic criticism there, if such a place existed, and I would happily lend my own support to people staying more on-topic everywhere else.
  Until there is a time and place where Anthropic reliably engages with critics (and does not avoid the top criticism), there will be a strong pressure on other discussion spaces for it to leak through, even where I would otherwise agree it is not locally appropriate.
- ryan_greenblatt 26 Apr 2026 21:36 UTC
  9 points
  6
  Parent
  I appreciate this comment and the suggestions you made seem very reasonable to me.