Sometimes, conclusions don’t need to be particularly nuanced. Sometimes, a system is built of many parts, and yet a valid, non-misleading description of that system as a whole is that it is untrustworthy.
The central case where conclusions don’t need to be particularly nuanced is when you’re engaged in a conflict and you’re trying to attack the other side.
In other cases, when you’re trying to figure out how the world works and act accordingly, nuance typically matters a lot.
Calling an organization “untrustworthy” is like calling a person “unreliable”. Of course some people are more reliable than others, but when you smuggle in implicit binary standards you are making it harder in a bunch of ways to actually model the situation.
I sent Mikhail the following via DM, in response to his request for “any particular parts of the post [that] unfairly attack Anthropic”:
I think that the entire post is optimized to attack Anthropic, in a way where it’s very hard to distinguish between evidence you have, things you’re inferring, standards you’re implicitly holding them to, standards you’re explicitly holding them to, etc.
My best-guess mental model here is that you were more careful about this post than about the other posts, but that there’s a common underlying generator to all of them, which is that you’re missing some important norms about how healthy critique should function.
I don’t expect to be able to convey those norms or their importance to you in this exchange, but I’ll consider writing up a longform post about them.
I think Situational Awareness is a pretty good example of what it looks like for an essay to be optimized for a given outcome at the expense of epistemic quality. In Situational Awareness, it’s less that any given statement is egregiously false, and more that there were many choices made to try to create a conceptual frame that promoted racing. I have critiqued this at various points (and am writing up a longer critique) but what I wanted from Leopold was something more like “here are the key considerations in my mind, here’s how I weigh them up, here’s my nuanced conclusion, here’s what would change my mind”. And that’s similar to what I want from posts like yours too.
This seems focused on intent in a way that’s IMO orthogonal to the post. There are explicit statements that Anthropic made and then violated. Bringing in intent (or especially nationality) and then pivoting to discourse norms seems on net bad for figuring out “should you assume this lab will hold to commitments in the future when there are incentives for them not to”.
I particularly dislike that this topic has stretched into psychoanalysis (of Anthropic staff, of Mikhail Samin, of Richard Ngo) when I felt that the best part of this article was its groundedness in fact and non-reliance on speculation. Psychoanalysis of this nature is of dubious use and pretty unfriendly.
Any decision to work with people you don’t know personally that relies on guessing their inner psychology is doomed to fail.