ITT-passing and civility are good; “charity” is bad; steelmanning is niche

I often object to claims like “charity/steelmanning is an argumentative virtue”. This post collects a few things I and others have said on this topic over the last few years.

My current view is:

  • Steelmanning (“the art of addressing the best form of the other person’s argument, even if it’s not the one they presented”) is a useful niche skill, but I don’t think it should be a standard thing you bring out in most arguments, even if it’s an argument with someone you strongly disagree with.

  • Instead, arguments should mostly be organized around things like:

    • Object-level learning and truth-seeking, with the conversation as a convenient excuse to improve your own model of something you’re curious about.

    • Trying to pass each other’s Ideological Turing Test (ITT), or some generalization thereof. The ability to pass ITTs is the ability “to state opposing views as clearly and persuasively as their proponents”.

      • The version of “ITT” I care about is one where you understand the substance of someone’s view well enough to be able to correctly describe their beliefs and reasoning; I don’t care about whether you can imitate their speech patterns, jargon, etc.

    • Trying to identify and resolve cruxes: things that would make one or the other of you (or both) change your mind about the topic under discussion.

  • Argumentative charity is a complete mess of a concept⁠—people use it to mean a wide variety of things, and many of those things are actively bad, or liable to cause severe epistemic distortion and miscommunication.

  • Some version of civility and/or friendliness and/or a spirit of camaraderie and goodwill seems like a useful ingredient in many discussions. I’m not sure how best to achieve this in ways that are emotionally honest (“pretending to be cheerful and warm when you don’t feel that way” sounds like the wrong move to me), or how to achieve this without steering away from candor, openness, “realness”, etc.

I’ve said that I think people should be “nicer and also ruder”. And:

The sweet spot for EA PR is something like: ‘friendly, nuanced, patient, and totally unapologetic about being a fire hose of inflammatory hot takes’. 🙂

I have an intuition that those are pieces of the puzzle, along with (certain aspects or interpretations of) Nonviolent Communication (NVC) tech, circling tech, introspection tech, etc. But I’m not sure how to hit the right balance in general.

I do feel very confident that “steelmanning” and “charity” aren’t the right tech for achieving this goal. (Because “charity” is a bad meme, and “steelmanning” is a lot more niche than that.)

Things other people have said

Ozy Brennan wrote Against Steelmanning in 2016, and Eliezer Yudkowsky commented:

Be it clear: Steelmanning is not a tool of understanding and communication. The communication tool is the Ideological Turing Test. “Steelmanning” is what you do to avoid the equivalent of dismissing AGI after reading a media argument. It usually indicates that you think you’re talking to somebody as hapless as the media.

The exception to this rule is when you communicate, “Well, on my assumptions, the plausible thing that sounds most like this is...” which is a cooperative way of communicating to the person what your own assumptions are and what you think are the strong and weak points of what you think might be the argument.

Mostly, you should be trying to pass the Ideological Turing Test if speaking to someone you respect, and offering “My steelman might be...?” only to communicate your own premises and assumptions. Or maybe, if you actually believe the steelman, say, “I disagree with your reason for thinking X, but I’ll grant you X because I believe this other argument Y. Is that good enough to move on?” Be ready to accept “No, the exact argument for X is important to my later conclusions” as an answer.

“Let me try to imagine a smarter version of this stupid position” is when you’ve been exposed to the Deepak Chopra version of quantum mechanics, and you don’t know if it’s the real version, or what a smart person might really think is the issue. It’s what you do when you don’t want to be that easily manipulated sucker who can be pushed into believing X by a flawed argument for not-X that you can congratulate yourself for being skeptically smarter than. It’s not what you do in a respectful conversation.

In 2017, Holden Karnofsky wrote:

  • I try to avoid straw-manning, steel-manning, and nitpicking. I strive for an accurate understanding of the most important premises behind someone’s most important decisions, and address those. (As a side note, I find it very unsatisfying to engage with “steel-man” versions of my arguments, which rarely resemble my actual views.)

And Eliezer wrote, in a private Facebook thread:

Reminder: Eliezer and Holden are both on record as saying that “steelmanning” people is bad and you should stop doing it.

As Holden says, if you’re trying to understand someone or you have any credence at all that they have a good argument, focus on passing their Ideological Turing Test. “Steelmanning” usually ends up as weakmanning by comparison. If they don’t in fact have a good argument, it’s falsehood to pretend they do. If you want to try to make a genuine effort to think up better arguments yourself because they might exist, don’t drag the other person into it.

Things I’ve said

In 2018, I wrote:

When someone makes a mistake or has a wrong belief, you shouldn’t “steelman” that belief by replacing it with a different one; it makes it harder to notice mistakes and update from them, and it also makes it harder to understand people’s real beliefs and actions.

“What belief does this person have?” is a particular factual question. Steelmanning, like “charity”, is sort of about unfocusing your eyes and tricking yourself into treating the factual question as though it were a game: you want to find a fairness-preserving allocation of points to all players, where more credible views warrant more points. Some people like that act of unfocusing because it’s fun to brainstorm new arguments; or they think it’s a useful trick for reducing social conflict or resistance to new ideas. But it’s dangerous to frame that unfocusing as “steelmanning” or “charity” rather than explicitly flagging “I want to change the topic to this other thing your statement happened to remind me of”.

In 2019, I said:

Charity seems more useful for rhetoric/persuasion/diplomacy; steel-manning seems more useful for brainstorming; both seem dangerous insofar as they obscure the original meaning and make it harder to pass someone’s Ideological Turing Test.

“Charity” seems like the more dangerous meme to me because it encourages more fuzziness about whether you’re flesh-manning [i.e., just trying to accurately model] vs. steel-manning the argument, and because it has more moral overtones. It’s more epistemically dangerous to filter your answers to factual questions by criteria other than truth, than to decide to propose a change of topic.

[...] I endorse “non-uncharitableness”—trying to combat biases toward having an inaccurately negative view of your political enemies and so on.

I worry that removing the double negative makes it seem like charity is an epistemic end in its own right, rather than an attempt to combat a bias. I also worry that the word “charity” makes it tempting to tie non-uncharitableness to niceness/friendliness, which makes it more effortful to think about and optimize those goals separately.

Most of my worries about charity and steelmanning go away if they’re discussed with the framings ‘non-uncharitableness and niceness are two separate goals’ and ‘good steelmanning and good fleshmanning are two separate goals’, respectively.

E.g., actively focus on examples of:

  • being epistemically charitable in ways that aren’t nice, friendly, or diplomatic.

  • being nice and prosocial in ways that require interpreting the person as saying something less plausible.

  • trying to better pass someone’s Ideological Turing Test by focusing on less plausible claims and arguments.

  • coming up with steelmen that explicitly assert the falsehood of the claim they’re the steelman of.

I also think that the equivocation in “charity” is doing some conversational work.

E.g.: Depending on context and phrasing, saying that you’re optimizing for friendliness can make you seem manipulative or inauthentic, or it can seem like a boast or a backhanded attack (“I was trying to be nice when I said it that way” / “I’m trying to be friendly”). Framing a diplomatic goal as though it were epistemic can mitigate that problem.

Similarly, if you’re in an intellectual or academic environment and you want to criticize someone for being a jerk, “you’re being uncharitable” is likely to get less pushback, not only because it’s relatively dry but because criticisms of tone are generally more controversial among intellectuals than criticisms of content.

“You’re being uncharitable” is also a common accusation in a motte-and-bailey context. Any argument can be quickly dismissed if it makes your conclusion sound absurd, because the arguer must just be violating the principle of charity. It may not even be necessary to think of an alternative, stronger version of the claim under attack, if you’re having an argument over Twitter and can safely toss out the “That sounds awfully uncharitable” line and then disappear into the mist.

… Hm, this comment ended up going in a more negative direction than I was intending. The concerns above are important, but the thing I originally intended to say was that it’s not an accident that “charity” is equivocal, and there’s some risk in disambiguating it without recognizing the conversational purposes the ambiguity was serving, contra my earlier insistence on burning the whole thing down. It may be helping make a lot of social interactions smoother, helping give people more cover to drop false views with minimal embarrassment (by saying they really meant the more-charitable interpretation all along), etc.

(I now feel more confident that, no, “charity” is just a bad meme. Ditch it and replace it with something new.)

From 2021:

The problem isn’t ‘charity is a good conversational norm, but these people are doing it wrong’; the problem is that charity is a bad conversational norm. If nothing else, it’s bad because it equivocates between ‘be friendly’ norms and ‘have accurate beliefs about others’ norms.

Good norms:

  • Keep discussions civil and chill.

  • Be wary of biases to strawman others.

  • Try to pass others’ ITT.

  • Use steelmen to help you think outside the box.

Bad norms:

  • Treat the above norms as identical.

  • Try to delude yourself about how good others’ arguments are.

From 2022:

I think the term ‘charity’ is genuinely ambiguous about whether you’re trying to find the person’s true view, vs. trying to steel-man, vs. some combination. Different people at different times do all of those things and call it argumentative ‘charity’.

This, if anything, strikes me as even worse than saying ‘I’m steel-manning’, because at least steel-manning is transparent about what it’s doing, even if people tend to underestimate the hazards of doing it.