It’s been roughly 7 years since the LessWrong user-base voted on whether it was time to close down shop and become an archive, or to move towards the LessWrong 2.0 platform, with me as head-admin. For roughly as long, I have spent around one hundred hours almost every year trying to get Said Achmiz to understand and learn how to become a good LessWrong commenter by my lights.[1] Today I am declaring defeat on that goal and giving him a 3-year ban.
What follows is an explanation of the models of moderation that convinced me this is a good idea, the history of past moderation actions we’ve taken for Said, and some amount of case law that I derive from these two. If you just want to know the moderation precedent, you can jump straight there.
I think few people have done as much to shape the culture of LessWrong as Said. More than 50% of the time when I would ask posters, commenters and lurkers about their models of LessWrong culture, they’d say some version of either:
Of all the places on the internet, LessWrong is a place that really forces you to get your arguments together. It’s very much a no-bullshit culture, and I think this is one of the things that makes it one of the most valuable forums on the internet.
Or
Man, posting on LessWrong seems really unrewarding. You show up, you put a ton of effort into a post, and at the end the comment section will tear apart some random thing that isn’t load bearing for your argument, isn’t something you consider particularly important, and whose discussion doesn’t illuminate what you are trying to communicate, all the while implying that they are superior in their dismissal of your irrational and dumb ideas.
And frequently when I dig into how they formed these impressions, a comment by Said would be at least heavily involved in that.
I think both of these perspectives are right. LessWrong is a unique place on the internet where bad ideas do get torn apart in ways that are rare and valuable, and also a place where there is a non-trivial chance that your comment section gets derailed by someone making some extremely confident assumption about what you intended to say, followed by a pile of sneering dismissal.[2]
I am overall making this decision to ban Said with substantial sadness. As is evident from my spending hundreds of hours over the years trying to resolve this via argument and soft-touch moderation, this was very far from an obvious choice. This post itself also took many dozens of hours of work, and I hope it illuminates some of the ways this decision was made, what it means for the future of LessWrong, and how it will affect future moderation. I apologize for the length.
The sneer attractor
One of the recurring attractors of the modern internet, dominating many platforms, subreddits, and subcultures, is the sneer attractor, exemplified in my mind by places like RationalWiki and the eponymous “SneerClub”, but also many corners of Reddit and, of course, 4chan. At its worst, that culture looks like this:
Sociologically, my sense is this culture comes from a mixture of the following two dynamics:
Conflationary alliances, which conflate all the different reasons why something is bad. The key component of making good sneer club criticism is to never actually say out loud what your problem is. Sneerers say something that the reader can fill in with whatever they think the problem is, which allows them to establish an appearance of consensus without a shared model. When pressed, sneerers hide behind irony or deflect.[3]
A culture of loose status-focused social connection. Fellow sneerers are not trying to build anything together. They are not relying on each other for trade, coordination or anything else. They don’t need to develop protocols of communication that produce functional outcomes, they just need to have fun sneering together.
Since the sneer attractor is one of the biggest and most destructive attractors of the modern internet, I worry about LessWrong also being affected by its dynamics. I think we are unlikely to ever become remotely as bad as SneerClub, or most of Reddit or Twitter, but I do see similar cultural dynamics rear their head on LessWrong. If these dynamics were to get worse, the thing I would mostly expect to see is the site quietly dying; fewer people venturing new/generative content, fewer people checking the site in hopes of such content, and an eventual ghost town of boring complaints about the surrounding scene, with links.
But before I go into that, let’s discuss the sneer attractor’s mirror image:
The LinkedIn attractor
In other corners of the internet and the rest of the world, especially in the land of professional communities, we have what I will call the “LinkedIn attractor”. In those communities, saying anything bad about another community member is frowned upon. Disputes are supposed to be kept private. When someone intends to run an RCT on which doctors in your hospital are most effective, you band together and refuse to participate, because establishing performance metrics would hurt the unity of your community.
Since anything but abstract approval is risky in those communities, a typical post looks like this largely vacuous engagement[4]:
And at the norms level like this (cribbed from an interesting case study of the “Obama Campaign Alumni” Facebook group descending into this attractor):
This cultural attractor is not mutually exclusive with the sneer attractor. Indeed, the LinkedIn attractor appears to be the memetically most successful way groups relate to their ingroup members, while the sneer attractor governs how they relate to their outgroups.
The dynamics behind the LinkedIn attractor seem mechanistically straightforward. I think of them as “mutual reputation protection alliances”.
In almost every professional context I’ve been in, these alliances manifest as a constant stream of agreements—”I say good things about you, if you say good things about me.”
This makes sense. It’s unlikely I would benefit from people spreading negative information about you, and we would both clearly benefit from protecting each other’s reputation. So a natural equilibrium emerges where people gather many of these mutual reputation protection alliances, ultimately creating groups with strong commitments to protect each other’s reputation and the group’s reputation in the eyes of the rest of the world.
Of course, the people trying to use reputation to navigate the world — to identify who is trustworthy — are much more diffuse and aren’t party to these negotiations. But their interests weigh heavily enough that some ecosystem pressure exists for antibodies to mutual reputation protection alliances to develop (such as rating systems for Uber drivers, as opposed to taxi cartels where feedback to individuals is nearly impossible to aggregate).
How this relates to LessWrong
Ok, now why am I going on this digression about the Sneer Attractor and the LinkedIn Attractor? The reason is that I think much of the heatedness of moderation discussions related to Said, and much of people’s fear, has been routing through worry that LessWrong will end up in either the Sneer Attractor or the LinkedIn Attractor.
As I’ve talked with many people with opinions on Said’s comments on the site, a recurring theme has been that Said is what prevents LessWrong from falling into the LinkedIn attractor. Said, in many people’s minds, is the bearer of a flag like this:
“Just because you are hurt by, and anxious about others criticizing you or your ideas, doesn’t mean we are going to accommodate you. It is the responsibility of your audience to determine what they think of you and your contributions.
You do not own your reputation. Every individual owns their own judgment of you.
You can shape it by doing good or bad things, but you do not get to shape it by preventing me and others from openly discussing you and your contributions.”
And I really care about this flag too. Indeed, many of the decisions I have made around LessWrong have been made to foster a culture that understands and rallies behind this flag. LessWrong is not LinkedIn, and LessWrong is not the EA Forum, and that is good and important.
And I do think Said provides a shield. Having Said comment on LessWrong posts, and having those comments be upvoted, helps against sliding down the attractor towards LinkedIn.
But on the other hand, I notice that a lot of what I myself am most worried about is the Sneer Attractor. For LessWrong to become a place that can’t do much but tear things down. Where criticism is vague and high-level and relies on conflationary alliances to get traction, but does not ultimately strengthen what it criticizes or whoever reads the criticism. Filled with comments that aim to make the readers and the voters feel superior to all those fools who keep saying wrong things, despite not equipping readers to say any less wrong things themselves.
And I do think Said moves LessWrong substantially towards that path. When Said is at his worst, he writes comments like this:
This, to be clear, is still better than the SneerClub comment visible above. For example when asked to clarify, Said obliges:
But the overall effect on the culture is still there, and the thread still results in Benquo eventually disengaging in frustration, intending to switch his moderation guidelines to “Reign of Terror” and to delete any similar comment threads in the future, as Said (as far as I can tell) refuses to do much cognitive labor in the rest of the thread until Benquo runs out of energy.
So, to get this conversation started[5] and to maybe give people a bit more trust that I am tracking some things they care about: “Yes, a lot of the world is broken and stuck in an equilibrium of people trying to punish others for saying anything that might reflect badly on anyone else, in endless cycles of mutual reputation protection that make it hard to know what is fake and what is real, and yes, LessWrong, as most things on the internet, is at risk of falling into that attractor. I am tracking this, I care a lot about it, and even knowing that, I think it’s the right call to ban Said.”
Now that this is out of the way, I think we can talk in more mechanistic terms about what is going wrong in comment threads involving Said, and maybe even learn some things about online moderation.
Weaponized obtuseness and asymmetric effort ratios
An excerpt from a recent Benquo post on the Said moderation decision:
Said is annoying, both because his demands for rigor don’t seem prioritized reasonably, and because he’s simultaneously insulting and rude, dismissive of others’ feelings around being “insulted,” and sensitive to insults himself. He’s also disagreeable. I asked Zack for a list of Said’s best comments (see email), and they’re pretty much all procedural criticisms or calls for procedural rigor seemingly with no sense of proportion. In the spirit of his “show me the cake” principle, I don’t see the cake there. On the other hand, he’s a leading contributor to GreaterWrong, which makes this site more usable.
I concur with much of this. To get more concrete: in my experience, the core dynamics that make comment threads with Said rarely worth it break down something like this:[6]
Said will write a top-level comment that reads like an implicit claim that you have violated some social norm in what you have written, for which you deserve to be punished (though this will not be said explicitly), or, failing that, one that makes you look negligent if you do not answer an innocuous, open-seeming question
You will try to address this claim by writing some kind of long response or explanation, answering his question or providing justification on some point
Said will dismiss your response as being totally insufficient, confused, or proving the very point he was trying to make
You will try to clarify more, while Said will continue to make insinuations that your failure to respond properly validates whatever judgment he is invoking
Motivated commenters/authors will go up a level and ask “by what standard are you trying to invoke a negative judgment here?”[7]
Said will deny any and all such invocations of standards or judgment, saying he is (paraphrased) “purely talking on the object level and not trying to make any implicit claims of judgment or low-status or any such kind”
After all of this you are left questioning your own sanity, try a bit to respond more on the object-level, and ultimately give up feeling dejected and like a lot of people on LessWrong hate you. You probably don’t post again.
In the more extreme cases, someone will try to prosecute this behavior and reach out to the moderators, or make a top-level post or quick take about it. Whoever does this quickly finds out that the moderators feel approximately as powerless to stop this cycle as they themselves do. This leaves you even more dejected.
With non-trivial probability your post or comment ends up hosting a 100+ comment thread with detailed discussion of Said’s behavior and moderation norms and whether it’s ever OK to ban anyone, in which voting and commenting is largely dominated by the few people who care much more than average about banning and censorship. You feel an additional pang of guilt and concern about how many people you might have upset with your actions, and how much time you might have wasted.
Now, I think it is worth asking what the actual issue with the comments above is. Why do they produce this kind of escalation?
Asymmetric effort ratios and isolated demands for rigor
A key dynamic in many threads with Said is that the critic has a pretty easy job at each step. First of all, they have little to lose. They need to make no positive statements and explain no confusing phenomena. All they need to do is ask questions, or complain about the imprecision of some definition. If the author can answer compellingly, well, then the critic can take credit for having helped elicit an explanation of a confusion that clearly many people must have had. And if the author cannot answer compellingly, then even better: the critic has properly identified and prosecuted bad behavior and excised the bad ideas that otherwise would have polluted the commons. At the end of the day, are you really going to fault someone for just asking questions? What kind of totalitarian state are you trying to create here?
The critic can disengage at any point. No one faults a commenter for suddenly disappearing or not giving clear feedback on whether a response satisfied them. The author, on the other hand, does usually feel responsible for reacting and responding to any critique made of his ideas, which he dared to put so boldly and loudly in front of the public eye.
My best guess is that the usual ratio of “time it takes to write a critical comment” to “time it takes to respond to it at a level that will broadly be received well” is about 5x. This isn’t in itself a problem in an environment with lots of mutual trust and trade, but in an adversarial context it means that you can easily run a DDoS attack on basically any author whose contributions you do not like, by just asking lots of questions, insinuating holes or potentially missing considerations, and demanding a response, approximately independently of the quality of their writing.
Maintaining strategic ambiguity about any such dynamics
Of course, commenters on LessWrong are not dumb, and have read Scott Alexander, and have vastly more patience than most commenters on the internet, and so many of them will choose to dissect and understand what is going on.
The key mechanism that shields Said from the accountability that would accompany such analysis is his care to avoid making explicit claims about the need for the author to respond, or about the implicit judgment associated with his comments. In any given comment thread, each question is phrased so as to be ambiguous between a question driven by curiosity and a question intended to expose the author’s hypocrisy.
This ambiguity is not without healthy precedent. I have seen healthy math departments and science departments in which a prodding question might be phrased quite politely, and phrased ambiguously between a matter of personal confusion and the intention of pointing out a flaw in a proof.
“Can you explain to me how you got from Lemma C to Proof D? It seems like you are invoking an assumption here I can’t quite understand”
is a common kind of question. And I think overall fine, appropriate, healthy.
That said, most of the time, when I was in those environments, I could tell what was going on, and I mostly knew that other people could tell as well. If someone repeatedly asked questions in a way that clearly indicated they had spotted a flaw in the provided proofs or arguments, but kept insisting on only getting there via Socratic questioning, they would lose points over time. And if they kept asking probing questions in each seminar that were easily answered, with each question taking up space and bandwidth, they would quickly lose lots of points and be asked to please interrupt less. And furthermore, the tone of voice would often make it clear whether a question was more on the genuine-curiosity side or the suggested-criticism side.
But here on the internet, in the reaches of an online forum, with...
people coming in and out,
the popularity of topics waxing and waning with the fashions of the internet,
few common-knowledge creating events,
the bandwidth limited to text,
voting in most niche threads dominated by a very small subset of people whom you can’t see but who nevertheless shape what gets approval,
and branching comment threads with no natural limitation on the bandwidth or volume of requests that can be made of an author,
...the mechanisms that productively channeled this behavior no longer work.
And so here, where you can endlessly deflect any accusation by just falling back on “look, all I am doing is asking questions and asking for clarification, clearly that is not a crime”, with little risk that common knowledge gets built that you are wasting people’s time or making repeated bids for social censure that fail to get accepted, there is no natural limit to how much heckling you can do. You alone could be the death of a whole subculture, if you are just persistent enough.
And so, at the heart of all of this, is either a deep obliviousness, or more likely, the strategic disarmament of opposition by denying load-bearing subtext (or anything else that might obviously allow prosecution) in these interactions.
And unfortunately, this does succeed; my guess is better here on LessWrong than in most places, because we have a shared belief in the power of explicit reasoning, and we have learned an appropriate wariness of focusing on subtext, with a culture in which debate is supposed to focus on object-level claims rather than whatever status dynamics are going on. I think this is good and healthy most of the time. But the purpose of those norms is not to completely eschew analysis and evaluation of the underlying status dynamics, but simply to separate them from the object-level claims (I also think it’s good to have some norms against focusing too much on status dynamics and claims in total, so I think a lot of the generic hesitation is justified, which I elaborate on a bit later).
But I think in doing so, part by selection, part by training, we have created an environment where trying to police the subtext and status dynamics surrounding conversations gets met with fear and counter-reactions that make moderation and steering very difficult.[8]
Crimes that are harder to catch should be more harshly punished
A small sidenote on a dynamic relevant to how I am thinking about policing in these cases:
A classical example of microeconomics-informed reasoning about criminal justice is the following snippet of logic.
If someone can gain in expectation X dollars by committing some crime (which has negative externalities of Y > X dollars), with a probability p of getting caught, then in order to successfully deter people from committing the crime you need to make the cost of receiving the punishment (Z) greater than X/p, i.e. X < p·Z.
Or in less mathy terms, the more likely it is that someone can get away with committing a crime, the harsher the punishment needs to be for that crime.
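To make the arithmetic concrete (a toy calculation with numbers I am making up, not drawn from anything above): deterrence requires the expected punishment to exceed the expected gain,

\[ p \cdot Z > X \quad\Longleftrightarrow\quad Z > \frac{X}{p}. \]

So if a behavior yields a gain of X = 10 (in whatever units) and is successfully called out only with probability p = 0.1, the sanction when it is caught needs to exceed X/p = 100, ten times what would suffice if detection were certain (p = 1).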
In this case, a core component of the pattern of plausibly deniable aggression that I think is present in much of Said’s writing is that it is very hard to catch someone doing it, and even harder to prosecute it successfully in the eyes of a skeptical audience. As such, in order to maintain a functional incentive landscape the punishment for being caught in passive or ambiguous aggression needs to be substantially larger than for e.g. direct aggression, as even though being straightforwardly aggressive has in some sense worse effects on culture and norms (though also less bad effects in some other ways)[9], the probability of catching someone in ambiguous aggression is much lower.
An under-discussed aspect of LessWrong is how voting affects culture, author expectations, and conversational dynamics. Voting is anonymous, even to admins (the only cases where we look at votes are when we are investigating mass-downvoting, sockpuppeting, or other kinds of extreme voting abuse). Now, does that mean that everyone is free to vote however they want?
The answer is a straightforward “no”. Ultimately, voting is a form of participation on the site that can be done well or badly, and while I think it’s good for that participation to be anonymous and generally shielded from retaliation, at a broader level it is the job of the moderators to pay attention to unhealthy vote dynamics. We cannot police what you do with your votes, but if you do abuse your votes, you will make the site worse, and we might end up having to change the voting system towards something less expressive but more robust as a result.
These are general issues, but how do they relate to this whole Said banning thing?
Well, another important dimension of how the dynamics in these threads go is roughly the following:
Said makes a top-level comment asking a question or making some kind of relatively low-effort critique
This will most of the time get upvoted, because questions or top-level critiques rarely get downvoted
The discussion continues for 3-4 replies, and with each reply the number of users voting goes down
By the time the discussion is ~3 levels deep, basically all voting is done by two groups of people: LessWrong moderators and a very small set of LessWrong users who are extremely active and tend to take a strong interest in Said’s comments
If the LessWrong moderators do not vote, the author is now in a position where any further replies by Said get upvoted and their responses get reliably downvoted. If they do vote, the variance in voting in the thread quickly goes through the roof, resulting in lots of people with strongly upvoted or strongly downvoted comments, since everyone involved has very high vote-strength.
This is bad. The point of voting is to give an easy way of aggregating information about the quality and reception of content. When voting ends up dominated by a small interest[10] group without broader site buy-in, and with no one being able to tell that is what’s going on, it fails at that goal. And in this case, it’s distorting people’s perception about the site consensus in particularly high-stakes contexts where authors are trying to assess what people on the site think about their content, and about the norms of posting on LessWrong.
I don’t really know what to do about this. It’s one of the things that makes me more interested in bans than other things, since site-wide banning also comes with removal of vote-privileges, though of course we could also rate-limit votes or find some other workaround that achieves the same aim. I also am not confident this is really what’s going on as I do not look at vote data, and nothing here alone would make me confident I would want to ban anyone, but I think the voting has been particularly whack in a lot of these threads, and that seemed important to call out.
But why ban someone, can’t people just ignore Said?
What this de-facto means is that there is always an obligation by the author to respond to your comment, or otherwise be interpreted to be ignorant.
There is always an obligation by any author to respond to anyone’s comment along these lines. If no response is provided to (what ought rightly to be) simple requests for clarification (such as requests to, at least roughly, define or explain an ambiguous or questionable term, or requests for examples of some purported phenomenon), the author should be interpreted as ignorant. These are not artifacts of my particular commenting style, nor are they unfortunate-but-erroneous implications—they are normatively correct general principles.
Many people don’t have the time, or find engaging with commenters exhausting
Then they shouldn’t post on a discussion forum, should they? What is the point of posting here, if you’re not going to engage with commenters?
this creates a default expectation that if they do not engage extensively with your comments in particular (with higher priority than anything else in the comment thread) there will be a public attack on them left unanswered.
This is only because most people don’t bother to ask (what I take to be) such obvious, and necessary, clarifying questions. (Incidentally, I take this fact to be a quite damning indictment of the epistemic norms of most of Less Wrong’s participants.) When I ask such questions, it is because no one else is doing it. I would be happy to see others do it in my stead.
distinguish between a question that is intended as a critique when left unanswered, and one that is an optional request for clarification
Viewing such clarifications as “optional” also speaks to an unacceptable low standard of intellectual honesty.
Once again: there is no confusion; there is no dichotomy. A request for clarification is neither an attack nor even a critique. The normal, expected form of the interaction, in the case where the original post is correct, sensible, and otherwise good (and where the only problem is an insufficiency in communicating the idea), is simply “[request for clarification] → [satisfactory clarification] → [end]”. Only a failure of this process to take place is in need of “defending”.[11]
Now, in the comment thread in which the comment above was made, both mods and authors have clarified that no, authors do not have an obligation to respond with remotely the generality outlined here, and the philosophy of discourse Said outlines is absolutely not site consensus. However, this does little for most authors. Most people have never read the thread where those clarifications were made, and never will. And even if we made a top-level post, or added a clarification to our new user guidelines, this would do little to change what is going on.
Because every time Said leaves a top-level comment, it is clear to most authors and readers that he is implying the presence of a social obligation to respond. And because of the dynamics I elaborated on in the previous section, it is not feasible for moderators or other users to point out the underlying dynamic each time, as doing so itself requires carefully compiling evidence and pointing out (and then probably litigating) subtext.
So, despite it being close to site consensus that authors do not face an obligation to respond to each and every one of Said’s questions, on any given post there is basically nothing to be done to build common knowledge of this. Said can simply make another comment thread implying that if someone doesn’t respond they deserve to be judged negatively, and there will always be enough voters who have not seen this pattern play out, or who even support Said’s view of author obligation against the broad site consensus, to get the questions and critiques upvoted. And so, despite their snark and judgment, they will appear to be made in good standing, and so to deserve a response, and the cycle will begin anew.
Now in order to fix this dynamic, the moderation team has made multiple direct moderation requests of Said, summarized here as a high-level narrative (though in reality it played out over more like a decade):
To which Said responded by claiming that anyone who dared to use moderation tools on their posts could only possibly do so for bad-faith reasons and should face social punishment for doing so
To which we responded by telling Said to please let authors moderate as they desire and to not do that again, and gave him a 3 month rate-limit
So ultimately, what other option do we have but a ban? We could attach a permanent mark to Said’s profile with a link to a warning that this user has a long history of asking heckling questions and implying social punishment without much buy-in for that, but that seems to me to create more of an environment of ongoing hostility, and many would interpret such a mark as cruel to have imposed.
So no, I think banning is the best option available.
Ok, but shouldn’t there be some kind of justice process?
I expect it to be uncontroversial to suggest that most moderation on LessWrong should be soft-touch. The default good outcome of a moderator showing up on a post is to leave a comment warning of some bad conversational pattern, or telling one user to change their behavior in some relatively specific way. The involved users take the advice, the thread gets better or ends, and everyone moves on with their day.
Ideally this process starts and completes within 10 minutes, from the moderator noticing something going off the rails to sending off the comment providing either advice or a warning.
However, sometimes these interactions escalate, or moderators notice a more systematic pattern with specific users causing repeated problems, and especially if someone disagrees with the advice or recommendation of a moderator, it’s less clear how things are supposed to proceed. Historically the moderation standard for LessWrong has been unilateral dictatorship awarded to a head admin. But even with that dictatorship being granted to me, it is still up to me to decide how I want myself and the LessWrong team to handle these kinds of cases.
At a high-level I think there are two reasonable clusters of approaches here:
High-stakes decisions get made by some kind of court that tries to recruit impartial judges to a dispute. The judges get appointed by the head moderator, but have a tenure and substantial independent power.
High-stakes decisions get made directly by the acting head-moderator, who takes personal responsibility for all decisions on the site
(In either case it would make sense to try to generalize any judgments or decisions made into meaningful case-law to serve as the basis for future similar decisions, and to be added to some easily searchable set of past judgments that users and authors can use to gain more transparency into the principles behind moderation decisions, and predict how future decisions are likely to be made.)
Both of these approaches are fairly common in general society. Companies generally have a CEO who can unilaterally make firing and rule-setting decisions. Older institutions and governmental bodies often tend to have courts or committees. I considered for quite a while whether as part of this moderation decision I should find and recruit a set of more impartial judges to make high-stakes decisions like this.
But after thinking about it for a while, I decided against it. There are a few reasons:
Judging cases like this is hard and takes a lot of work. It also benefits from having spent lots of time thinking about moderation and incentives and institutional design. Recruiting someone good for this role would require paying them a bunch for their high opportunity cost, and is also just kind of objectively hard.
Incentivizing strong judge performance seems difficult. The default outcome I’ve seen from committees and panels is that everyone on them half-asses their job, because they rarely have a stake in the outcome being good. Even if someone cares about LessWrong, that is not the same as being held personally responsible for it, with your salary and broader reputation depending on how well LessWrong is going.
I think there is a broader pattern in society of people heavily optimizing for “defensibility”, and this mostly makes things worse. I think most of the reason for the popularity of committees and panels and boards deciding high-stakes matters is that this makes the decision harder to attack externally, by creating a veneer of process. Blankfacedness as Scott Aaronson described it is also a substantial part of this. I do not want LessWrong to become a blankfaced bureaucracy that hands down judgment from on high. Whenever it’s possible I would like people to be able to talk to someone who is directly responsible for the outcome of the decision, who could change their mind in that very conversation if presented with compelling evidence.
Considerations like these are what convinced me that even high-stakes decisions like this should be made on my own personal conscience. I have a stake in LessWrong going well, and I can take the time to give these kinds of decisions the resources they deserve to get made well, and I can be available for people to complain and to push on if people disagree with them.
But in doing so, I do want to do a bunch of stuff that gets us the good parts of the more judicially oriented process. Here are some things that I think make sense, even in a process oriented around personal responsibility:
I think it’s good to publish countervailing opinions when possible. I frequently expect people on the Lightcone team to disagree with decisions I make, and when that happens, I will encourage them to write up their perspective, which will serve as a record that makes it easier to spot broader blind spots in my decision-making (and also reduces gaslighting dynamics where people feel forced to support decisions I make out of fear of being retaliated against).
I think it’s good to have a cool-off period before making high-stakes moderation decisions. I expect to basically never make a ban decision on any contributor with a substantial comment history without taking at least a few days to step away from any active discussion to think about it. This doesn’t mean that a ban decision might not be preceded by a heated discussion (I think it’s usually good to give whoever is being judged the chance to defend themselves, and those discussions might very well turn out heated), but the basic shape of the decision will be formed away from any specific heated thread.
I want to make it easier to find past moderation decisions and for people to form a sense of the case-law of the site. As part of working on this post I’ve started on a redesign of the /moderation page to make it substantially easier to see a history of moderation comments, as well as to include an overview of the basic principles of how moderation is done on the site.
I think I should very rarely[14] prevent whoever is affected by a moderation decision from defending themselves on LessWrong. For practical reasons, I need to limit the time I personally spend replying and engaging with their defense, but I think it makes sense in basically all cases like this to allow the opposing side to publish a defense on the site, and to do at least something to signal-boost it together with any moderation decisions.
So what options do I have if I disagree with this decision?
Well, the first option you always have, and which is the foundation of why I feel comfortable governing LessWrong with relatively few checks and balances, is to just not use LessWrong. Not using LessWrong probably isn’t that big of a deal for you. There are many other places on the internet to read interesting ideas, to discuss with others, to participate in a community. I think LessWrong is worth a lot to a lot of people, but I think ultimately, things will be fine if you don’t come here.
Now, I do recommend that if you stop using the site, you do so by loudly giving up, not quietly fading. Leave a comment or make a top-level post saying you are leaving. I care about knowing about it, and it might help other people understand the state of social legitimacy LessWrong has in the broader world and within the extended rationality/AI-Safety community.
Of course, not all things are so bad as to make it the right choice to stop using LessWrong altogether. You can complain to the mods on Intercom, or make a shortform, or make a post about how you disagree with some decision we made. I will read them, and there is a decent chance we will respond or try to clarify more or argue with you more, though we can’t guarantee this. I also highly doubt you will end up coming away thinking that we are right on all fronts, and I don’t think you should use that as a requirement for thinking LessWrong is good for the world.
And if the stakes are even higher, you can ultimately try to get me fired from this job. The exact social process for who can fire me is not as clear to me as I would like, but you can convince Eliezer to give head-moderatorship to someone else, or convince the board of Lightcone Infrastructure to replace me as CEO, if you really desperately want LessWrong to be different than it is.
But beyond that, there is no higher appeals process. At some point I will declare that the decision is made, and stands, and I don’t have time to argue it further, and this is where I stand on the decision this post is about.
An overview of past moderation discussion surrounding Said
I have tried to make this post relatively self-contained and straightforward to read, trying to avoid making you, the reader, feel like you have to wade through 100,000+ words of previous comment threads to have any idea what is going on[15], at least from my perspective. However, for the sake of completeness, and because I do think it provides useful context for people who want to really dive into this kind of decision, here is a quick overview of past moderation discussion and decisions related to Said:
8 years ago, Elizabeth wrote a moderation warning to Said, and there was some followup discussion with him on Intercom. With habryka’s buy-in, Elizabeth said roughly “if you don’t change your behavior in some way you’ll be banned”, and Said said roughly “it’s not worth it for me to change my behavior, I would rather not participate on LW at all in that case”[16]. He did not change his behavior, we did not end up banning him at this time, and he also did not stop participating on LW.
6 years ago on New Year’s Eve/New Year’s Day, in a thread (with ~35,000 words in it), I wrote ~40 comments arguing with Said that his commenting style was corrosive to a good commenting culture on LessWrong.
2 years ago there was a conflict between Said Achmiz and Duncan Sabien that spanned many posts and comment threads and led Raymond Arnold to write a moderation post about it that received 560 (!) comments, or ~64,000 words. Ray took the mod action of giving Said a rate-limit of 3-comments-per-post-per-week, for a duration of 3 months.
Last month, a user banned Said from commenting on his posts. Said took to his own shortform where (amongst other things) he and others called that author a coward for banning him. Then Wei Dai commented on a 2018 post (the one announcing that authors over a certain karma threshold can ban users from their posts) to argue against this being allowed, under which a large thread of ~61,000 words developed, again defending Said against being banned from that user’s posts.
The most substantial of these is Ray’s moderation judgment from two years ago. I would recommend the average reader not read it all, but it is the result of another 100+ hour effort, and so does contain a bunch of explanation and context. You can read through the comments Ray made in the appendix to this post.
What does this mean for the rest of us?
My current best guess is that not that much has to change. My sense is Said has been a commenter with uniquely bad effects on the site, and while there are people who are making mistakes along similar lines, there are very few who are as prolific or have invested as much into the site. I think the most likely way I can imagine the considerations in this post resulting in more than just banning Said is if someone decides to intentionally pick up the mantle of Said Achmiz in order to fill the role that they perceive he filled on the site, and imitate his behavior in ways that recreate the dynamics I’ve pointed out.[17]
There are a few users about whom I have concerns similar to those I had about Said, and I do want this post to save me effort in future moderation disputes. I also expect to refer back to the ideas in this post for many years in various moderation discussions and moderation judgments, but I don’t have any immediate instances of that in mind.
I do think it makes sense to try to squeeze some guidance for future moderation decisions out of this. So, in case-law fashion, here are some concrete guidelines derived from this case:
You are at least somewhat responsible for the subtext other people read into your comments; you can’t disclaim all responsibility for that
Sometimes things we write get read by other people to say things we didn’t mean. Sometimes we write things that we hope other people will pick up, but we don’t want to say straight out. Sometimes we have picked up patterns of speech or metaphors that we have observed “working”, but that actually don’t work the way we think they do (like being defensive when we get negative feedback resulting in less negative feedback, which one might naively interpret as being assessed less negatively).
On LessWrong, it is okay if an occasional stray reader misreads your comments. It is even okay if you write a comment that most of the broad internet would predictably misunderstand, or view as some kind of gaffe or affront. LessWrong has its own communication culture.
But if a substantial fraction of other commenters consistently interpret your comments to mean something different than what you claim they say when asked for clarification, especially if they do so in contexts where that misinterpretation happens to benefit you in conflicts you are involved in, then that is a thing you are at least partially responsible for.[18]
This all also intersects with “decoupling vs. contextualizing” norms. A key feature of LessWrong is that people here tend to be happy to engage with any specific object-level claim, largely independently of what the truth of that claim might imply at a status, reputation, or blame level about the rest of the world. This, if you treat it as a single dimension, puts LessWrong pretty far towards “decoupling” norms. I think this is good and important and a crucial component of how LessWrong has maintained its ability to develop important ideas and help people orient to the world.
This intersection produces a tension. If you are responsible for people on LessWrong reading context and implications and associations into your contributions you didn’t intend, then that sure sounds like the opposite of the kind of decoupling norms that I think is so important for LessWrong.
I don’t have a perfect resolution to this. Zack had a post on this with some of his thoughts that I found helpful:
I argue that, at best, this is a false dichotomy that fails to clarify the underlying issues—and at worst (through no fault of Leong or Nerst), the concept of “contextualizing norms” has the potential to legitimize derailing discussions for arbitrary political reasons by eliding the key question of which contextual concerns are genuinely relevant, thereby conflating legitimate and illegitimate bids for contextualization.
Real discussions adhere to what we might call “relevance norms”: it is almost universally “eminently reasonable to expect certain contextual factors or implications to be addressed.” Disputes arise over which certain contextual factors those are, not whether context matters at all.
The standard academic account explaining how what a speaker means differs from what the sentence the speaker said means, is H. P. Grice’s theory of conversational implicature. Participants in a conversation are expected to add neither more nor less information than is needed to make a relevant contribution to the discussion.
I disagree with Zack that the dichotomy between decoupling and contextualizing norms fails to clarify any of the underlying issues. I do think you can probably place communities and spaces pretty well on an axis from “high decoupling” to “high contextualizing”, and this will allow you to make a lot of valid predictions.
But as Zack helpfully points out here, the key thing to understand is that of course many forms of context and implications are obviously relevant and important, and worth taking into account during a conversation. This is true on LessWrong as well as anywhere else. If your comments have a consistent subtext of denigrating authors who invoke reasoning by analogy, because you think most people who reason by analogy are confused (a potentially reasonable if contentious position on epistemics), then you better be ready to justify that denigration when asked about it.
Responses of the form “I am just asking these people for the empirical support they have for their ideas, I am not intending to make a broader epistemological point” are OK if they reflect a genuine underlying policy of not trying to shift the norms of the site towards your preferred epistemological style, and associated (bounded) efforts to limit such effects when asked. If you do intend to shift the norms of the site, you better be ready to argue for that, and it is not OK to follow an algorithm that is intending to have a denigrating effect, but that shields itself from the need for justification or inspection by invoking decoupling norms. What work is respected and rewarded on LessWrong is of real and substantial relevance to the participants of LessWrong. Sure, obsession with that dimension is unhealthy for the site, and I think it’s actively good to most of the time ignore it. But, especially if the subtext is repeated across many comments from the same author, it is the kind of thing that we need to be able to talk about, and sometimes moderate.
And as such, within these bounds, “tone” is very much a thing the LessWrong moderators will pay attention to, as are the implied connotations of the words you use, as are the metaphors you choose and the associations that come with them. And while occasionally a moderator will take the effort to disentangle all your word choices, and pin down in excruciating detail why something you said implied something else and how you must have been aware of that on some level given what you were writing, they do not generally have the capacity to do so in most circumstances. Moderators need the authority, at some level, to police the vibe of your comments, even without a fully mechanical explanation of how that vibe arises from the specific words you chose.
Do not try to win arguments by fights of attrition
A common pattern on the internet is that whoever has the most patience for re-litigating and repeating their points ultimately wins almost any argument. As long as you avoid getting visibly frustrated, or insulting your opponents, and display an air of politeness, you can win most internet arguments by attrition. If you are someone who might have multiple hours per day available to write internet comments, you can probably eke out some kind of concession or establish some kind of norm in almost any social space, or win some edit war that you particularly care about.
This is a hard thing to combat, but the key enabler of this tactic is being in a social space in which it is assumed that comments or questions are made in good standing as long as they aren’t obviously egregious.
On LessWrong, if you make a lot of comments, or ask a lot of questions, with a low average hit-rate on providing value by the lights of the moderators, my best guess is you are causing more harm than good, especially if many of those comments are part of conversations that try to prove some kind of wrongdoing or misleadingness on behalf of your interlocutor (so that they feel an obligation to respond). And this is a pattern we will try to notice and correct (while also recognizing that sometimes it is worth pressing people on crucial and important questions, as people can be evasive and try to avoid reasonable falsification of their ideas in defense of their reputation/ego).
Building things that help LessWrong’s mission will make it less likely you will get banned
While the overall story of Said is one of him ultimately getting banned from LessWrong, it is definitely the case that his having built readthesequences.com and greaterwrong.com, and his contributions to gwern.net, all raised the threshold we had for banning very substantially.
And I overall stand behind this choice. Being banned from LessWrong does affect people’s ability to contribute and participate in the broader Rationality community ecosystem, and I think it makes sense to tolerate people’s weaknesses in one domain, if that allows them to be a valuable contributor in another domain, even if those two domains are not necessarily governed by the same people.
So yeah, I do think you get to be a bit more of a dick, for longer, if you do a lot of other stuff that helps LessWrong’s broader mission. This has limits, and we will invest a bunch into limiting the damage or helping you improve, but it does also just help.
So with all that Said
And so we reach the end of this giant moderation post. I hope I have clarified at least my perspective on many things. I will aim to limit my engagement with the comments of this post to at most 10 hours. Said is also welcome to send additional commentary to me in the next 10 days, and if so, I will append it to this post and link to it somewhere high up so that people can see it if they get linked here.[19] I will also make one top-level comment below this post under which Said will be allowed to continue commenting for the next 2 weeks, and where people can ask questions.
Farewell. It’s certainly been a ride.
Appendix: 2022 moderation comments
In 2022, the last time we took moderation action on Said, Ray wrote 10,000+ words, which I have extracted here for convenience. I don’t recommend the average reader read them all, but I do think they were another high-effort attempt at explaining what was going on.
Overview/outline of initial comment
Okay, overall outline of thoughts on my mind here:
What actually happened in the recent set of exchanges? Did anyone break any site norms? Did anyone do things that maybe should be site norms but we hadn’t actually made it an explicit rule and we should take the opportunity to develop some case law and warn people not to do it in the future?
5 years ago, the moderation team has issued Said a mod warning about a common pattern of engagement he does that a lot of people have complained about (this was operationalized as “demanding more interpretive labor than he has given”). We said if he did it again we’d ban him for a month. My vague recollection is he basically didn’t do it for a couple years after the warning, but maybe started to somewhat over the past couple years, but I’m not sure (I think he may have not done the particular thing we asked him not to, but I’ve had a growing sense his commenting is making me more wary of how I use the site). What are my overall thoughts on that?
Various LW team members have concerns about how Duncan handles conflict. I’m a bit confused about how to think about it in this case. I think a number of other users are worried about this too. We should probably figure out how we relate to that and make it clear to everyone.
It’s Moderation Re-Evaluation Month. It’s a good time to re-evaluate our various moderation policies. This might include “how we handle conflict between established users”, as well as “are there any important updates to the Authors Can Moderate Their Posts rules/tech?
It seems worthwhile to touch on each of these at least somewhat. I’ll follow up on each topic at least somewhat.
Recap of mod team history with Said Achmiz
First, some background context. When LW2.0 was first launched, the mod team had several back-and-forths with Said over complaints about his commenting style. He was (and I think still is) the most-complained-about LW user. We considered banning him.
Ultimately we told him this:
As Eliezer is wont to say, things are often bad because the way in which they are bad is a Nash equilibrium. If I attempt to apply it here, it suggests we need both a great generative and a great evaluative process before the standards problem is solved, at the same time as the actually-having-a-community-who-likes-to-contribute-thoughtful-and-effortful-essays-about-important-topics problem is solved, and only having one solved does not solve the problem.
I, Oli and Ray will build a better evaluative process for this online community, that incentivises powerful criticism. But right now this site is trying to build a place where we can be generative (and evaluative) together in a way that’s fun and not aggressive. While we have an incentive toward better ideas (weighted karma and curation), it is far from a finished system. We have to build this part as well as the evaluative before the whole system works, and while we’ve not reached there you’re correct to be worried and want to enforce the standards yourself with low-effort comments (and I don’t mean to imply the comments don’t often contain implicit within them very good ideas).
But unfortunately, given your low-effort criticism feels so aggressive (according to me, the mods, and most writers I talk to in the rationality community), this is just going to destroy the first stage before we get the second. If you write further comments in this pattern which I have pointed to above, I will not continue to spend hours trying to pass your ITT and responding; I will just give you warnings and suspensions.
I may write another comment in this thread if there is something simple to clarify or something, but otherwise this is my last comment in this thread.
Followed by:
This was now a week ago. The mod team discussed this a bit more, and I think it’s the correct call to give Said an official warning (link) for causing a significant number of negative experiences for other authors and commenters.
Said, this moderation call is different than most others, because I think there is a place for the kind of communication culture that you’ve advocated for, but LessWrong specifically is not that place, and it’s important to be clear about what kind of culture we are aiming for. I don’t think ill of you or that you are a bad person. Quite the opposite; as I’ve said above, I deeply appreciate a lot of the things you’ve build and advice you’ve given, and this is why I’ve tried to put in a lot of effort and care with my moderation comments and decisions here. I’m afraid I also think LessWrong will overall achieve its aims better if you stop commenting in (some of) the ways you have so far.
Said, if you receive a second official warning, it will come with a 1-month suspension. This will happen if another writer has an extensive interaction with you primarily based around you asking them to do a lot of interpretive labour and not providing the same in return, as I described in my main comment in this thread.
I do have a strong sense of Said being quite law-abiding/honorable about the situation despite disagreeing with us on several object and meta-level moderation policy, which I appreciate a lot.
I do think it’s worth noting that LessWrong 2.0 feels like it’s at a more stable point than it was in 2018. There’s enough critical mass of people posting here that I’m less worried about annoying commenters killing it completely (which was a very live fear during the initial LW2.0 revival)
But I am still worried about the concerns from 5 years ago, and do basically stand by Ben’s comment. And meanwhile I still think Said’s default commenting style is much worse than nearby styles that would accomplish the upside with less downside.
My summary of previous discussions as I recall them is something like:
Mods: “Said, lots of users have complained about your conversation style, you should change it.”
Said: “I think a) your preferred conversation norms here don’t make sense to me and/or seem actively bad in many cases, and b) I think the thing my conversation style is doing is really important for being a truthtracking forum.”
[...lots of back-and-forth...]
Mods: ”...can you change your commenting style at all?”
Said: “No, but I can just stop commenting in particular ways if you give me particular rules.”
Then we did that, and it sorta worked for a while. But it hasn’t been wholly satisfying to me. (I do have some sense that Said has recently ended up commenting more in threads that are explicitly about setting norms, and while we didn’t spell this out in our initial mod warning, I do think it is extra costly to ban someone from discussions of moderation norms compared to other discussions. I’m not 100% sure how to think about this)
Death by a thousand cuts and “proportionate”(?) response
A way this all feels relevant to current disputes with Duncan is that the thing that is frustrating about Said is not any individual comment, but an overall pattern that doesn’t emerge as extremely costly until you see the whole thing. (i.e. if there’s a spectrum of how bad behavior is, from 0-10, and things that are a “3” are considered bad enough to punish, someone who’s doing things that are at a “2.5” or “2.9” level doesn’t quite feel worth reacting to. But if someone does them a lot it actually adds up to being pretty bad.)
If you point this out, people mostly shrug and move on with their day. So, to point it out in a way that people actually listen to, you have to do something that looks disproportionate if you’re just paying attention to the current situation. And, also, the people who care strongly enough to see that through tend to be in an extra-triggered/frustrated state, which means they’re not at their best when they’re doing it.
I think Duncan’s response is out of proportion to some degree (see Vaniver’s thread for some reasons why; I have some more reasons I plan to write about).
But I do think there is a correct thing that Duncan was noting/reacting to, which is that actually yeah, the current situation with Said does feel bad enough that something should change, and indeed the mods hadn’t been intervening on it because it didn’t quite feel like a priority.
I liked Vaniver’s description of Duncan’s comments/posts as making a bet that Said was in fact obviously banworthy or worthy of significant mod action, and that there was a smoking gun to that effect, and if this was true then Duncan would be largely vindicated-in-retrospect.
I’ll lay out some more thinking as to why, but, my current gut feeling + somewhat considered opinion is that “Duncan is somewhat vindicated, but not maximally, and there are some things about his approach I probably judge him for.”
Maybe explicit rules against blocking users from “norm-setting” posts.
On blocking users from commenting
I still endorse authors being able to block other users (whether for principled reasons, or just “this user is annoying”). I think a) it’s actually really important for authors that the site be fun to use, b) there’s a lot of users who are dealbreakingly annoying to some people but not others, and banning them from the whole site would be overkill, and c) authors aren’t obligated to lend their own karma/reputation to give space to other people’s content. If an author doesn’t want your comments on his post, whether for defensible reasons or not, I think it’s an okay answer that those commenters make their own post or shortform arguing the point elsewhere.
Yes, there are some trivial inconveniences to posting that criticism. I do track that in the cost. But I think that is outweighed by the effect on authors being motivated to post.
That all said...
Blocking users on “norm-setting posts”
I think it’s more worrisome to block users on posts that are making major momentum towards changing site norms/culture. I don’t think the censorship effects are that strong or distorting in most cases, but I’m most worried about censorship effects being distorting in cases that affect ongoing norms about what people can say.
There’s a blurry line here, between posts that are putting forth new social concepts, and posts advocating for applying those concepts towards norms (either in the OP or in the comments), and a further blurry line between that and posts which argue about applying that to specific people; i.e. I’d have an ascending wariness along that spectrum.
I think it was already a little sketchy that Basics of Rationalist Discourse went out of its way to call itself “The Basics” rather than “Duncan’s preferred norms” (a somewhat frame-control-y move IMO, although not necessarily unreasonably so), while also blocking Zack at the time. It feels even more sketchy to me to write Killing Socrates, which is AFAICT a thinly veiled “build-social-momentum-against-Said-in-particular” post, where Said can’t respond (and it’s disproportionately likely that Said’s allies also can’t respond).
Right now we don’t have tech to unblock users from a specific post who have been banned from all of a user’s posts. But this recent set of events has me leaning towards “build tech to do that”, and then make it a rule that posts at the threshold of “Basics” or higher (in terms of site-norm-momentum-building) need to allow everyone to comment.
I do expect that to make it less rewarding to make that sort of post. And, well, to (almost) quote Duncan:
Put another way: a frequent refrain is “well, if I have to put forth that much effort, I’ll never say anything at all,” to which the response is often [“sorry I acknowledge the cost here but I think that’s an okay tradeoff”]
Okay, but what do I do about Said when he shows up doing his whole pattern of subtly-missing-and/or-reframing-the-point-while-sprawling massive threads, in an impo
My answer is “strong downvote him, announce you’re not going to engage, maybe link to a place where you went into more detail about why if this comes up a lot, and move on with your day.” (I do generally wish Duncan did more of this and less trying to set the record straight in ways that escalate in IMO very costly ways)
Preliminary Verdict (but not “operationalization” of verdict)
tl;dr – @Duncan_Sabien and @Said Achmiz each can write up to two more comments on this post discussing what they think of this verdict, but are otherwise on a temporary ban from the site until they have negotiated with the mod team and settled on either:
credibly commit to changing their behavior in a fairly significant way,
or, accept some kind of tech solution that limits their engagement in some reliable way that doesn’t depend on their continued behavior.
or, be banned from commenting on other people’s posts (but still allowed to make new top level posts and shortforms)
(After the two comments they can continue to PM the LW team, although we’ll have some limit on how much time we’re going to spend negotiating)
Some background:
Said and Duncan are both among the most complained-about users since LW2.0 started (probably both in the top 5, possibly literally the top 2). They also both have many good qualities I’d be sad to see go.
The LessWrong team has spent hundreds of person hours thinking about how to moderate them over the years, and while I think a lot of that was worthwhile (from a perspective of “we learned new useful things about site governance”) there’s a limit to how much it’s worth moderating or mediating conflict re: two particular users.
So, something pretty significant needs to change.
A thing that sticks out in both the case of Said and Duncan is that they a) are both fairly law abiding (i.e. when the mods have asked them for concrete things, they adhere to our rules, and clearly support rule-of-law and the general principle of Well Kept Gardens), but b) both have a very strong principled sense of what a “good” LessWrong would look like and are optimizing pretty hard for that within whatever constraints we give them.
I think our default rules are chosen to be something that someone might trip accidentally, if you’re trying to mostly be a good stereotypical citizen but occasionally end up having a bad day. Said and Duncan are both trying pretty hard to be good citizens of another country, one that the LessWrong team is consciously not trying to be. It’s hard to build good rules/guidelines that actually robustly deal with that kind of optimization.
I still don’t really know what to do, but I want to flag that the goal I’ll be aiming for here is “make it such that Said and Duncan either have actively (credibly) agreed to stop optimizing in a fairly deep way, or, are somehow limited by site tech such that they can’t do the cluster of things they want to do that feels damaging to me.”
If neither of those strategies turns out to be tractable, banning is on the table (even though I think both of them contribute a lot in various ways and I’d be pretty sad to resort to that option). I have some hope tech-based solutions can work.
(This is not a claim about which of them is more valuable overall, or better/worse/right-or-wrong-in-this-particular-conflict. There’s enough history with both of them being above-a-threshold-of-worrisome that it seems like the LW team should just actually resolve the deep underlying issues, regardless of who’s more legitimately aggrieved this particular week)
Re: Said:
One of the most common complaints I’ve gotten about LessWrong, from both new users as well as established, generally highly regarded users, is “too many nitpicky comments that feel like they’re missing the point”. I think LessWrong is less fragile than it was in 2018 when I last argued extensively with Said about this, but I think it’s still an important/valid complaint.
Said seems to actively prefer a world where the people who are annoyed by him go away, and thinks it’d be fine if this meant LessWrong had radically fewer posts. I think he’s misunderstanding something about how intellectual progress actually works, and about how valuable his comments actually are. (As I said previously, I tend to think Said’s first couple comments are worthwhile. The thing that feels actually bad is getting into a protracted discussion, on a particular (albeit fuzzy) cluster of topics)
We’ve had extensive conversations with Said about changing his approach here. He seems pretty committed to not changing his approach. So, if he’s sticking around, I think we’d need some kind of tech solution. The outcome I want here is that in practice Said doesn’t bother people who don’t want to be bothered. This could involve solutions somewhat specific-to-Said, or (maybe) be a sitewide rule that works out to stop a broader class of annoying behavior. (I’m skeptical the latter will turn out to work without being net-negative, capturing too many false positives, but seems worth thinking about)
Here are a couple ideas:
Easily-triggered-rate-limiting. I could imagine an admin feature that literally just lets Said comment a few times on a post, but if he gets significantly downvoted, gives him a wordcount-based rate-limit that forces him to wrap up his current points quickly and then call it a day. I expect fine-tuning this to actually work the way I imagine in my head is a fair amount of work but not that much.
Proactive warning. If a post author has downvoted Said’s comments on their post multiple times, they get some kind of UI alert saying “Yo, FYI, admins have flagged this user as someone with a pattern of commenting that a lot of authors have found net-negative. You may want to take that into account when deciding how much to engage”.
There’s some cluster of ideas surrounding how authors are informed/encouraged to use the banning options. It sounds like the entire topic of “authors can ban users” is worth revisiting so my first impulse is to avoid investing in it further until we’ve had some more top-level discussion about the feature.
Why is it worth this effort?
You might ask “Ray, if you think Said is such a problem user, why bother investing this effort instead of just banning him?”. Here are some areas I think Said contributes in a way that seem important:
Various ops/dev work maintaining sites like readthesequences.com, greaterwrong.com, and gwern.net. (edit: as Ben Pace notes, this is pretty significant, and I agree with his note that “Said is the person independent of MIRI (including Vaniver) and Lightcone who contributes the most counterfactual bits to the sequences and LW still being alive in the world”)
Most of his comments are in fact just pretty reasonable and good in a straightforward way.
While I don’t get much value out of protracted conversations about it, I do think there’s something valuable about Said being very resistant to getting swept up in fad ideas. Sometimes the emperor in fact really does have no clothes. Sometimes the emperor has clothes, but you really haven’t spelled out your assumptions very well and are confused about how to operationalize your idea. I do think this is pretty important and would prefer Said to somehow “only do the good version of this”, but seems fine to accept it as a package-deal.
Re: Duncan
I’ve spent years trying to hash out “what exactly is the subtle but deep/huge difference between Duncan’s moderation preferences and the LW team’s.” I have found each round of that exchange valuable, but typically it didn’t turn out that whatever-we-thought-was-the-crux was a particularly Big Crux.
I think I care about each of the things Duncan is worried about (i.e. such as things listed in Basics of Rationalist Discourse). But I tend to think the way Duncan goes about trying to enforce such things is extremely costly.
Here’s this month/year’s stab at it: Duncan cares particularly about things like strawmans/mischaracterizations/outright lies getting corrected quickly (i.e. within ~24 hours). (See Concentration of Force for his writeup on at least one set of reasons this matters.) I think there is value in correcting them or telling people to “knock it off” quickly. But,
a) moderation time is limited b) even in the world where we massively invest in moderation… the thing Duncan cares most about moderating quickly just doesn’t seem like it should necessarily be at the top of the priority queue to me?
I was surprised and updated on You Don’t Exist, Duncan getting as heavily upvoted as it did, so I think it’s plausible that this is all a bigger deal than I currently think it is. (that post goes into one set of reasons that getting mischaracterized hurts). And there are some other reasons this might be important (that have to do with mischaracterizations taking off and becoming the de-facto accepted narrative).
I do expect most of our best authors to agree with Duncan that these things matter, and generally want the site to be moderated more heavily somehow. But I haven’t actually seen anyone but Duncan argue they should be prioritized nearly as heavily as he wants. (i.e. rather than something you just mostly take-in-stride, downvote and then try to ignore, focusing on other things)
I think most high-contributing users agree the site should be moderated more (see the significant upvotes on LW Team is adjusting moderation policy), but don’t necessarily agree on how. It’d be cruxy for me if more high-contributing-users actively supported the sort of moderation regime Duncan-in-particular seems to want.
I don’t know if that really captured the main thing here. I feel less resolved on what should change on LessWrong re: Duncan. But I (and other LW site moderators) want to be clear that while strawmanning is bad and you shouldn’t do it, we don’t expect to intervene on most individual cases. I recommend strong downvoting, and leaving one comment stating the thing seems false.
I continue to think it’s fine for Duncan to moderate his own posts however he wants (although as noted previously I think an exception should be made for posts that are actively pushing sitewide moderation norms).
Some goals I’d have are:
people on LessWrong feel safe that they aren’t likely to get into sudden, protracted conflict with Duncan that persists outside his own posts.
the LessWrong team and Duncan are on-the-same-page about the LW team not being willing to allocate dozens of hours of attention at a moment’s notice in the specific ways Duncan wants. I don’t think it’s accurate to say “there’s no lifeguard on duty”, but I think it’s quite accurate to say that the lifeguard on duty isn’t planning to prioritize the things Duncan wants, so Duncan should basically participate on LessWrong as if there is, in effect, “no lifeguard” from his perspective. I’m spending ~40 hours this week processing this situation with a goal of basically not having to do that again.
In the past Duncan took down all his LW posts when LW seemed to be actively hurting him. I’ve asked him about this in the past year, and (I think?) he said he was confident that he wouldn’t. One thing I’d want going forward is a more public comment that, if he’s going to keep posting on LessWrong, he’s not going to do that again. (I don’t mind him taking down 1-2 problem posts that led to really frustrating commenting experiences for him, but if he were likely to take all the posts down that undercuts much of the value of having him here contributing)
FWIW I do think it’s moderately likely that the LW team writes a post taking many concepts from Basics of Rationalist Discourse and integrating them into our overall moderation policy. (It’s maybe doable for Duncan to rewrite the parts that some people object to, and to enable commenting on those posts by everyone, but I think it’s kinda reasonable for people to feel uncomfortable with Duncan setting the framing, and it’s worth the LW team having a dedicated “our frame on what the site norms are” anyway)
In general I think Duncan has written a lot of great posts – many of his posts have been highly ranked in the LessWrong review. I expect him to continue to provide a lot of value to the LessWrong ecosystem one way or another.
I’ll note that while I have talked to Duncan for dozens(?) of hours trying to hash out various deep issues and not met much success, I haven’t really tried negotiating with him specifically about how he relates to LessWrong. I am fairly hopeful we can work something out here.
Why spend so much time engaging with a single commenter? Well, the answer is that I do think the specific way Said has been commenting on the site had a non-trivial chance of basically just killing the site, in the sense of good conversation and intellectual progress basically ceasing, if not pushed back on and the collateral damage limited by moderator action.
Said has been by far the most complained-about user on the site, with many top authors citing him as a top reason for why they do not want to post on the site, or comment here. I personally (and the LessWrong team more broadly) would also have had little interest in further investing in LessWrong if the kind of culture that Said brings had taken hold here.
So the stakes have been high, and the alternative would have been banning, which I think also in itself requires at least many dozens of hours of effort, and which, given that Said is a really valuable contributor via projects like greaterwrong.com and readthesequences.com, was a choice I felt appropriately hesitant about.
Now, it might seem implausible that a single commenter could derail a whole comment thread this way. However, I claim this is indeed the case. As long as you can make comments that do not get reliably downvoted, you can probably successfully cause a whole comment thread to focus almost exclusively on the concern you care about. This is the result of a few different dynamics:
Commenters on LW do broadly assume that they shouldn’t ask a question or write a critique that someone else has already asked, or has indirectly already been answered in a different comment (this is good, this is what makes comment sections on LW much more coherent than e.g. on HN)
LW is high enough trust that I think authors generally assume that an upvoted comment is made in good standing and as such deserves some kind of response. This is good, since it allows people to coordinate reasonably well on requesting justification or clarification. However, voting is really quite low-fidelity, it’s not that hard to write comments that will not get downvoted, and there is little consistent reputational tracking on LessWrong across many engagements, meaning it’s quite hard for anyone to lose good standing.
Ultimately I think it’s the job of the moderators to remove or at least mark commenters who have lost good standing of this kind, given the current background of social dynamics (or alternatively to rejigger the culture and incentives to remove this exploit of authors thinking non-downvoted comments are made in good standing)
Occupy Wall Street strikes me as another instance of the same kind of popular sneer culture. Occupy Wall Street had no coherent asks, no worldview that was driving their actions. Everyone participating in the movement seemed to have a different agenda of what they wanted to get out of it. The thing that united them was a shared dislike of something in the vague vicinity of capitalism, or government, or the man, not anything that could be used as the basis for any actual shared creations or efforts.
To be clear, I think it’s fine and good for people to congratulate each other on getting new jobs. It’s a big life change. But of course if your discourse platform approximately doesn’t allow anything else, as I expand on in the rest of this section, then you run into problems.
I am here trying to give a high-level gloss that tries to elucidate the central problems. Of course many individual conversations diverge, and there are variations on this, often in positive ways, but I would argue the overall tendency is strong and clear.
I switched to a different comment thread here, as a different thread made it easier to see the dynamics at hand. The Benquo thread also went meta, and you can read it here, but it seemed a bit harder to follow without reading a huge amount of additional context, and was a less clear example of the pattern I am trying to highlight.
See e.g. this comment thread with IMO a usually pretty good commenter who kept extremely strongly insisting that any analysis or evaluation of IMO clearly present subtext is fabricated or imagined.
I am not making a strong claim here that direct aggression is much worse or much better than passive aggression, I feel kind of confused about it, but I am saying that independently of that, there is one argument that passive/hidden aggression requires harsher punishment when prosecution does succeed.
A bit unclear what exactly happened, you can read the thread yourself. Mostly we argued for a long time about what kind of place LessWrong should be and how authors should relate to criticism until we gave an official warning, and then nothing about Said’s behavior changed in the future, but we didn’t have the energy to prosecute the subtext another time.
Buuuut what’s going on here is that—and this is imo unfortunate—the website you guys have built is such that posting or commenting on it provides me with a fairly low amount of value
This is something I really do find disappointing, but it is what it is (for now? things change, of course)
So again it’s not that I disagree with you about anything you’ve said
But the sort of care / attention / effort w.r.t. tone and wording and tact and so on, that you’re asking, raises the cost of participation for me above the benefit
(Another aspect of this is that if I have to NOT say what I actually think, even on e.g. the CFAR thing w.r.t. Double Crux, well, again, what then is the point)
(I can say things I don’t really believe anywhere)
[...]
If the takeaway here is that I have to learn things or change my behavior, well—I’m not averse in principle to doing that ever under any circumstances, but it has to be worth my while, if you see what I mean
Currently it is not
I hope to see that change, of course!
The specific moderation message we sent at the end of that exchange was:
The mod team has talked about it, and we’re going to insist you comment with the same level of tact you showed while talking with me. If that makes it not worth your while to comment on the new LW that’s regrettable and we hope someday the quality makes it worth your while to come back on these terms, but we understand and there are no hard feelings.
To be clear, as I point out in the earlier sections of this post, I think there are ways of doing this that would be good for the site, and functions that Said performed that are good, but I would be quite concerned about people doing a cargo-culting thing here.
And to be clear, this is all a pretty tricky topic. It is not rare for whole social groups, including the rationality community, to pretend to misunderstand something. As the moderators it’s part of our job to take into account whether the thing that is going on here is some social immune reaction that is exaggerating their misunderstandings, or maybe even genuinely preventing any real understanding from forming at all, and to adjust accordingly. This is hard.
Banning Said Achmiz (and broader thoughts on moderation)
Since the sneer attractor is one of the biggest and most destructive attractors of the modern internet, I worry about LessWrong also being affected by its dynamics. I think we are unlikely to ever become remotely as bad as SneerClub, or most of Reddit or Twitter, but I do see similar cultural dynamics rear their head on LessWrong. If these dynamics were to get worse, the thing I would mostly expect to see is the site quietly dying; fewer people venturing new/generative content, fewer people checking the site in hopes of such content, and an eventual ghost town of boring complaints about the surrounding scene, with links.
But before I go into that, let’s discuss the sneer attractor’s mirror image:
The LinkedIn attractor
In the other corners of the internet and the rest of the world, especially in the land of professional communities, we have what I will call the “LinkedIn attractor”. In those communities saying anything bad about another community member is frowned upon. Disputes are supposed to be kept private. When someone intends to run an RCT on which doctors in your hospital are most effective, you band together and refuse to participate, because establishing performance metrics would hurt the unity of your community.
Since anything but abstract approval is risky in those communities, a typical post in those communities looks like this sort of largely vacuous engagement[4]:
And at the norms level like this (cribbed from an interesting case study of the “Obama Campaign Alumni” Facebook group descending into this attractor):
This cultural attractor is not mutually exclusive with the sneer attractor. Indeed, the LinkedIn attractor appears to be the memetically most successful way groups relate to their ingroup members, while the sneer attractor governs how they relate to their outgroups.
The dynamics behind the LinkedIn attractor seem mechanistically straightforward. I think of them as “mutual reputation protection alliances”.
In almost every professional context I’ve been in, these alliances manifest as a constant stream of agreements—”I say good things about you, if you say good things about me.”
This makes sense. It’s unlikely I would benefit from people spreading negative information about you, and we would both clearly benefit from protecting each other’s reputation. So a natural equilibrium emerges where people gather many of these mutual reputation protection alliances, ultimately creating groups with strong commitments to protect each other’s reputation and the group’s reputation in the eyes of the rest of the world.
Of course, the people trying to use reputation to navigate the world — to identify who is trustworthy — are much more diffuse and aren’t party to these negotiations. But their interests weigh heavily enough that some ecosystem pressure exists for antibodies to mutual reputation protection alliances to develop (such as rating systems for Uber drivers, as opposed to taxi cartels where feedback to individuals is nearly impossible to aggregate).
How this relates to LessWrong
Ok, now why am I going on this digression about The Sneer Attractor and the LinkedIn Attractor? The reason is that I think much of the heatedness of moderation discussions related to Said, and people’s fears, has been routing through people being worried that LessWrong will end up in either the Sneer Attractor or the LinkedIn Attractor.
As I’ve talked with many people with opinions on Said’s comments on the site, a recurring theme has been that Said is what prevents LessWrong from falling into the LinkedIn attractor. Said, in many people’s minds, is the bearer of a flag like this:
And I really care about this flag too. Indeed, many of the decisions I have made around LessWrong have been made to foster a culture that understands and rallies behind this flag. LessWrong is not LinkedIn, and LessWrong is not the EA Forum, and that is good and important.
And I do think Said provides a shield. Having Said comment on LessWrong posts, and having those comments be upvoted, helps against sliding down the attractor towards LinkedIn.
But on the other hand, I notice that in myself, a lot of what I am most worried about is the Sneer Attractor. For LessWrong to become a place that can’t do much but tear things down. Where criticism is vague and high-level and relies on conflationary alliances to get traction, but does not ultimately strengthen what it criticizes or who reads the criticism. Filled with comments that aim to make the readers and the voters feel superior to all those fools who keep saying wrong things, despite not equipping readers to say any less wrong things themselves.
And I do think Said moves LessWrong substantially towards that path. When Said is at his worst, he writes comments like this:
This, to be clear, is still better than the SneerClub comment visible above. For example when asked to clarify, Said obliges:
But the overall effect on the culture is still there, and the thread still results in Benquo eventually disengaging in frustration, intending to switch his moderation guidelines to “Reign of Terror” and to delete any future similar comment threads, as Said (as far as I can tell) refuses to do much cognitive labor in the rest of the thread until Benquo runs out of energy.
So, to get this conversation started[5] and to maybe give people a bit more trust that I am tracking some things they care about: “Yes, a lot of the world is broken and stuck in an equilibrium of people trying to punish others for saying anything that might reflect badly on anyone else, in endless cycles of mutual reputation protection that make it hard to know what is fake and what is real, and yes, LessWrong, as most things on the internet, is at risk of falling into that attractor. I am tracking this, I care a lot about it, and even knowing that, I think it’s the right call to ban Said.”
Now that this is out of the way, I think we can talk in more mechanistic terms about what is going wrong in comment threads involving Said, and maybe even learn some things about online moderation.
Weaponized obtuseness and asymmetric effort ratios
An excerpt from a recent Benquo post on the Said moderation decision:
I concur with much of this. To get more concrete, in my experience a breakdown of the core dynamics that make comment threads with Said rarely worth it looks something like this:[6]
Said will write a top-level comment that will read like an implicit claim that you have violated some social norm in the things you have written, for which you deserve to be punished (though this will not be said explicitly), or, if not that, will make you look negligent if you fail to answer an innocuous, open-seeming question
You will try to address this claim by writing some kind of long response or explanation, answering his question or providing justification on some point
Said will dismiss your response as being totally insufficient, confused, or proving the very point he was trying to make
You will try to clarify more, while Said will continue to make insinuations that your failure to respond properly validates whatever judgment he is invoking
Motivated commenters/authors will go up a level and ask “by what standard are you trying to invoke a negative judgment here?”[7]
Said will deny any and all such invocation of standards or judgment, saying he is (paraphrased) “purely talking on the object level and not trying to make any implicit claims of judgment or low status or anything of that kind”
After all of this you are left questioning your own sanity, try a bit to respond more on the object-level, and ultimately give up feeling dejected and like a lot of people on LessWrong hate you. You probably don’t post again.
In the more extreme cases, someone will try to prosecute this behavior and reach out to the moderators, or make a top-level post or quick take about it. Whoever does this quickly finds out that the moderators feel approximately as powerless to stop this cycle as they are. This leaves you even more dejected.
With non-trivial probability your post or comment ends up hosting a 100+ comment thread with detailed discussion of Said’s behavior and moderation norms and whether it’s ever OK to ban anyone, in which voting and commenting is largely dominated by the few people who care much more than average about banning and censorship. You feel an additional pang of guilt and concern about how many people you might have upset with your actions, and how much time you might have wasted.
Now, I think it is worth asking what the actual issue with the comments above is. Why do they produce this kind of escalation?
Asymmetric effort ratios and isolated demands for rigor
A key dynamic in many threads with Said is that the critic has a pretty easy job at each step. First of all, they have little to lose. They need to make no positive statements and explain no confusing phenomena. All they need to do is to ask questions, or complain about the imprecision of some definition. If the author can answer compellingly, well, then they can take credit for how they helped elicit an explanation to a confusion that clearly many people must have had. And if the author cannot answer compellingly, then even better, then the critic has properly identified and prosecuted bad behavior and excised the bad ideas that otherwise would have polluted the commons. At the end of the day, are you really going to fault someone for just asking questions? What kind of totalitarian state are you trying to create here?
The critic can disengage at any point. No one faults a commenter for suddenly disappearing or not giving clear feedback on whether a response satisfied them. The author, on the other hand, does usually feel responsible for reacting and responding to any critique made of his ideas, which he dared to put so boldly and loudly in front of the public eye.
My best guess is that the usual ratio of “time it takes to write a critical comment” to “time it takes to respond to it to a level that will broadly be accepted well” is about 5x. This isn’t in itself a problem in an environment with lots of mutual trust and trade, but in an adversarial context it means that it’s easily possible to run a DDOS attack on basically any author whose contributions you do not like by just asking lots of questions, insinuating holes or potential missing considerations, and demanding a response, approximately independently of the quality of their writing.
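As a toy illustration of how that ratio compounds over a thread (the numbers below are arbitrary, and the 5x figure above is my rough guess rather than anything measured):

```python
# Toy model of the asymmetric effort ratio described above.
critic_minutes_per_comment = 10   # a quick question or complaint about a definition
effort_ratio = 5                  # rough guess: a well-received response takes ~5x as long
exchanges = 6                     # back-and-forths before the thread peters out

critic_cost = critic_minutes_per_comment * exchanges   # 60 minutes
author_cost = critic_cost * effort_ratio                # 300 minutes

print(f"critic: {critic_cost / 60:.1f} hours, author: {author_cost / 60:.1f} hours")
```

One hour of low-effort questioning translating into something like five hours of author time, per thread, is what makes the pattern scale like a denial-of-service attack rather than like ordinary criticism.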
For related musings see the Scott Alexander classic Beware Isolated Demands For Rigor.
Maintaining strategic ambiguity about any such dynamics
Of course, commenters on LessWrong are not dumb, and have read Scott Alexander, and have vastly more patience than most commenters on the internet, and so many of them will choose to dissect and understand what is going on.
The key mechanism that shields Said from the accountability that would accompany such analysis is his care to avoid making explicit claims about the need for the author to respond, or the implicit judgment associated with his comments. In any given comment thread, each question is phrased as to be ambiguous between a question driven by curiosity, and a question intended to expose the author’s hypocrisy.
This ambiguity is not without healthy precedent. I have seen healthy math departments and science departments in which a prodding question might be phrased quite politely, and phrased ambiguously between a matter of personal confusion and the intention of pointing out a flaw in a proof.
That kind of question is common. And I think it’s overall fine, appropriate, healthy.
That said, most of the time, when I was in those environments, I could tell what was going on, and I mostly knew that other people could tell as well. If someone repeatedly asked questions in a way that did clearly indicate an understanding of a flaw in the provided proofs or arguments, but kept insisting on only getting there via Socratic questioning, they would lose points over time. And if they kept asking probing questions in each seminar that were easily answered, with each question taking up space and bandwidth, then they would quickly lose lots of points and be asked to please interrupt less. And furthermore, the tone of voice would often make it clear whether the question asked was more on the genuine-curiosity side, or the suggested-criticism side.
But here on the internet, in the reaches of an online forum, with...
people coming in and out,
the popularity of topics waxing and waning with the fashions of the internet,
few common-knowledge creating events,
the bandwidth limited to text,
voting in most niche threads dominated by a very small subset of people who you can’t see but shape what gets approval nevertheless,
and branching comment threads with no natural limitation on the bandwidth or volume of requests that can be made of an author,
...the mechanisms that productively channeled this behavior no longer work.
And so here, where you can endlessly deflect any accusations with little risk that common-knowledge can be built that you are wasting time, or making repeated bids for social censure that fail to get accepted, by just falling back on saying “look, all I am doing is asking questions and asking for clarification, clearly that is not a crime”, there is no natural limit to how much heckling you can do. You alone could be the death of a whole subculture, if you are just persistent enough.
And so, at the heart of all of this, is either a deep obliviousness, or more likely, the strategic disarmament of opposition by denying load-bearing subtext (or anything else that might obviously allow prosecution) in these interactions.
And unfortunately, this does succeed. My guess is that it succeeds here on LessWrong better than in most places, because we have a shared belief in the power of explicit reasoning, and we have learned an appropriate fear of focusing on subtext, within a culture where debate is supposed to focus on the object-level claims, not whatever status dynamics are going on, and I think this is good and healthy most of the time. But the purpose of those norms is not to completely eschew analysis and evaluation of the underlying status dynamics, but simply to separate them from the object-level claims (I also think it’s good to have some norms against focusing too much on status dynamics and claims in total, so I think a lot of the generic hesitation is justified, which I elaborate on a bit later).
But I think in doing so, part by selection, part by training, we have created an environment where trying to police the subtext and status-dynamics surrounding conversations gets met with fear and counter-reactions that make moderation and steering very difficult.[8]
Crimes that are harder to catch should be more harshly punished
A small sidenote on a dynamic relevant to how I am thinking about policing in these cases:
A classical example of microeconomics-informed reasoning about criminal justice is the following snippet of logic.
If someone can gain in-expectation X dollars by committing some crime (which has negative externalities of Y > X dollars), with a probability p of getting caught, then in order to successfully prevent people from committing the crime you need to make the cost of receiving the punishment (Z) large enough that the expected punishment exceeds the expected gain, i.e. p·Z > X, or equivalently Z > X/p.
Or in less mathy terms, the more likely it is that someone can get away with committing a crime, the harsher the punishment needs to be for that crime.
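To spell out the arithmetic with made-up numbers (purely illustrative, not an estimate of anything on LessWrong):

```python
# Deterrence requires the expected punishment p * Z to exceed the expected gain X.
X = 100   # expected gain from the violation, in arbitrary units
p = 0.1   # probability that any given violation is caught and successfully prosecuted

break_even_punishment = X / p   # p * Z equals X at this value; deterrence needs Z above it
print(break_even_punishment)    # 1000.0 -- ten times the gain, since 9 in 10 violations go unpunished
```

The tenfold multiplier falls directly out of the one-in-ten catch rate, which is the point of this section’s heading: the harder a behavior is to catch and prosecute, the harsher the response has to be on the occasions when prosecution succeeds.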
In this case, a core component of the pattern of plausibly deniable aggression that I think is present in much of Said’s writing is that it is very hard to catch someone doing it, and even harder to prosecute it successfully in the eyes of a skeptical audience. As such, in order to maintain a functional incentive landscape the punishment for being caught in passive or ambiguous aggression needs to be substantially larger than for e.g. direct aggression, as even though being straightforwardly aggressive has in some sense worse effects on culture and norms (though also less bad effects in some other ways)[9], the probability of catching someone in ambiguous aggression is much lower.
Concentration of force and the trouble with anonymous voting
An under-discussed aspect of LessWrong is how voting affects culture, author expectations and conversational dynamics. Voting is anonymous, even to admins (the only cases where we look at votes is when we are investigating mass-downvoting or sockpuppetting or other kinds of extreme voting abuse). Now, does that mean that everyone is free to vote however they want?
The answer is a straightforward “no”. Ultimately, voting is a form of participation on the site that can be done well and badly, and while I think it’s good for that participation to be anonymous and generally shielded from retaliation, at a broader level, it is a job of the moderators to pay attention to unhealthy vote dynamics. We cannot police what you do with your votes, but if you do abuse your votes, you will make the site worse, and we might end up having to change the voting system towards something less expressive but more robust as a result.
These are general issues, but how do they relate to this whole Said banning thing?
Well, another important dimension of how the dynamics in these threads go is roughly the following:
Said makes a top-level comment asking a question or making some kind of relatively low-effort critique
This will most of the time get upvoted, because questions or top-level critiques rarely get downvoted
The discussion continues for 3-4 replies, and with each reply the number of users voting goes down
By the time the discussion is ~3 levels deep, basically all voting is done by two groups of people: LessWrong moderators and a very small set of LessWrong users who are extremely active and tend to take a strong interest in Said’s comments
If the LessWrong moderators do not vote, the author is now in a position where any further replies by Said get upvoted and their responses get reliably downvoted. If they do vote, the variance in voting in the thread quickly goes through the roof, resulting in lots of people with strongly upvoted or strongly downvoted comments, since everyone involved has very high vote-strength.
This is bad. The point of voting is to give an easy way of aggregating information about the quality and reception of content. When voting ends up dominated by a small interest[10] group without broader site buy-in, and with no one being able to tell that is what’s going on, it fails at that goal. And in this case, it’s distorting people’s perception about the site consensus in particularly high-stakes contexts where authors are trying to assess what people on the site think about their content, and about the norms of posting on LessWrong.
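As a toy simulation of the variance point (the voter counts and vote strengths below are made up for illustration, not actual site data):

```python
import random
import statistics

random.seed(0)

def comment_score(num_voters, vote_strength, p_upvote=0.5):
    """Score of one comment: each voter up- or downvotes with the given strength."""
    return sum(vote_strength if random.random() < p_upvote else -vote_strength
               for _ in range(num_voters))

# Top-level comment: a broad crowd of casual voters with ordinary vote strength.
top_level = [comment_score(num_voters=40, vote_strength=1) for _ in range(2000)]

# Three-levels-deep reply: a handful of highly invested users with strong votes.
deep_reply = [comment_score(num_voters=4, vote_strength=8) for _ in range(2000)]

print(round(statistics.pstdev(top_level), 1))   # ~6: scores cluster near zero
print(round(statistics.pstdev(deep_reply), 1))  # ~16: far fewer voters, much bigger swings
```

The same 50/50 split of opinion produces far larger score swings when only a few high-vote-strength users are voting, which is roughly the distortion described above.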
I don’t really know what to do about this. It’s one of the things that makes me more interested in bans than other things, since site-wide banning also comes with removal of vote-privileges, though of course we could also rate-limit votes or find some other workaround that achieves the same aim. I also am not confident this is really what’s going on as I do not look at vote data, and nothing here alone would make me confident I would want to ban anyone, but I think the voting has been particularly whack in a lot of these threads, and that seemed important to call out.
But why ban someone, can’t people just ignore Said?
I hope the dynamics I outlined help explain why ignoring Said is not usually a socially viable option. Indeed, Said himself does not think it’s a valid option:
Now, in the comment thread in which the comment above was made, both mods and authors have clarified that no, authors do not have an obligation to respond with remotely the generality outlined here, and the philosophy of discourse Said outlines is absolutely not site consensus. However, this does little for most authors. Most people have never read the thread where those clarifications were made, and never will. And even if we made a top-level post, or added a clarification to our new user guidelines, this would do little to change what is going on.
Because every time Said leaves a top-level comment, it is clear to most authors and readers that he is implying the presence of a social obligation to respond. And because of the dynamics I elaborated on in the previous section, it is not feasible for moderators or other users to point out the underlying dynamic each time, which itself requires careful compiling of evidence and pointing out (and then probably disputing) subtext.
So, despite it being close to site-consensus that authors do not face obligations to respond to each and every one of Said’s questions, on any given post, there is basically nothing to be done to build common knowledge of this. Said can simply make another comment thread implying that if someone doesn’t respond they deserve to be judged negatively, and there will always be enough people voting who have not seen this pattern play out, or who even support Said’s relationship to author obligation against the broad site consensus, to make the questions and critiques get upvoted enough. And so despite their snark and judgment they will appear to be made in good standing, and so deserve to be responded to, and the cycle will begin anew.
Now in order to fix this dynamic, the moderation team has made multiple direct moderation requests of Said, summarized here as a high-level narrative (though in reality it played out over more like a decade):
First we asked Said to please stop implying such obligations to authors, which Said rejected both as a description of what he was doing and as a coherent ask[12]
So eventually we encouraged authors to self-moderate the comments on their posts more[13], under supervision of LessWrong moderators, allowing at least individual authors to stop Said from commenting if they didn’t want to engage with him
To which Said responded by claiming that anyone who dared to use moderation tools on their posts could only possibly do so for bad-faith reasons and should face social punishment for doing so
To which we responded by telling Said to please let authors moderate as they desire and to not do that again, and gave him a 3 month rate-limit
After the rate-limit he seemed to behave better for a few months, until he did the exact things we rate-limited him for again, again calling for authors to face censure because they used their moderation tools
So ultimately, what other option do we have but a ban? We could attach a permanent mark to Said’s profile with a link to a warning that this user has a long history of asking heckling questions and implying social punishment without much buy-in for that, but that seems to me to create more of an environment of ongoing hostility, and many would interpret such a mark as cruel to have imposed.
So no, I think banning is the best option available.
Ok, but shouldn’t there be some kind of justice process?
I expect it to be uncontroversial to suggest that most moderation on LessWrong should be soft-touch. The default good outcome of a moderator showing up on a post is to leave a comment warning of some bad conversational pattern, or telling one user to change their behavior in some relatively specific way. The involved users take the advice, the thread gets better or ends, and everyone moves on with their day.
Ideally this process starts and completes within 10 minutes, from the moderator noticing something going off the rails to sending off the comment providing either advice or a warning.
However, sometimes these interactions escalate, or moderators notice a more systematic pattern with specific users causing repeated problems, and especially if someone disagrees with the advice or recommendation of a moderator, it’s less clear how things are supposed to proceed. Historically the moderation standard for LessWrong has been unilateral dictatorship awarded to a head admin. But even with that dictatorship being granted to me, it is still up to me to decide how I want myself and the LessWrong team to handle these kinds of cases.
At a high-level I think there are two reasonable clusters of approaches here:
High-stakes decisions get made by some kind of court that tries to recruit impartial judges to a dispute. The judges get appointed by the head moderator, but have a tenure and substantial independent power.
High-stakes decisions get made directly by the acting head-moderator, who takes personal responsibility for all decisions on the site
(In either case it would make sense to try to generalize any judgments or decisions made into meaningful case-law to serve as the basis for future similar decisions, and to be added to some easily searchable set of past judgments that users and authors can use to gain more transparency into the principles behind moderation decisions, and predict how future decisions are likely to be made.)
Both of these approaches are fairly common in general society. Companies generally have a CEO who can unilaterally make firing and rule-setting decisions. Older institutions and governmental bodies often tend to have courts or committees. I considered for quite a while whether as part of this moderation decision I should find and recruit a set of more impartial judges to make high-stakes decisions like this.
But after thinking about it for a while, I decided against it. There are a few reasons:
Judging cases like this is hard and takes a lot of work. It also benefits from having spent lots of time thinking about moderation and incentives and institutional design. Recruiting someone good for this role would require paying them a bunch for their high opportunity cost, and is also just kind of objectively hard.
Incentivizing strong judge performance seems difficult. The default outcome I’ve seen from committees and panels is that everyone on them half-asses their job, because they rarely have a stake in the outcome being good. Even if someone cares about LessWrong, that is not the same as being generally held personally responsible for it, and having your salary and broader reputation depend on how well LessWrong is going.
I think there is a broader pattern in society of people heavily optimizing for “defensibility”, and this mostly makes things worse. I think most of the reason for the popularity of committees and panels and boards deciding high-stakes matters is that this makes the decision harder to attack externally, by creating a veneer of process. Blankfacedness as Scott Aaronson described it is also a substantial part of this. I do not want LessWrong to become a blankfaced bureaucracy that hands down judgment from on high. Whenever it’s possible I would like people to be able to talk to someone who is directly responsible for the outcome of the decision, who could change their mind in that very conversation if presented with compelling evidence.
Considerations like these are what convinced me that even high-stakes decisions like this should be made on my own personal conscience. I have a stake in LessWrong going well, I can take the time to give these kinds of decisions the resources they deserve to get made well, and I can be available for people to complain to and push back on if they disagree with those decisions.
But in doing so, I do want to do a bunch of stuff that gets us the good parts of the more judicially oriented process. Here are some things that I think make sense, even in a process oriented around personal responsibility:
I think it’s good to publish countervailing opinions when possible. I frequently expect people on the Lightcone team to disagree with decisions I make, and when that happens, I will encourage them to write up their perspective, which will serve as a record that makes it easier to spot broader blind spots in my decision-making (and also reduces gaslighting dynamics where people feel forced to support decisions I make out of fear of being retaliated against).
I think it’s good to have a cool-off period before making high-stakes moderation decisions. I expect to basically never make a ban decision on any contributor with a substantial comment history without taking at least a few days to step away from any active discussion to think about it. This doesn’t mean that a ban decision might not be preceded by a heated discussion (I think it’s usually good to give whoever is being judged the chance to defend themselves, and those discussions might very well turn out heated), but the basic shape of the decision will be formed away from any specific heated thread.
I want to make it easier to find past moderation decisions the moderators made and for people to form a sense of the case-law of the site. As part of me working on this post I’ve started on a redesign of the /moderation page to make it substantially easier to see a history of moderation comments, as well as to include an overview over the basic principles of how moderation is done on the site.
I think I should very rarely[14] prevent whoever is affected by a moderation decision from defending themselves on LessWrong. For practical reasons, I need to limit the time I personally spend replying and engaging with their defense, but I think it makes sense to allow the opposing side in basically all cases like this to publish a defense on the site, and to do at least some signal-boosting of it together with any moderation decisions.
So what options do I have if I disagree with this decision?
Well, the first option you always have, and which is the foundation of why I feel comfortable governing LessWrong with relatively few checks and balances, is to just not use LessWrong. Not using LessWrong probably isn’t that big of a deal for you. There are many other places on the internet to read interesting ideas, to discuss with others, to participate in a community. I think LessWrong is worth a lot to a lot of people, but I think ultimately, things will be fine if you don’t come here.
Now, I do recommend that if you stop using the site, you do so by loudly giving up, not quietly fading. Leave a comment or make a top-level post saying you are leaving. I care about knowing about it, and it might help other people understand the state of social legitimacy LessWrong has in the broader world and within the extended rationality/AI-Safety community.
Of course, not all things are so bad as to make it the right choice to stop using LessWrong altogether. You can complain to the mods on Intercom, or make a shortform, or make a post about how you disagree with some decision we made. I will read them, and there is a decent chance we will respond or try to clarify more or argue with you more, though we can’t guarantee this. I also highly doubt you will end up coming away thinking that we are right on all fronts, and I don’t think you should use that as a requirement for thinking LessWrong is good for the world.
And if the stakes are even higher, you can ultimately try to get me fired from this job. The exact social process for who can fire me is not as clear to me as I would like, but you can convince Eliezer to give head-moderatorship to someone else, or convince the board of Lightcone Infrastructure to replace me as CEO, if you really desperately want LessWrong to be different than it is.
But beyond that, there is no higher appeals process. At some point I will declare that the decision is made, and stands, and I don’t have time to argue it further, and this is where I stand on the decision this post is about.
An overview of past moderation discussions surrounding Said
I have tried to make this post relatively self-contained and straightforward to read, trying to avoid making you, the reader, feel like you have to wade through 100,000+ words of previous comment threads to have any idea what is going on[15], at least from my perspective. However, for the sake of completeness, and because I do think it provides useful context for the people who want to really dive into this kind of decision, here is a quick overview of past moderation discussions and decisions related to Said:
8 years ago, Elizabeth wrote a moderation warning to Said, and there was some followup discussion with him on Intercom. With habryka’s buy-in, Elizabeth said roughly “if you don’t change your behavior in some way you’ll be banned”, and Said said roughly “it’s not worth it for me to change my behavior, I would rather not participate on LW at all in that case”[16]. He did not change his behavior, we did not end up banning him at this time, and he also did not stop participating on LW.
7 years ago Ben Pace wrote a long moderation warning to Said, in a thread (with ~32,000 words in it) that expanded under Said’s comments on a post.
6 years ago on New Year’s Eve/New Year’s Day, in a thread (with ~35,000 words in it), I wrote ~40 comments arguing with Said that his commenting style was corrosive to a good commenting culture on LessWrong.
2 years ago there was a conflict between Said Achmiz and Duncan Sabien that spanned many posts and comment threads and led Raymond Arnold to write a moderation post about it that received 560 (!) comments, or ~64,000 words. Ray took the mod action of giving Said a rate-limit of 3-comments-per-post-per-week, for a duration of 3 months.
Last month, a user banned Said from commenting on his posts. Said took to his own shortform, where (amongst other things) he and others called that author a coward for banning him. Then Wei Dai commented on a 2018 post (the one announcing that authors over a certain karma threshold have the ability to ban users from their posts) to argue against this being allowed; under that comment a large thread of ~61,000 words developed, again defending Said against being banned from that user’s posts.
The most substantial of these is Ray’s moderation judgement from two years ago. I would recommend the average reader not read it all, but it is the result of another 100+ hour effort, and so does contain a bunch of explanation and context. You can read through the comments Ray made in the appendix to this post.
What does this mean for the rest of us?
My current best guess is that not that much has to change. My sense is Said has been a commenter with uniquely bad effects on the site, and while there are people who are making mistakes along similar lines, there are very few who are as prolific or have invested as much into the site. I think the most likely way I can imagine the considerations in this post resulting in more than just banning Said is if someone decides to intentionally pick up the mantle of Said Achmiz in order to fill the role that they perceive he filled on the site, and imitate his behavior in ways that recreate the dynamics I’ve pointed out.[17]
There are a few users about whom I have concerns similar to those I had about Said, and I do want this post to save me effort in future moderation disputes. I also expect to refer back to the ideas in this post for many years in various moderation discussions and moderation judgments, but I don’t have any immediate instances of that in mind.
I do think it makes sense to try to squeeze some guidance for future moderation decisions out of this. So, in case-law fashion, here are some concrete guidelines derived from this case:
You are at least somewhat responsible for the subtext other people read into your comments; you can’t disclaim all responsibility for that
Sometimes things we write get read by other people to say things we didn’t mean. Sometimes we write things that we hope other people will pick up on, but that we don’t want to say straight out. Sometimes we have picked up patterns of speech or metaphors that we have observed “working”, but that actually don’t work the way we think they do (like defensiveness in response to negative feedback resulting in less negative feedback, which one might naively interpret as being assessed less negatively).
On LessWrong, it is okay if an occasional stray reader misreads your comments. It is even okay if you write a comment that most of the broad internet would predictably misunderstand, or view as some kind of gaffe or affront. LessWrong has its own communication culture.
But if a substantial fraction of other commenters consistently interpret your comments to mean something different than what you claim they say when asked for clarification, especially if they do so in contexts where that misinterpretation happens to benefit you in conflicts you are involved in, then that is a thing you are at least partially responsible for.[18]
This all also intersects with “decoupling vs. contextualizing” norms. A key feature of LessWrong is that people here tend to be happy to engage with any specific object-level claim, largely independently of what the truth of that claim might imply at a status or reputation or blame level about the rest of the world. This, if you treat it as a single dimension, puts LessWrong pretty far into having “decoupling” norms. I think this is good and important and a crucial component of how LessWrong has maintained its ability to develop important ideas, and help people orient to the world.
This intersection produces a tension. If you are responsible for people on LessWrong reading context and implications and associations into your contributions you didn’t intend, then that sure sounds like the opposite of the kind of decoupling norms that I think are so important for LessWrong.
I don’t have a perfect resolution to this. Zack had a post on this with some of his thoughts that I found helpful:
I disagree with Zack that the dichotomy between decoupling and contextualizing norms fails to clarify any of the underlying issues. I do think you can probably graph communities and spaces pretty well on an axis from “high decoupling” to “high contextualizing”, and this will allow you to make a lot of valid predictions.
But as Zack helpfully points out here, the key thing to understand is that of course many forms of context and implications are obviously relevant and important, and worth taking into account during a conversation. This is true on LessWrong as well as anywhere else. If your comments have a consistent subtext of denigrating authors who invoke reasoning by analogy, because you think most people who reason by analogy are confused (a potentially reasonable if contentious position on epistemics), then you better be ready to justify that denigration when asked about it.
Responses of the form “I am just asking these people for the empirical support they have for their ideas, I am not intending to make a broader epistemological point” are OK if they reflect a genuine underlying policy of not trying to shift the norms of the site towards your preferred epistemological style, and associated (bounded) efforts to limit such effects when asked. If you do intend to shift the norms of the site, you better be ready to argue for that, and it is not OK to follow an algorithm that is intending to have a denigrating effect, but that shields itself from the need for justification or inspection by invoking decoupling norms. What work is respected and rewarded on LessWrong is of real and substantial relevance to the participants of LessWrong. Sure, obsession with that dimension is unhealthy for the site, and I think it’s actively good to ignore it most of the time. But, especially if the subtext is repeated across many comments from the same author, it is the kind of thing that we need to be able to talk about, and sometimes moderate.
And as such, within these bounds, “tone” is very much a thing the LessWrong moderators will pay attention to, as are the implied connotations of the words you use, the metaphors you choose, and the associations that come with them. And while occasionally a moderator will make the effort to disentangle all your word choices, and pin down in excruciating detail why something you said implied something else and how you must have been aware of that on some level given what you were writing, they do not have the capacity to do so in most circumstances. Moderators need the authority to, at some level, police the vibe of your comments, even without a fully mechanical explanation of how that vibe arises from the specific words you chose.
Do not try to win arguments through wars of attrition
A common pattern on the internet is that whoever has the most patience for re-litigating and repeating their points ultimately wins almost any argument. As long as you avoid getting visibly frustrated, or insulting your opponents, and display an air of politeness, you can win most internet arguments by attrition. If you are someone who might have multiple hours per day available to write internet comments, you can probably eke out some kind of concession or establish some kind of norm in almost any social space, or win some edit war that you particularly care about.
This is a hard thing to combat, but the key thing that makes this tactic possible is a social space in which comments or questions are assumed to be made in good standing as long as they aren’t obviously egregious.
On LessWrong, if you make a lot of comments, or ask a lot of questions, with a low average hit-rate on providing value by the lights of the moderators, my best guess is you are causing more harm than good, especially if many of those comments are part of conversations that try to prove some kind of wrongdoing or misleadingness on the part of your interlocutor (so that they feel an obligation to respond). And this is a pattern we will try to notice and correct (while also recognizing that sometimes it is worth pressing people on crucial and important questions, as people can be evasive and try to avoid reasonable falsification of their ideas in defense of their reputation/ego).
Building things that help LessWrong’s mission will make it less likely you will get banned
While the overall story of Said is one of him ultimately getting banned from LessWrong, it is definitely the case that his having built readthesequences.com and greaterwrong.com, and his contributions to gwern.net, all raised the threshold we had for banning him very substantially.
And I overall stand behind this choice. Being banned from LessWrong does affect people’s ability to contribute and participate in the broader Rationality community ecosystem, and I think it makes sense to tolerate people’s weaknesses in one domain, if that allows them to be a valuable contributor in another domain, even if those two domains are not necessarily governed by the same people.
So yeah, I do think you get to be a bit more of a dick, for longer, if you do a lot of other stuff that helps LessWrong’s broader mission. This has limits, and we will invest a bunch into limiting the damage or helping you improve, but it does also just help.
So with all that Said
And so we reach the end of this giant moderation post. I hope I have clarified at least my perspective on many things. I will aim to limit my engagement with the comments of this post to at most 10 hours. Said is also welcome to send additional commentary to me in the next 10 days, and if so, I will append it to this post and link to it somewhere high up so that people can see it if they get linked here.[19] I will also make one top-level comment below this post under which Said will be allowed to continue commenting for the next 2 weeks, and where people can ask questions.
Farewell. It’s certainly been a ride.
Appendix: 2023 moderation comments
In 2023, the last time we took moderation action on Said, Ray wrote 10,000+ words, which I have extracted here for convenience. I don’t recommend the average reader read them all, but I do think they were another high-effort attempt at explaining what was going on.
Overview/outline of initial comment
Okay, overall outline of thoughts on my mind here:
What actually happened in the recent set of exchanges? Did anyone break any site norms? Did anyone do things that maybe should be against site norms, but that we hadn’t actually made into an explicit rule, such that we should take the opportunity to develop some case law and warn people not to do it in the future?
5 years ago, the moderation team issued Said a mod warning about a common pattern of engagement of his that a lot of people have complained about (this was operationalized as “demanding more interpretive labor than he has given”). We said if he did it again we’d ban him for a month. My vague recollection is that he basically didn’t do it for a couple years after the warning, but maybe started to somewhat over the past couple years; I’m not sure (I think he may have not done the particular thing we asked him not to, but I’ve had a growing sense that his commenting is making me more wary of how I use the site). What are my overall thoughts on that?
Various LW team members have concerns about how Duncan handles conflict. I’m a bit confused about how to think about it in this case. I think a number of other users are worried about this too. We should probably figure out how we relate to that and make it clear to everyone.
It’s Moderation Re-Evaluation Month. It’s a good time to re-evaluate our various moderation policies. This might include “how we handle conflict between established users”, as well as “are there any important updates to the Authors Can Moderate Their Posts rules/tech?”
It seems worthwhile to touch on each of these, so I’ll follow up on each topic at least somewhat.
Recap of mod team history with Said Achmiz
First, some background context. When LW2.0 was first launched, the mod team had several back-and-forths with Said over complaints about his commenting style. He was (and I think still is) the most-complained-about LW user. We considered banning him.
Ultimately we told him this:
Followed by:
I do have a strong sense of Said being quite law-abiding/honorable about the situation despite disagreeing with us on several object-level and meta-level moderation policies, which I appreciate a lot.
I do think it’s worth noting that LessWrong 2.0 feels like it’s at a more stable point than it was in 2018. There’s enough critical mass of people posting here that I’m less worried about annoying commenters killing it completely (which was a very live fear during the initial LW2.0 revival).
But I am still worried about the concerns from 5 years ago, and do basically stand by Ben’s comment. And meanwhile I still think Said’s default commenting style is much worse than nearby styles that would accomplish the upside with less downside.
My summary of previous discussions as I recall them is something like:
Then we did that, and it sorta worked for a while. But it hasn’t been wholly satisfying to me. (I do have some sense that Said has recently ended up commenting more in threads that are explicitly about setting norms, and while we didn’t spell this out in our initial mod warning, I do think it is more costly to ban someone from discussions of moderation norms than from other discussions. I’m not 100% sure how to think about this)
Death by a thousand cuts and “proportionate”(?) response
A way this all feels relevant to the current disputes with Duncan is that the thing that is frustrating about Said is not any individual comment, but an overall pattern that doesn’t emerge as extremely costly until you see the whole thing. (I.e. if there’s a spectrum of how bad behavior is, from 0-10, and things that are a “3” are considered bad enough to punish, then someone who’s doing things that are bad at a “2.5” or “2.9” level doesn’t quite feel worth reacting to. But if someone does them a lot, it actually adds up to being pretty bad.)
If you point this out, people mostly shrug and move on with their day. So, to point it out in a way that people actually listen to, you have to do something that looks disproportionate if you’re just paying attention to the current situation. And, also, the people who care strongly enough to see that through tend to be in an extra-triggered/frustrated state, which means they’re not at their best when they’re doing it.
I think Duncan’s response is out of proportion to some degree (see Vaniver’s thread for some reasons why; I have some more reasons I plan to write about).
But I do think there is a correct thing that Duncan was noting/reacting to, which is that actually, yeah, the current situation with Said does feel bad enough that something should change, and indeed the mods hadn’t been intervening on it because it didn’t quite feel like a priority.
I liked Vaniver’s description of Duncan’s comments/posts as making a bet that Said was in fact obviously banworthy or worthy of significant mod action, and that there was a smoking gun to that effect, and if this was true then Duncan would be largely vindicated-in-retrospect.
I’ll lay out some more thinking as to why, but, my current gut feeling + somewhat considered opinion is that “Duncan is somewhat vindicated, but not maximally, and there are some things about his approach I probably judge him for.”
Maybe explicit rules against blocking users from “norm-setting” posts.
On blocking users from commenting
I still endorse authors being able to block other users (whether for principled reasons, or just “this user is annoying”). I think a) it’s actually really important that the site be fun for authors to use, b) there are a lot of users who are dealbreakingly annoying to some people but not others, and banning them from the whole site would be overkill, and c) authors aren’t obligated to lend their own karma/reputation to give space to other people’s content. If an author doesn’t want your comments on his post, whether for defensible reasons or not, I think it’s an okay answer that those commenters make their own post or shortform arguing the point elsewhere.
Yes, there are some trivial inconveniences to posting that criticism. I do track that in the cost. But I think that is outweighed by the effect on authors being motivated to post.
That all said...
Blocking users on “norm-setting posts”
I think it’s more worrisome to block users on posts that are making major momentum towards changing site norms/culture. I don’t think the censorship effects are that strong or distorting in most cases, but I’m most worried about censorship effects being distorting in cases that affect ongoing norms about what people can say.
There’s a blurry line here between posts that are putting forth new social concepts and posts advocating for applying those concepts towards norms (either in the OP or in the comments), and a further blurry line between that and posts which argue about applying those concepts to specific people. I.e., I’d have an ascending wariness of:
Concentration of Force
Basics of Rationalist Discourse or Speaking of Stag Hunts
Killing Socrates
I think it was already a little sketchy that Basics of Rationalist Discourse went out of its way to call itself “The Basics” rather than “Duncan’s preferred norms” (a somewhat frame-control-y move IMO, although not necessarily unreasonably so), while also blocking Zack at the time. It feels even more sketchy to me to write Killing Socrates, which is AFAICT a thinly veiled “build-social-momentum-against-Said-in-particular” post, where Said can’t respond (and it’s disproportionately likely that Said’s allies also can’t respond).
Right now we don’t have tech to unblock users from a specific post who have been banned from all of a user’s posts. But this recent set of events has me leaning towards “build tech to do that”, and then making it a rule that posts at the threshold of “Basics” or higher (in terms of site-norm-momentum-building) need to allow everyone to comment.
I do expect that to make it less rewarding to make that sort of post. And, well, to (almost) quote Duncan:
Okay but what do I do about Said when he shows up doing his whole pattern of subtly-missing/and/or/reframing-the-point-while-sprawling massive threads, in an impo
My answer is “strong downvote him, announce you’re not going to engage, maybe link to a place where you went into more detail about why if this comes up a lot, and move on with your day.” (I do generally wish Duncan did more of this and less trying to set the record straight in ways that escalate in IMO very costly ways)
(I also kinda wish gjm had done this towards the beginning of the thread on LW Team is adjusting moderation policy)
Verdict for 2023 Said moderation decisions
Preliminary Verdict (but not “operationalization” of verdict)
tl;dr – @Duncan_Sabien and @Said Achmiz each can write up to two more comments on this post discussing what they think of this verdict, but are otherwise on a temporary ban from the site until they have negotiated with the mod team and settled on either:
credibly commit to changing their behavior in a fairly significant way,
or, accept some kind of tech solution that limits their engagement in some reliable way that doesn’t depend on their continued behavior.
or, be banned from commenting on other people’s posts (but still allowed to make new top level posts and shortforms)
(After the two comments they can continue to PM the LW team, although we’ll have some limit on how much time we’re going to spend negotiating)
Some background:
Said and Duncan are both among the most complained-about users since LW2.0 started (probably both in the top 5, possibly literally the top 2). They also both have many good qualities I’d be sad to see go.
The LessWrong team has spent hundreds of person hours thinking about how to moderate them over the years, and while I think a lot of that was worthwhile (from a perspective of “we learned new useful things about site governance”) there’s a limit to how much it’s worth moderating or mediating conflict re: two particular users.
So, something pretty significant needs to change.
A thing that sticks out in both the case of Said and Duncan is that they a) are both fairly law abiding (i.e. when the mods have asked them for concrete things, they adhere to our rules, and clearly support rule-of-law and the general principle of Well Kept Gardens), but b) both have a very strong principled sense of what a “good” LessWrong would look like and are optimizing pretty hard for that within whatever constraints we give them.
I think our default rules are chosen to be something that someone might trip accidentally, if you’re mostly trying to be a good stereotypical citizen but occasionally end up having a bad day. Said and Duncan are both trying pretty hard to be good citizens of another country, one that the LessWrong team is consciously not trying to be. It’s hard to build good rules/guidelines that actually robustly deal with that kind of optimization.
I still don’t really know what to do, but I want to flag that the goal I’ll be aiming for here is “make it such that Said and Duncan either have actively (credibly) agreed to stop optimizing in a fairly deep way, or are somehow limited by site tech such that they can’t do the cluster of things they want to do that feels damaging to me.”
If neither of those strategies turns out to be tractable, banning is on the table (even though I think both of them contribute a lot in various ways and I’d be pretty sad to resort to that option). I have some hope that tech-based solutions can work.
(This is not a claim about which of them is more valuable overall, or better/worse/right-or-wrong-in-this-particular-conflict. There’s enough history with both of them being above-a-threshold-of-worrisome that it seems like the LW team should just actually resolve the deep underlying issues, regardless of who’s more legitimately aggrieved this particular week)
Re: Said:
One of the most common complaints I’ve gotten about LessWrong, from both new users as well as established, generally highly regarded users, is “too many nitpicky comments that feel like they’re missing the point”. I think LessWrong is less fragile than it was in 2018 when I last argued extensively with Said about this, but I think it’s still an important/valid complaint.
Said seems to actively prefer a world where the people who are annoyed by him go away, and thinks it’d be fine if this meant LessWrong had radically fewer posts. I think he’s misunderstanding something about how intellectual progress actually works, and about how valuable his comments actually are. (As I said previously, I tend to think Said’s first couple comments are worthwhile. The thing that feels actually bad is getting into a protracted discussion, on a particular (albeit fuzzy) cluster of topics)
We’ve had extensive conversations with Said about changing his approach here. He seems pretty committed to not changing his approach. So, if he’s sticking around, I think we’d need some kind of tech solution. The outcome I want here is that in practice Said doesn’t bother people who don’t want to be bothered. This could involve solutions somewhat specific-to-Said, or (maybe) be a sitewide rule that works out to stop a broader class of annoying behavior. (I’m skeptical the latter will turn out to work without being net-negative, capturing too many false positives, but seems worth thinking about)
Here are a couple ideas:
Easily-triggered-rate-limiting. I could imagine an admin feature that literally just lets Said comment a few times on a post, but if he gets significantly downvoted, gives him a wordcount-based rate-limit that forces him to wrap up his current points quickly and then call it a day. I expect fine-tuning this to actually work the way I imagine in my head is a fair amount of work but not that much.
Proactive warning. If a post author has downvoted Said’s comments on their post multiple times, they get some kind of UI alert saying “Yo, FYI, admins have flagged this user as someone with a pattern of commenting that a lot of authors have found net-negative. You may want to take that into account when deciding how much to engage”.
There’s some cluster of ideas surrounding how authors are informed/encouraged to use the banning options. It sounds like the entire topic of “authors can ban users” is worth revisiting so my first impulse is to avoid investing in it further until we’ve had some more top-level discussion about the feature.
Why is it worth this effort?
You might ask “Ray, if you think Said is such a problem user, why bother investing this effort instead of just banning him?”. Here are some areas I think Said contributes in a way that seem important:
Various ops/dev work maintaining sites like readthesequences.com, greaterwrong.com, and gwern.net. (edit: as Ben Pace notes, this is pretty significant, and I agree with his note that “Said is the person independent of MIRI (including Vaniver) and Lightcone who contributes the most counterfactual bits to the sequences and LW still being alive in the world”)
Most of his comments are in fact just pretty reasonable and good in a straightforward way.
While I don’t get much value out of protracted conversations about it, I do think there’s something valuable about Said being very resistant to getting swept up in fad ideas. Sometimes the emperor in fact really does have no clothes. Sometimes the emperor has clothes, but you really haven’t spelled out your assumptions very well and are confused about how to operationalize your idea. I do think this is pretty important and would prefer Said to somehow “only do the good version of this”, but seems fine to accept it as a package-deal.
Re: Duncan
I’ve spent years trying to hash out “what exactly is the subtle but deep/huge difference between Duncan’s moderation preferences and the LW team’s.” I have found each round of that exchange valuable, but typically it didn’t turn out that whatever-we-thought-was-the-crux was a particularly Big Crux.
I think I care about each of the things Duncan is worried about (e.g. the things listed in Basics of Rationalist Discourse). But I tend to think the way Duncan goes about trying to enforce such things is extremely costly.
Here’s this month/year’s stab at it: Duncan cares particularly about strawmans/mischaracterizations/outright lies getting corrected quickly (i.e. within ~24 hours); see Concentration of Force for his writeup on at least one set of reasons this matters. I think there is value in correcting them or telling people to “knock it off” quickly. But,
a) moderation time is limited
b) even in the world where we massively invest in moderation… the thing Duncan cares most about moderating quickly just doesn’t seem like it should necessarily be at the top of the priority queue to me?
I was surprised and updated on You Don’t Exist, Duncan getting as heavily upvoted as it did, so I think it’s plausible that this is all a bigger deal than I currently think it is. (that post goes into one set of reasons that getting mischaracterized hurts). And there are some other reasons this might be important (that have to do with mischaracterizations taking off and becoming the de-facto accepted narrative).
I do expect most of our best authors to agree with Duncan that these things matter, and generally want the site to be moderated more heavily somehow. But I haven’t actually seen anyone but Duncan argue they should be prioritized nearly as heavily as he wants. (i.e. rather than something you just mostly take-in-stride, downvote and then try to ignore, focusing on other things)
I think most high-contributing users agree the site should be moderated more (see the significant upvotes on LW Team is adjusting moderation policy), but don’t necessarily agree on how. It’d be cruxy for me if more high-contributing-users actively supported the sort of moderation regime Duncan-in-particular seems to want.
I don’t know that that really captured the main thing here. I feel less resolved on what should change on LessWrong re: Duncan. But I (and other LW site moderators) want to be clear that while strawmanning is bad and you shouldn’t do it, we don’t expect to intervene on most individual cases. I recommend strong downvoting, and leaving one comment stating that the thing seems false.
I continue to think it’s fine for Duncan to moderate his own posts however he wants (although as noted previously I think an exception should be made for posts that are actively pushing sitewide moderation norms)
Some goals I’d have are:
people on LessWrong feel safe that they aren’t likely to get into sudden, protracted conflict with Duncan that persists outside his own posts.
the LessWrong team and Duncan are on-the-same-page about the LW team not being willing to allocate dozens of hours of attention at a moment’s notice in the specific ways Duncan wants. I don’t think it’s accurate to say “there’s no lifeguard on duty”, but I think it’s quite accurate to say that the lifeguard on duty isn’t planning to prioritize the things Duncan wants, so Duncan should basically participate on LessWrong as if there is, in effect, “no lifeguard” from his perspective. I’m spending ~40 hours this week processing this situation with a goal of basically not having to do that again.
In the past Duncan took down all his LW posts when LW seemed to be actively hurting him. I’ve asked him about this in the past year, and (I think?) he said he was confident that he wouldn’t do that again. One thing I’d want going forward is a more public commitment that, if he’s going to keep posting on LessWrong, he’s not going to do that again. (I don’t mind him taking down 1-2 problem posts that led to really frustrating commenting experiences for him, but if he were likely to take all the posts down, that would undercut much of the value of having him here contributing)
FWIW I do think it’s moderately likely that the LW team writes a post taking many concepts from Basics of Rationalist Discourse and integrating them into our overall moderation policy. (It’s maybe doable for Duncan to rewrite the parts that some people object to, and to enable commenting on those posts by everyone, but I think it’s kinda reasonable for people to feel uncomfortable with Duncan setting the framing, and it’s worth the LW team having a dedicated “our frame on what the site norms are” post anyway)
In general I think Duncan has written a lot of great posts – many of his posts have been highly ranked in the LessWrong review. I expect him to continue to provide a lot of value to the LessWrong ecosystem one way or another.
I’ll note that while I have talked to Duncan for dozens(?) of hours trying to hash out various deep issues and not met much success, I haven’t really tried negotiating with him specifically about how he relates to LessWrong. I am fairly hopeful we can work something out here.
Why spend so much time engaging with a single commenter? Well, the answer is that I do think the specific way Said has been commenting on the site had a non-trivial chance of basically just killing the site, in the sense of good conversation and intellectual progress basically ceasing, had the pattern not been pushed back on and the collateral damage limited by moderator action.
Said has been by far the most complained-about user on the site, with many top authors citing him as a top reason why they do not want to post or comment here, and I personally (and the LessWrong team more broadly) would have had little interest in further investing in LessWrong if the kind of culture that Said brings had taken hold here.
So the stakes have been high, and the alternative would have been banning, which itself requires at least many dozens of hours of effort, and which, given that Said is a really valuable contributor via projects like greaterwrong.com and readthesequences.com, is a choice I felt appropriately hesitant about.
Now, one might find it weird that a single person can derail a comment thread. However, I claim this is indeed the case: as long as you can make comments that do not get reliably downvoted, you can probably cause a whole comment thread to focus almost exclusively on the concern you care about. This is the result of a few different dynamics:
Commenters on LW do broadly assume that they shouldn’t ask a question or write a critique that someone else has already raised, or that has indirectly already been answered in a different comment (this is good; it is what makes comment sections on LW much more coherent than e.g. those on HN)
LW is high enough trust that I think authors generally assume that an upvoted comment is made in good standing and as such deserves some kind of response. This is good, since it allows people to coordinate reasonably well on requesting justification or clarification. However, voting is really quite low-fidelity, it’s not that hard to write comments that will not get downvoted, and there is little consistent reputational tracking on LessWrong across many engagements, meaning it’s quite hard for anyone to lose good standing.
Ultimately I think it’s the job of the moderators to remove or at least mark commenters who have lost good standing of this kind, given the current background of social dynamics (or alternatively to rejigger the culture and incentives to remove this exploit of authors thinking non-downvoted comments are made in good standing)
Occupy Wall Street strikes me as another instance of the same kind of popular sneer culture. Occupy Wall Street had no coherent asks, no worldview that was driving their actions. Everyone participating in the movement seemed to have a different agenda of what they wanted to get out of it. The thing that united them was a shared dislike of something in the vague vicinity of capitalism, or government, or the man, not anything that could be used as the basis for any actual shared creations or efforts.
To be clear, I think it’s fine and good for people to congratulate each other on getting new jobs. It’s a big life change. But of course if your discourse platform approximately doesn’t allow anything else, as I expand on in the rest of this section, then you run into problems.
And… unfortunately… we are just getting started.
I am here trying to give a high-level gloss that tries to elucidate the central problems. Of course many individual conversations diverge, and there are variations on this, often in positive ways, but I would argue the overall tendency is strong and clear.
I switched to a different comment thread here, as a different thread made it easier to see the dynamics at hand. The Benquo thread also went meta, and you can read it here, but it seemed a bit harder to follow without reading a huge amount of additional context, and was a less clear example of the pattern I am trying to highlight.
See e.g. this comment thread with an (IMO) usually pretty good commenter who kept insisting, extremely strongly, that any analysis or evaluation of (IMO) clearly present subtext is fabricated or imagined.
I am not making a strong claim here that direct aggression is much worse or much better than passive aggression; I feel kind of confused about it. But I am saying that, independently of that, there is one argument that passive/hidden aggression requires harsher punishment when prosecution does succeed.
Who, to be clear, have contributed to the site in the past and have a bunch of karma.
For some further details on what Said means by “responding” see this comment.
It’s a bit unclear what exactly happened; you can read the thread yourself. Mostly we argued for a long time about what kind of place LessWrong should be and how authors should relate to criticism, until we gave an official warning, after which nothing about Said’s behavior changed, but we didn’t have the energy to prosecute the subtext another time.
The functionality for this had been present earlier, but we hadn’t really encouraged people to use it.
Unless we are talking about weird issues that involve doxxing or infohazards or things like that.
Requiring you to read only 15,000 words of summary. :P
The full quote being:
The specific moderation message we sent at the end of that exchange was:
To be clear, as I point out in the earlier sections of this post, I think there are ways of doing this that would be good for the site, and functions that Said performed that are good, but I would be quite concerned about people doing a cargo-culting thing here.
And to be clear, this is all a pretty tricky topic. It is not rare for whole social groups, including the rationality community, to pretend to misunderstand something. As the moderators it’s part of our job to take into account whether the thing that is going on here is some social immune reaction that is exaggerating their misunderstandings, or maybe even genuinely preventing any real understanding from forming at all, and to adjust accordingly. This is hard.
Like, with some reasonable limit around 3000 words or so.