Thoughts on LessWrong norms, the Art of Discourse, and moderator mandate

A couple of weeks ago I asked Should LW have an official list of norms? and I appreciated the responses there. Here I want to say what I’m currently thinking following that post, and continue having a public conversation about it.

I think saying more on this topic actually gets into a bunch of interesting questions around LessWrong’s purpose, userbase, de facto norms and culture, moderation mandate, etc. Without locking in things as “Officially How It Is Forever”, I’ll opine on my current thinking on these topics and how I relate to them in practice. It’s possible that further public discussion will shift some things here, and after more back-and-forth, it’d make sense to “ratify” some of it more.

With all that said...

LessWrong and The Art of Discourse

LessWrong was founded to be a place for perfecting the Art of Human Rationality, i.e., generally thinking in ways which more reliably result in true beliefs, etc. Similarly, I think there’s a closely related “Art of Discourse”: communicating in ways that more reliably result in those conversing (and reading along) having more true beliefs. Perhaps it’s a sub-art of the Art of Human Rationality.

The real rules of which communication most efficiently gets you toward truth live in reality. You can choose your norms, but whether those norms are conducive to truth isn’t up to you.

The LessWrong community, over its 10-15 year existence, has assembled a number of beliefs about the Art of Discourse. Things like communicating degrees of belief quantitatively, preference for asymmetric weapons, an interest in local validity, etc. We of course don’t have the complete art and may be mistaken about pieces of it, but we feel strongly about the pieces of the art we believe we do possess.

Different people in our community have somewhat different senses of the Art of Discourse, and these even form clusters. But there’s a pretty solid common core set of norms on the site, such that if someone is not conforming to them, most people would want them to change their behavior or go elsewhere.

The core point I want to make here is: The Art of [Truth-seeking] Discourse lives in the territory, and we community members attempt to discover it and practice it.

Moderators moderate according to their own understanding of The Art

A thing you could imagine is the community coming together, writing down its sense of how you ought to behave, and enshrining that as The Law. The moderators (judges/​police) then interpret and enforce the law. I think this sometimes gets called “Rule of Law”.

I think that gets you some advantages, but requires infrastructure and investment LessWrong can’t realistically have, both for enshrining the initial law and then updating it over time in cases of incompleteness and ambiguity.

(edit: “Rule of Man” as an existing phrase means something crucially different from what I wanted to describe. See my comment here for clarification.)

Instead LessWrong operates by a ~~“Rule of Man”[1]~~ “hybrid Rule of Law/​Man” system where the moderators apply our own understanding of the Art of Discourse to making moderation decisions about which behaviors are okay or not, and what to do with users who behave badly according to us. This has quite a few benefits: it allows us to be flexible and adaptable to new cases, it means we ask a direct question of “does this seem good or not?” rather than “did it violate the enshrined law?”, and it allows us to smoothly improve the enforced policy as our understanding of the Art of Discourse improves over time.

This approach does run the risk that moderators make bad calls (or could be corrupt or biased), which is why I favor moderation being transparent where doing so isn’t too costly, so people can call out things they think are mistakes.

Components of Decision-Making: Inside-View/​Outside-View/​Stakeholder-Game-Theory

It’d actually be imprecise to say that moderators just moderate according to our inside views of the Art of Discourse. I could carve it up a few ways, but here’s one attempted breakdown of how we make our site decisions:

  1. We make moderation decisions based on our inside view beliefs about what would be good for the Discourse and LessWrong’s goals. We do so via both indirect consequentialist reasoning (via principles we’ve settled on) and direct consequentialist reasoning[2].

  2. We might sometimes give weight to the views of people we think are wrong but whose thinking we generally respect. This is something like applying our “outside view” to situations.

    1. One form of this is trying to hold ourselves accountable to the people we think we should hold ourselves accountable to. I can’t currently provide a list or clear criteria, but it’s like “these people seem to really capture the spirit of my values; by doing well in their lights, I will do well according to my own values and judgment”. Another framing might be “this includes the people that, if they thought we were fucking up or were unhappy with us, I’d really really care”. Eliezer and Scott would be on that list, for example.

  3. We do some “game theory” to figure out how to account for the views and preferences of people that we feel have meaningful “stake” in LessWrong. For example, if a number of very core contributors differed from the LessWrong mod team in their beliefs about the Art of Discourse (could be something like different beliefs about which politeness-norms or psychologizing-norms are good), we would likely weigh those beliefs in the actual policy we uphold.

Legibilizing one’s understanding of the Art

There’s a bunch of encoded functions and algorithms in my brain which, for any given post or comment on LessWrong, will provide an evaluation of it. This illegible function is what I actually use to make moderation calls, and it would be very difficult, or really impossible, for me to make it fully legible (even to myself). The other members of the LessWrong team have their own functions, and for that matter, so does every user on LessWrong.

However, I can attempt to capture aspects of my encoded function into something explicit. Lists of principles that, while not the actual thing, point you in the right direction. Or lists of principles that I can invoke to help explain my reasoning in various cases. These legible lists of principles or rules aren’t the law in the sense in which the US Constitution is law, but they’ll provide a better sense of the real rules than if you didn’t have them.

You end up with a fair bit of indirection:

written discussion principles attempt to capture the LW team’s understanding of the Art of Discourse, which in turn attempts to capture The Actual Art of Discourse.

The written principles/​norms are then hopefully useful by:

  • Being a useful start to learning the actual Art of Discourse for new users

  • Helping new users understand which behaviors get upvoted/​downvoted, approved/​rejected, moderated/​not-moderated

  • Helping moderators and other users explain their reactions

  • Focusing community discussions around okay/​not-okay behavior

At the same time, the written principles are not the end-all be-all. A moderator might say “while none of our existing written things capture what you’re doing, we’re pretty sure it’s bad and we’re taking moderator action to prevent more of this”.

A list of norms for LessWrong which is of the shape “here’s our understanding of the Art of Discourse (work in progress)” seems like it could be pretty good.

Towards a settled picture

I think the above picture is pretty good, and it’s approximately the models/​philosophy behind current moderation. But it seems good to write it up and discuss it in advance of us taking bolder actions on the basis of it (e.g., writing a list of site norms). I’m very interested in feedback here that could result in amending the picture.

More than something framed as “site norms”, I like the idea of writing up “here’s our understanding of the Art of Discourse so far”: something that can be shared with new users and cited in moderation decisions. Also, ideally it gets updated over time as we figure out more and more of the Art of Discourse, and make LW more successful at its mission.

  1. ^

    This term gets used elsewhere in not quite the sense I mean it. Elsewhere, Rule of Man means something like the laws coming from the man, whereas I actually mean something like “the laws live in the territory but are interpreted and applied by man”. Perhaps there’s a better term for it.

  2. ^

    I’ve long been a fan of R. M. Hare’s two-level utilitarianism, and think it in fact matches how we moderate – usually acting on general principles, but deriving those principles, and handling cases they don’t cover, via more direct consequentialist reasoning.