Meditations on Moderation: Towards Public Archipelago

The recent moderation tools announcement represents a fairly major shift in how the site admins are approaching LessWrong. Several people noted important concerns about transparency and trust.

Those concerns deserve an explicit, thorough answer.

Summary of Concepts

  1. The Problem of Private Discussion – Why much intellectual progress in the rationalsphere has happened in hard-to-find places

  2. Public Discussion vs Intellectual Progress – Two subtly conflicting priorities for LessWrong.

  3. Healthy Disagreement – How to give authors tools to have the kinds of conversations they want, without degenerating into echo chambers.

  4. High Trust vs Functioning Low Trust environments – Different modes of feeling safe, with different costs and risks.

  5. Overton Windows, Personal Criticism – Two common conversational attractors. Tempting. Sometimes important. But rarely what an author is interested in talking about.

  6. Public Archipelago – A model that takes all of the above into account, giving people the tools to create personal spaces that give them freedom to explore, while keeping all discussion public, so that it can be built upon, criticized, or refined.

i. The Problem

The issue with LessWrong that worries me the most:

In the past 5 years or so, there’s been a lot of progress – on theoretical rationality, on practical epistemic and instrumental rationality, on AI alignment, on effective altruism. But much of this progress has been on some combination of:

  • On various private blogs you need to keep track of.

  • On Facebook – where discussions are often private, where searching for old comments is painful, and where some people have blocked each other, so it’s hard to tell what was actually said and who was able to read it.

  • On tumblr, whose interface for following a conversation is the most confusing thing I’ve ever seen.

  • On various google docs, circulated privately.

  • In person, not written down at all.

People have complained about this. I think a common assumption is something like “if we just got all the good people back on LessWrong at the same time you’d have a critical mass that could reboot the system.” That might help, but doesn’t seem sufficient to me.

I think LW2.0 has roughly succeeded at becoming “the happening place” again. But I still know several people who I intellectually respect, who find LessWrong an actively inhospitable place and don’t post here, or do so only grudgingly.

More Than One Way For Discussion To Die

I realize that there’s a very salient pathway for moderators to abuse their power. It’s easy to imagine how echo chambers could form and how reign-of-terror style moderation could lead to, well, reigns of terror.

It may be less salient to imagine a site subtly driving intelligent people away due to being boring, pedantic, or frustrating, but I think the latter is in fact more common, and a bigger threat to intellectual progress.

The current LessWrong selects somewhat for people who are thick-skinned and conflict-prone. Being thick-skinned is good, all else being equal. Being conflict-prone is not. And neither of these is the same as being able to generate useful ideas and think clearly – the most important qualities to cultivate in LessWrong participants.

The site admins don’t just have to think about the people currently here. We have to think about people who have things to contribute, but don’t find the site rewarding.

Facebook vs LessWrong

When I personally have a new idea to flesh out… well...

...I’d prefer a LessWrong post over a Facebook post. LW posts are more easily linkable, they have reasonable formatting options (compared to FB’s plain text), and it’s easier to be sure a lot of people have seen them.

But to discuss those ideas…

In my heart of hearts, if I weren’t actively working on the LessWrong team, with a clear vision of where this project is going… I would prefer a Facebook comment thread to a LessWrong discussion.

There are certain blogs – Sarah’s, Zvi’s, and Ben’s stick out in my mind – that are comparably good. But not many. The most common pattern is “post idea on blog, and the good discussion happens on FB, and individual comment insights only make it into the broader zeitgeist if someone mentions them in a high-profile blogpost.”

On the right sort of Facebook comment thread, at least in my personal filter bubble, I can expect:

  • People I intellectually respect to show up and hash out ideas.

  • A collaborative attitude. “Let’s figure out and build a thing together.”

  • People who show up will share enough assumptions that we can talk about refining the idea to a usable state, rather than “is this idea even worth talking about?”

Beyond that, more subtle: even if I don’t know everyone, an intellectual discussion on FB usually feels like, well, we’re friends. Or at least allies.

Relatedly: the number of commenters is manageable. The comments on Slate Star Codex are reasonably good these days, but… I’m just not going to sift through hundreds or thousands of comments to find the gems. It feels like a firehose, not a conversation.

Meanwhile, the comments on LessWrong often feel… nitpicky and pointless.

If an idea isn’t presented maximally defensibly, people will focus on tearing holes in the non-load-bearing parts of the idea, rather than helping refine the idea into something more robust. And there’ll be people who disagree with, or don’t understand, foundational elements that the idea is supposed to be building off of, so the discussion ends up rehashing 101-level things instead of building 201-level knowledge.

Filter Bubbles

An obvious response to the above might be “of course you prefer Facebook over LessWrong. Facebook heavily filter bubbles you so that you don’t have to face disagreement. It’s good to force your ideas to intense scrutiny.”

And there’s important truth to that. But my two points are that:

  1. I think a case can be made that, during idea formation, the kind of disagreement I find on Facebook, Google Docs and in-person is actually better from the standpoint of intellectual progress.

  2. Whether or not #1 turns out to be true, if people prefer private conversations over public discussions (because they’re easier/​more-fun/​safer), then much discussion will tend to continue taking place in mostly private places, and no matter how suboptimal this is, it won’t change.

My experience is that my filter bubbles (whether on FB, Google Docs or in-person) do involve a lot of disagreement, and the disagreement is higher quality. When someone tells me I’m wrong, it’s often accompanied by an attempt to understand what my goals are, or what the core of a new idea was, which either lets me fix an idea, or abandon it but find something better to accomplish my original intent.

(On FB, this isn’t because the average commenter is that great, but because of a smallish number of people I deeply respect, who have different paradigms of thinking, at least 1–2 of whom will reliably show up.)

There seems to be a sense that good ideas form fully polished, without any work to refine them. Or that until an idea is ready for peer review, you should keep it to yourself. Or be willing to have people poke at it with no regard for how hedonically rewarding that experience is. I’m not sure what the assumption is, but it’s contrary to how everyone I personally know generates insights.

The early stages work best when playful and collaborative.

Peer review is important, but so is idea formation. Idea formation often involves running with assumptions, crashing them into things and seeing if it makes sense.

You could keep idea-formation private and then share things when they’re ‘publicly presentable’, but I think this leads to people tending to keep conversation in “safe, private” zones longer than necessary. And meanwhile, it’s valuable to be able to see the generation process among respected thinkers.

Public Discussion vs Knowledge Building

Some people have a vision of Less Wrong as a public discussion. You put your idea out there. A conversation happens. Anyone is free to respond to that conversation as long as they aren’t being actively abusive. The best ideas rise to the top.

And this is a fine model, that should (and does) exist in some places. But:

  1. It’s never actually been the model or ethos LessWrong runs on. Eliezer wrote Well Kept Gardens Die By Pacifism years ago, and has always employed a Reign-of-Terror-esque moderation style. You may disagree with this approach, but it’s not new.

  2. A public discussion is not necessarily the same as the ethos Habryka is orienting around, which is to make intellectual progress.

These might seem like the same goal. And I share an aesthetic sense that in the ‘should’ world, where things are fair, public discussion and knowledge-building are somehow the same goal.

But we don’t live in the ‘should’ world.

We live in the world where you get what you incentivize.

Yes, there’s a chilling effect when authors are free to delete comments that annoy them. But there is a different chilling effect when authors aren’t free to have the sort of conversation they’re actually interested in having. The conversation won’t happen at all, or it’ll happen somewhere else (where you can’t comment on their stuff anyway).

A space cannot be universally inclusive. So the question is: is LessWrong one space, tailored for only the types of people who enjoy that space? Or do we give people tools to make their own spaces?

If the former, who is that space for, and what rules do we set? What level of knowledge do we assume people must have? We’ve long since agreed “if you show up arguing for creationism, this just isn’t the space for you.” We’ve generally agreed that if you are missing concepts in the sequences, it’s your job to educate yourself before trying to debate (although veterans should politely point you in the right direction).

What about posts written since the sequences ended?

What skills and/​or responsibilities do we assume people must have? Do we assume people have the ability to notice and speak up about their needs a la Sarah Constantin’s Hierarchy of Requests? Do we require them to be able to express those needs ‘politely’? Whose definition of polite do we use?

No matter which answer you choose for any of these questions, some people are going to find the resulting space inhospitable, and take their conversation elsewhere.

I’d much rather sidestep the question entirely.

A Public Archipelago Solution

Last year I explored applying Scott Alexander’s Archipelago idea towards managing community norms. Another quick recap:

Imagine a bunch of factions fighting for political control over a country. They’ve agreed upon the strict principle of harm (no physically hurting or stealing from each other). But they still disagree on things like “does pornography harm people”, “do cigarette ads harm people”, “does homosexuality harm the institution of marriage which in turn harms people?”, “does soda harm people”, etc.
And this is bad not just because everyone wastes all this time fighting over norms, but because the nature of their disagreement incentivizes them to fight over what harm even is.
And this in turn incentivizes them to fight over both definitions of words (distracting and time-wasting) and what counts as evidence or good reasoning through a politically motivated lens. (Which makes it harder to ever use evidence and reasoning to resolve issues, even uncontroversial ones)

And then...

Imagine someone discovers an archipelago of empty islands. And instead of continuing to fight, the people who want to live in Sciencetopia go off to found an island-state based on ideal scientific processes, and the people who want to live in Libertopia go off and found a society based on the strict principle of harm, and the people who want to live in Christiantopia go found a fundamentalist Christian commune.
This lets you test more interesting ideas. If a hundred people have to agree on something, you’ll only get to try things that you can get 50+ people on board with (due to crowd inertia, regardless of whether you have a formal democracy).
But maybe you can get 10 people to try a more extreme experiment. (And if you share knowledge, both about experiments that work and ones that don’t, you can build the overall body of community-knowledge in your social world)

Taking this a step farther is the idea of Public Archipelago, with islands that overlap.

Let people create their own spaces. Let the conversations be restricted as need be, but centralized and public, so that everyone at least has the opportunity to follow along, learn, respond and build off of each other’s ideas, instead of having to network their way into various social/​internet circles to keep up with everything.

This necessarily means that not all of LessWrong will be a comfortable place to any given person, but it at least means a wider variety of people will be able to use it, which means a wider variety of ideas can be seen, critiqued, and built off of.

Healthy Disagreement

Now, there’s an obvious response to my earlier point about “it’s frustrating to have to explain 101-level things to people all the time.”

Maybe you’re not explaining 101-level things. Maybe you’re actually just wrong about the foundations of your ideas, and your little walled garden isn’t a 201 space, it’s an echo chamber built on sand.

This is, indeed, quite a problem.

It’s an even harder problem than you might think at first glance. It’s difficult to offer an informed critique of something that’s actually useful. I’m reminded of Holden Karnofsky’s Thoughts on Public Discourse:

For nearly a decade now, we’ve been putting a huge amount of work into putting the details of our reasoning out in public, and yet I am hard-pressed to think of cases (especially in more recent years) where a public comment from an unexpected source raised novel important considerations, leading to a change in views.
This isn’t because nobody has raised novel important considerations, and it certainly isn’t because we haven’t changed our views. Rather, it seems to be the case that we get a large amount of valuable and important criticism from a relatively small number of highly engaged, highly informed people. Such people tend to spend a lot of time reading, thinking and writing about relevant topics, to follow our work closely, and to have a great deal of context. They also tend to be people who form relationships of some sort with us beyond public discourse.
The feedback and questions we get from outside of this set of people are often reasonable but familiar, seemingly unreasonable, or difficult for us to make sense of.

The obvious criticisms of an idea may have obvious solutions. If you interrupt a 301 discussion to ask “but have you considered that you might be wrong about everything?”… well, yes. They have probably noticed the skulls. This often feels like 2nd-year undergrads asking post-docs to flesh out everything they’re saying, using concepts only available to the undergrads.

Still, peer review is a crucial part of the knowledge-building process. You need high quality critique (and counter-critique, and counter-counter-critique). How do you square that with giving an author control over their conversation?

I hope (and fairly confidently believe) that most authors, even ones employing Reign-of-Terror style moderation policies, will not delete comments willy-nilly – and the site admins will be proactively having conversations with authors who seem to be abusing the system. But we do need safeguards in case this turns out to be worse than we expect.

The answer is pretty straightforward: it’s not at all obvious that the public discussion of a post has to be on that particular post’s comment section.

(Among other things, this is not how most science works, AFAICT, although traditional science leaves substantial room for improvement anyhow).

If you disagree with a post, and the author deletes or blocks you from commenting, you are welcome to write another post about your intellectual disagreement.

Yes, this means that people reading the original post may come away with an impression that a controversial idea is more accepted than it really is. But if that person looks at the front page of the site, and the idea is controversial, there will be both other posts and recent comments arguing about its merits.

It also means that no, you don’t automatically get the engagement of everyone who read the original post. I see this as a feature, not a bug.

If you want your criticism to be read, it has to be good and well written. It doesn’t have to fit within the overall zeitgeist of what’s currently popular or what the locally high-status people think. Holden’s critical Thoughts on Singularity Institute is one of the most highly upvoted posts of all time. (If anything, I think LessWrong folk are too eager to show off their willingness to dissent and upvote people just for being contrarian).

It does suck that you must be good at writing and know your audience (which isn’t necessarily the same as good at thinking). But this applies just as much to being the original author of an idea, as to being a critic.

The author of a post doesn’t owe you their rhetorical strength and audience and platform to give you space to write your counterclaim. We don’t want to incentivize people to protest quickly and loudly to gain mindshare in a popular author’s comment section. We want people to write good critiques.

Meanwhile, if you’re making an effort to understand an author’s goals and frame disagreement in a way that doesn’t feel like an attack, I don’t anticipate this coming up much in the first place.

ii. Expectations and Trust

I think a deep disagreement that underlies a lot of the debate over moderation: what sort of trust is important to you?

This is a bit of a digression – almost an essay unto itself – but I think it’s important.

Elements of Trust

Defining trust is tricky, but here’s a stab at it: “Trust is having expectations of other people, and not having to worry about whether those expectations will be met.”

This has a few components:

  • Which expectations do you care about being upheld?

  • How much do you trust people in your environment to uphold them?

  • What strategies do you prefer to resolve the cognitive load that comes when you can’t trust people (or, are not sure if you can)?

Which expectations?

You might trust people…

  • to keep their promises and/​or mean what they say.

  • to care about your needs.

  • to uphold particular principles (clear thinking, transparency).

  • to be able (and willing) to perform a particular skill (including things like noticing when you’re not saying what you mean).

Trust is a multiple-place function. Maybe you trust Alice to reliably provide all the relevant information even if it makes her look bad. You trust Bob to pay attention to your emotional state and not say triggering things. You can count on Carl to call you on your own bullshit (and listen thoughtfully when you call him on his). Eve will reliably enforce her rules even when it’s socially inconvenient to do so.

You may care about different kinds of trust in different contexts.

How much do you trust a person or space?

For the expectations that matter most to you, do you generally expect them to be fulfilled, or do you have to constantly monitor and take action to ensure them?

With a given person, or a particular place, is your guard always up?

In high trust environments, you expect other people to care about the same expectations you do, and follow through on them. This might mean looking out for each other’s interests. Or, merely that you’re focused on the same goals such that “each other’s interests” doesn’t come into play.

High trust environments require you to either personally know everyone, or to have strong reason to believe in the selection effects on who is present.


  • A small group of friends by a campfire might trust each other to care about each other’s needs and try to ensure they are met (but not necessarily to have particular skills required to do so).

  • A young ideological startup might trust each other to have skills, and to care about the vision of the company (but, perhaps not to ‘have each other’s back’ as the company grows and money/​power becomes up for grabs)

  • A small town, where families have lived there for generations and share a culture.

  • A larger military battalion, where everyone knows that everyone knows that everyone went through the same intense training. They clearly have particular skills, and would suffer punishment if they don’t follow the orders from high command.

Low trust environments are where you have no illusions that people are looking out for the things you care about.

The barriers to entry are low. People come and go often. People often represent themselves as if they are aligned with you, but this is poor evidence for whether they are in fact aligned with you. You must constantly have your guard up.


  • A large corporation where no single person knows everybody

  • A large community with no particular barrier to entry beyond showing up and talking as if you understand the culture

  • A big city, with many cultures and subcultures constantly interfacing.

Transparent Low Trust, Curated High Trust

Having to watch your back all the time is exhausting, and there’s at least two strategy-clusters I can think of to alleviate that.

In a transparent low trust environment, you don’t need to rely on anyone’s word or good intentions. Instead, you rely upon transparency and safeguards built into the system.

It’s your responsibility to make use of those safeguards to check that things are okay.

A curated high trust environment has some kind of strong barrier to entry. The advantage is that things can move faster, be more productive, require less effort and conflict, and focus only on things you care about.

It’s the responsibility of the space’s owner to kick people out if they aren’t able to live up to the norms of the space. It’s your responsibility to decide whether you trust the space, and to leave if you don’t.

The current atmosphere at LessWrong is something like “transparent medium trust.” There are rough, site-level filters on what kind of participation is acceptable – much more so than the average internet hangout. But there’s not much micromanaging of what precise expectations to uphold.

I think some people are expecting the new moderation tools to mean “we took a functioning medium-trust environment and made it more dangerous, or just weirdly tweaked it, for the sake of removing a few extra annoying comments or catering to some inexplicable whims.”

But part of the goal here is to create a fundamental phase shift, where types of conversations are possible that just weren’t in a medium-trust world.

Why High Trust?

Why take the risk of high trust? Aren’t you just exposing yourself to people who might take advantage of you?

I know some people who’ve been repeatedly hurt, by trying to trust, and then having people either trample all over their needs, or actively betray them. Humans are political monkeys that make up convenient stories to make themselves look good all the time. If you aren’t actually aligned with your colleagues, you will probably eventually get burned.

And high trust environments can’t scale – too many people show up with too many different goals, and many of them are good at presenting themselves as aligned with you (they may even think they’re aligned with you), but… they are not.

LessWrong (most likely) needs to scale, so it’s important for there to be spaces here that are Functioning Low Trust, that don’t rely on load-bearing authority figures.

I do not recommend this blindly to everyone.

But. To misquote Umesh – “If you’re not occasionally getting backstabbed, you’re probably not trusting enough.”

If you can trust the people around you, all the attention you put into watching your back can go to other things. You can expect other people to look out for your needs, or help you in reliable ways. Your entire body physiologically changes, no longer poised for fight or flight. It’s physically healthier. In some cases it’s better for your epistemics – you’re less defensive when you don’t feel under attack, making it easier to consider opposing points of view.

I live most of my life in high trust environments these days, and… let me tell you holy shit when it works it is amazing. I know a couple dozen people who I trust to be honest about their personal needs, to be reasonably attentive to mine, who are aligned with me on how to resolve interpersonal stuff as well as Big Picture How the Universe Should Look Someday.

When we disagree (as we often do), we have a shared understanding of how to resolve that disagreement.

Conversations with those people are smooth, productive, and insightful. When they are not smooth, the process for figuring out how to resolve them is smooth or at least mutually agreed upon.

So when I come to LessWrong, where the comments assume at-most-medium trust… where I’m not able to set a higher or different standard for a discussion beyond the lowest common denominator…

It’s really frustrating and sad, to have to choose between a public-untrusted and private-but-high-trust conversation.

It’s worth noting: I participate in multiple spaces that I trust differently. Maybe I wouldn’t recommend particular friends join Alice’s space because, while she’s good at stating clear reasons for things, evaluating evidence carefully, and making sure others do the same, she’s not good at noticing when you’re triggered and pausing to check in if you’re okay.

And maybe Eve really needs that. That’s usually okay, because Eve can go to Bob’s space, or run her own.

Sometimes, Bob’s space doesn’t exist, and Eve lacks the skills to attract people to a new space. This is really important and sad. I personally expect LessWrong to contain a wide distribution of preferences that can support many needs, but it probably won’t contain something for everyone.

Still, I think it’s an overall better strategy to make it easier to create new subspaces than to try to accommodate everyone at once.

Getting Burned

I expect to get hurt sometimes.

I expect some friends (or myself) to not always be at our best. Not always self-aware enough to avoid falling into sociopolitical traps that pit us against each other.

I expect that at least some of the people I’m currently aligned with, I may eventually turn out to be unaligned with, and to come into conflict that can’t be easily resolved. I’ve had friendships that turned weirdly and badly adversarial and I spent months stressfully dealing with it.

But the benefits of high trust are so great that I don’t regret for a second having spent the first few years with those friends in a high-trust relationship.

I acknowledge that I am pretty privileged in having a set of needs and interpersonal preferences that are easier to fit into a high trust environment. There are people who just don’t interface well with the sort of spaces I thrive in, who may never get the benefits of high trust, and that… really sucks.

But the benefit of the Public Archipelago model is that there can be multiple subsections of the site with different norms. You can participate in discussions where you trust the space owner. Some authors may clearly spell out norms and take the time to clearly explain why they moderate comments, and maybe you trust them the most.

Some authors may not be willing to take that time. Maybe you trust them less, or maybe you know them well enough that you trust them anyhow.

In either case, you know what to expect, and if you’re not okay with it, you either don’t participate, or respond elsewhere, or put effort into understanding the author’s goals so that you are able to write critiques that they find helpful.

iii. The Fine Details

Okay, but can’t we at least require reasons?

I don’t think many people were resistant to deleting comments – the controversial feature was “delete without trace.”

First, spam bots and dedicated adversaries with armies of sockpuppets make it necessary for this tool to at least be available. (LW2.0 has had posts with hundreds of spam or troll comments that we quietly delete and IP ban.)

For non-obvious spam…

I do hope delete without trace is used rarely (or that authors send the commenter a private reason when doing so). We plan to implement the moderation log Said Achmiz recommended, so that if someone is deleting a lot of comments without trace you can at least go and check, and notice patterns. (We may change the name to “delete and hide”, since some kind of trace will be available).
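For concreteness, here is a minimal sketch of what an entry in such a moderation log might record, and how a reader could spot patterns of traceless deletion. This is purely illustrative – the field names, actions, and helper are my own assumptions, not the actual LW2.0 implementation:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ModerationLogEntry:
    """One publicly visible record of a moderation action (hypothetical schema)."""
    post_author: str   # whose space the action happened in
    action: str        # e.g. "delete_and_hide", "ban_from_post"
    target_user: str   # the commenter acted upon
    timestamp: datetime
    reason: str = ""   # optional; empty when no reason was given

def entries_without_reason(log):
    """The 'notice patterns' use case: which deletions gave no stated reason?"""
    return [e for e in log if e.action == "delete_and_hide" and not e.reason]

log = [
    ModerationLogEntry("alice", "delete_and_hide", "spambot42",
                       datetime(2018, 2, 25, tzinfo=timezone.utc)),
    ModerationLogEntry("alice", "ban_from_post", "carol",
                       datetime(2018, 2, 26, tzinfo=timezone.utc),
                       reason="repeated tone derails"),
]
print(len(entries_without_reason(log)))  # → 1
```

The point of publishing the log at all is that queries like `entries_without_reason` become something any reader can run on their own, rather than having to take the author’s word for it.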

All things being equal, clear reasons are better than none, and more transparency is better than less.

But all things are not equal.

Moderation is work.

And I don’t think everyone understands that the amount of work varies a lot, both by volume, and by personality type.

Some people get energized and excited by reading through confrontational comments and responding.

Some people find it incredibly draining.

Some people get maybe a dozen comments on their articles a day. Some get barely any at all. But some authors get hundreds, and even if you’re the sort of person who is energized by it, there are only so many hours in a day and there are other things worth doing.

Some comments are not just mean or dumb, but immensely hateful and triggering to the author, and simply glancing at a reminder that it existed is painful – enough to undo the personal benefit they got from having written their article in the first place.

For many people, figuring out how to word a moderation notice is stressful, and I’m not sure whether it’s more intense on average to have to say:

“Please stop being rude and obnoxiously derailing threads.”

or

“I’m sorry, I know you’re trying your best, but you’re asking a lot of obvious questions and making subtly bad arguments in ways that soak up the other commenters’ time. The colleagues I’m trying to attract to these discussion threads are tired of dealing with you.”

Not to mention that moderation often involves people getting angry at you, so you don’t just have to come up with the initial posted reason, but also deal with a bunch of followup that can wreck your week. Comments that leave a trace invite people to argue.

Moderation can be tedious. Moderation can be stressful. Moderation is generally unpaid. Moderators can burn out or decide “you know what, this just isn’t worth the time and bullshit.”

And this is often the worst deal for the best authors, since the best authors attract more comments, and sometimes end up acquiring a sort of celebrity status where commenters don’t quite feel like they’re people anymore, and feel justified (or even obligated) to go out of their way to take them down a peg.

If none of this makes sense to you, if you can’t imagine moderating being this big a deal… well… all I can say is it just really is a god damn big deal. It really really is.

There is a tradeoff we have to make, one way or another: do we force our best authors to follow clear, legible procedures, or do we free them up to write and engage more?

Requiring the former can end up (and has ended up) punishing the latter.

We prioritized building the delete-and-hide function because Eliezer asked for it and we wanted to get him posting again quickly. But he is not the only author to have asked and expressed appreciation for it.

Incentivizing Good Ideas and Good Criticism

I’ll make an even stronger claim here: punishing idea generation is worse than punishing criticism.

You certainly need both, but criticism is easier. There might be environments where there isn’t enough quantity or quality of critics, but I don’t think LessWrong is one of them. Insofar as we don’t have good enough criticism, it’s because the critiques are nitpicky and unhelpful, rather than attempts to deeply understand unfamiliar ideas and collaboratively improve their load-bearing cruxes.

And meanwhile, I think the best critics also tend to be the best idea-generators – the two skills are in fact tightly coupled – so making LessWrong a place they feel excited to participate in seems very important.

It’s possible to go too far in this direction. There are reasonable cases for making different tradeoffs, which different corners of the internet might employ. But our decision on LessWrong is that authors are not obligated to put in that work if it’s stressful.

Overton Windows, and Personal Criticism

There are a few styles of comments that reliably make me go “ugh, this is going to become a mess and I really don’t want to deal with it.” Comments whose substance is “this idea is bad, and should not be something LessWrong talks about.”

In that moment, the conversation stops being about whatever the idea was, and starts being about politics.

A recent example is what I’d call “fuzzy system 1 stuff.” The Kensho and Circling threads felt like they were mostly arguing about “is it even okay to talk about fuzzy system 1 intuitions in rational discourse?”. If you wanted to talk about the core ideas and how to use them effectively, you had to wade through a giant, sprawling demon thread.

Now, it’s actually pretty important whether fuzzy system 1 intuitions have a place in rational discourse. It’s a conversation that needs to happen, a question that probably has a right answer that we can converge on (albeit a nuanced one that depends on circumstances).

But right now, it seems like the only discussion that’s possible to have about them is “are these in the overton window or not?”. There needs to be space to explore ideas that aren’t currently in the accepted paradigm.

I’d even claim that doing that productively is one of the things rationality is for.

Similar issues abound with critiquing someone’s tone, or otherwise critiquing a person rather than an idea. Comments like that tend to quickly dominate the discussion and make it hard to talk about anything else. In many cases, if the comment were a private message, it could have been taken as constructive criticism instead of a personal attack that inflames people’s tribal instincts.

For personal criticism, I think the solution is to build tools that make private discussion easier.

For Overton Window political brawls, I think the brawl itself is inevitable (if someone wants to talk about a controversial thing, and other people don’t want them to talk about the controversial thing, you can’t avoid the conflict). But I think it’s reasonable for authors to say “if we’re going to have the Overton discussion, can we have it somewhere else? Right here, I’m trying to talk about the ramifications of X if Y is true.”

Meanwhile, if you think X or Y are actively dangerous, you can still downvote their post. Instead of everyone investing endless energy in multiple demon threads, the issue can be resolved via a single thread, and the karma system.

I don’t think this would have helped with the most recent thread, but it’s an option I’d want available if I ever explored a controversial topic in the future.

iv. Towards Public Archipelago

This is a complicated topic, and the decision is going to affect people. If you’re the sort of person for whom the status quo seemed just perfect, your experience is probably going to become worse.

I do think that is sad, and it’s important to own it, and apologize – I think having a place that felt safe and home and right become a place that feels alienating and wrong is in fact among the worst things that can happen to a person.

But the consequences of not making some major changes seem too great to ignore.

The previous iteration of LessWrong died. It depended on skilled writers continuously posting new content, and it dried up as, one by one, they decided LessWrong wasn’t the best place for them to publish or brainstorm.

There’s a lot of reasons they made that choice. I don’t know that our current approach will solve the problem. But I strongly believe that to avoid the same fate for LessWrong 2.0, it will need to be structurally different in some ways.

An Atmosphere of Experimentation

We have some particular tools, and plans, to give authors the same control they’d have over a private blog, to reduce the reasons to move elsewhere. This may or may not help. But beneath the moderation tools and Public Archipelago concept is an underlying approach of experimentation.

At a high level, the LessWrong 2.0 team will be experimenting with the site design. We want this to percolate through the site – we want authors to be able to experiment with modalities of discussion. We want to provide useful, flexible tools to help them do so.

Eventually we’d like users to experiment both with their overall moderation policy and culture, as well as the norms for individual posts.

Experiments I’d personally like to see:

  • Posts where all commenters are required to fully justify their claims, such that complete strangers with no preconceptions can verify them

  • Posts where all commenters are required to take a few ideas as given, to see if they have interesting implications in 201 or 301 concept space

  • Discussions where comments must follow particular formats and will be deleted otherwise, as on the r/AskHistorians subreddit or Stack Overflow.

  • Discussions where NVC is required

  • Discussions where NVC is banned

  • Personal Blogposts where all commenters are only allowed to speak in poetry.

  • Discussions where you need to be familiar with graduate level math to participate.

  • Discussions where authors feel free to delete any comment that doesn’t seem like it’s pulling its attentional weight.

  • Discussions where only colleagues the author personally knows and trusts get to participate.

Bubbling Up and Peer Review

Experimentation doesn’t mean splintering, or that LessWrong won’t have a central ethos connecting it. The reason we’re allowing user moderation on Frontpage posts is that we want good ideas to bubble up to the top, and we don’t want it to feel like a punishment if a personal blogpost gets promoted to Frontpage or Curated. If an idea (or discussional experiment) is successful, we want people to see it, and build off it.

Still, what sort of experimentation and norms to expect will vary depending on how much exposure a given post has.

On personal blogposts, pretty much anything goes.

On Frontpage posts, we will want to have some kind of standard, which I’m not sure we can formally specify. We’re restricting moderation tools to users with high karma, so that only people who’ve already internalized what LessWrong is about have access to them. We want experimentation that productively explores rational-discussion-space. (If you’re going to ask people to only comment in haiku on a frontpage post, you should have a pretty good reason as to why you think this will foster intellectual progress).

If you’re deleting anyone who disagrees with you even slightly, or criticizing other users without letting them respond, we’ll be having a talk with you. We may remove your mod privileges or restrict them to your personal blogposts.
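The karma gate described above can be sketched as simple permission logic. This is a hypothetical illustration only: the threshold values, function names, and the exact rule the actual LessWrong codebase uses are all assumptions, not the real implementation.

```python
# Hypothetical sketch of karma-gated moderation permissions.
# All thresholds and names are invented for illustration.

FRONTPAGE_MOD_KARMA = 2000  # assumed threshold, not the real value
PERSONAL_MOD_KARMA = 100    # assumed threshold, not the real value

def can_moderate_own_post(author_karma: int, is_frontpage: bool) -> bool:
    """An author may moderate comments on their own post only if their
    karma clears the bar for that post's visibility tier: a higher bar
    for Frontpage posts, a lower one for personal blogposts."""
    threshold = FRONTPAGE_MOD_KARMA if is_frontpage else PERSONAL_MOD_KARMA
    return author_karma >= threshold
```

The point of the design, however the details are tuned, is that only users who have already demonstrated sustained good contributions get moderation power over higher-visibility posts.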

Curated posts will (as they already do) involve a lot of judgment calls on the sitewide moderation team.

At some point, we might explore some kind of formal peer review process, for ideas that seem important enough to include in the LessWrong canon. But exploring that in full is beyond the scope of this post.

Norms for this comment section

With this post, I’m kinda intentionally summoning a demon thread. That’s okay. This is the official “argue about the moderation overton window changing” discussion space.

Still, some types of arguing seem more productive than others. It’s especially important for this particular conversation to be maximally transparent, so I won’t be deleting anything except blatant trolling. Comments that are exceptionally hostile, I might comment-lock, but leave visible with an explicit reason why.

But, if you want your comments or concerns to be useful, some informal suggestions:

Failure modes to watch out for:

  • If the Public Archipelago direction seems actively dangerous or otherwise awful, try to help solve the underlying problem. Right now, one of the most common concerns we’ve heard from people who we’d like to be participating on LessWrong is that the comments feel nitpicky, annoying, focused on unhelpful criticism, or unsafe. If you’re arguing that the Archipelago approach is fundamentally flawed, you’ll need to address this problem in some fashion. Comments that don’t at least acknowledge the magnitude of the tradeoff are unlikely to be persuasive.

  • If other commenters seem to have vastly different experiences than you, try to proactively understand them – solutions that don’t take into account diversity of experience are less useful.

Types of comments I expect to be especially useful:

  • Considerations we’ve missed. This is a fairly major experiment. We’ve tried to be pretty thorough about exploring the considerations here, but there are probably a lot we haven’t thought of.

  • Pareto Improvements. I expect there are a lot of opportunities to avoid making tradeoffs, instead finding third options that capture several benefits at once.

  • Specific tools you’d like to see. Ideally, tools that would enable a variety of experiments while ensuring that good content still gets to bubble up.

Ok. That was a bit of a journey. But I appreciate you bearing with me, and am looking forward to having a thorough discussion on this.