Meta-tations on Moderation: Towards Public Archipelago

The re­cent mod­er­a­tion tools an­nounce­ment rep­re­sents a fairly ma­jor shift in how the site ad­mins are ap­proach­ing LessWrong. Sev­eral peo­ple noted im­por­tant con­cerns about trans­parency and trust.

Those con­cerns de­serve an ex­plicit, thor­ough an­swer.

Sum­mary of Concepts

  1. The Prob­lem of Pri­vate Dis­cus­sion – Why much in­tel­lec­tual progress in the ra­tio­nal­sphere has hap­pened in hard-to-find places

  2. Public Dis­cus­sion vs In­tel­lec­tual Progress – Two sub­tly con­flict­ing pri­ori­ties for LessWrong.

  3. Healthy Disagree­ment – How to give au­thors tools to have the kinds of con­ver­sa­tions they want, with­out de­gen­er­at­ing into echo cham­bers.

  4. High Trust vs Func­tion­ing Low Trust en­vi­ron­ments – Differ­ent modes of feel­ing safe, with differ­ent costs and risks.

  5. Over­ton Win­dows, Per­sonal Crit­i­cism – Two com­mon con­ver­sa­tional at­trac­tors. Tempt­ing. Some­times im­por­tant. But rarely what an au­thor is in­ter­ested in talk­ing about.

  6. Public Archipelago – A model that takes all of the above into account, giving people the tools to create personal spaces that give them freedom to explore, while keeping all discussion public, so that it can be built upon, criticized, or refined.

i. The Problem

The is­sue with LessWrong that wor­ries me the most:

In the past 5 years or so, there’s been a lot of progress – on theoretical rationality, on practical epistemic and instrumental rationality, on AI alignment, on effective altruism. But much of this progress has happened in some combination of the following places:

  • On var­i­ous pri­vate blogs you need to keep track of.

  • On face­book – where dis­cus­sions are of­ten pri­vate, where search­ing for old com­ments is painful, and some peo­ple have blocked each other so it’s hard to tell what was ac­tu­ally said and who was able to read it.

  • On tum­blr, whose in­ter­face for fol­low­ing a con­ver­sa­tion is the most con­fus­ing thing I’ve ever seen.

  • On var­i­ous google docs, cir­cu­lated pri­vately.

  • In per­son, not writ­ten down at all.

Peo­ple have com­plained about this. I think a com­mon as­sump­tion is some­thing like “if we just got all the good peo­ple back on LessWrong at the same time you’d have a crit­i­cal mass that could re­boot the sys­tem.” That might help, but doesn’t seem suffi­cient to me.

I think LW2.0 has roughly suc­ceeded at be­com­ing “the hap­pen­ing place” again. But I still know sev­eral peo­ple who I in­tel­lec­tu­ally re­spect, who find LessWrong an ac­tively in­hos­pitable place and don’t post here, or do so only grudg­ingly.

More Than One Way For Dis­cus­sion To Die

I re­al­ize that there’s a very salient path­way for mod­er­a­tors to abuse their power. It’s easy to imag­ine how echo cham­bers could form and how reign-of-ter­ror style mod­er­a­tion could lead to, well, reigns of ter­ror.

It may be less salient to imag­ine a site sub­tly driv­ing in­tel­li­gent peo­ple away due to be­ing bor­ing, pedan­tic, or frus­trat­ing, but I think the lat­ter is in fact more com­mon, and a big­ger threat to in­tel­lec­tual progress.

The current LessWrong selects somewhat for people who are thick-skinned and conflict-prone. Being thick-skinned is good, all else being equal. Being conflict-prone is not. And neither of these is the same as being able to generate useful ideas and think clearly, the most important qualities to cultivate in LessWrong participants.

The site ad­mins don’t just have to think about the peo­ple cur­rently here. We have to think about peo­ple who have things to con­tribute, but don’t find the site re­ward­ing.

Face­book vs LessWrong

When I per­son­ally have a new idea to flesh out… well...

...I’d pre­fer a LessWrong post over a Face­book post. LW posts are more eas­ily link­able, they have rea­son­able for­mat­ting op­tions over FB’s plain text, and it’s eas­ier to be sure a lot of peo­ple have seen it.

But to dis­cuss those ideas…

In my heart of hearts, if I weren’t ac­tively work­ing on the LessWrong team, with a clear vi­sion of where this pro­ject is go­ing… I would pre­fer a Face­book com­ment thread to a LessWrong dis­cus­sion.

There are certain blogs (Sarah’s, Zvi’s, and Ben’s stick out in my mind) that are comparably good. But not many – the most common pattern is “post idea on blog, and the good discussion happens on FB, and individual comment insights only make it into the broader zeitgeist if someone mentions them in a high profile blogpost.”

On the right sort of Face­book com­ment thread, at least in my per­sonal filter bub­ble, I can ex­pect:

  • Peo­ple I in­tel­lec­tu­ally re­spect to show up and hash out ideas.

  • A col­lab­o­ra­tive at­ti­tude. “Let’s figure out and build a thing to­gether.”

  • Peo­ple who show up will share enough as­sump­tions that we can talk about re­fin­ing the idea to a us­able state, rather than “is this idea even worth talk­ing about?”

Beyond that, more sub­tle: even if I don’t know ev­ery­one, an in­tel­lec­tual dis­cus­sion on FB usu­ally feels like, well, we’re friends. Or at least al­lies.

Re­lat­edly: the num­ber of com­menters is man­age­able. The com­ments on Slat­estar­codex are rea­son­ably good these days, but… I’m just not go­ing to sift through hun­dreds or thou­sands of com­ments to find the gems. It feels like a fire­hose, not a con­ver­sa­tion.

Mean­while, the com­ments on LessWrong of­ten feel… nit­picky and pointless.

If an idea isn’t presented maximally defensibly, people will focus on tearing holes in the non-load-bearing parts of the idea, rather than helping refine the idea into something more robust. And there’ll be people who disagree with or don’t understand foundational elements that the idea is supposed to be building off of, and the discussion ends up being about rehashing 101-level things instead of building 201-level knowledge.

Filter Bubbles

An ob­vi­ous re­sponse to the above might be “of course you pre­fer Face­book over LessWrong. Face­book heav­ily filter bub­bles you so that you don’t have to face dis­agree­ment. It’s good to force your ideas to in­tense scrutiny.”

And there’s im­por­tant truth to that. But my two points are that:

  1. I think a case can be made that, dur­ing idea for­ma­tion, the kind of dis­agree­ment I find on Face­book, Google Docs and in-per­son is ac­tu­ally bet­ter from the stand­point of in­tel­lec­tual progress.

  2. Whether or not #1 turns out to be true, if peo­ple pre­fer pri­vate con­ver­sa­tions over pub­lic dis­cus­sions (be­cause they’re eas­ier/​more-fun/​safer), then much dis­cus­sion will tend to con­tinue tak­ing place in mostly pri­vate places, and no mat­ter how sub­op­ti­mal this is, it won’t change.

My ex­pe­rience is that my filter bub­bles (whether on FB, Google Docs or in-per­son) do in­volve a lot of dis­agree­ment, and the dis­agree­ment is higher qual­ity. When some­one tells me I’m wrong, it’s of­ten ac­com­panied by an at­tempt to un­der­stand what my goals are, or what the core of a new idea was, which ei­ther lets me fix an idea, or aban­don it but find some­thing bet­ter to ac­com­plish my origi­nal in­tent.

(On FB, this isn’t be­cause the av­er­age com­menter is that great, but be­cause of a small­ish num­ber of peo­ple I deeply re­spect, who have differ­ent paradigms of think­ing, at least 1-2 of whom will re­li­ably show up)

There seems to be a sense that good ideas form fully polished, without any work to refine them. Or that until an idea is ready for peer review, you should keep it to yourself? Or be willing to have people poke at it with no regard for how hedonically rewarding that experience is? I’m not sure what the assumption is, but it’s contrary to how everyone I personally know generates insights.

The early stages work best when playful and col­lab­o­ra­tive.

Peer re­view is im­por­tant, but so is idea for­ma­tion. Idea for­ma­tion of­ten in­volves run­ning with as­sump­tions, crash­ing them into things and see­ing if it makes sense.

You could keep idea-for­ma­tion pri­vate and then share things when they’re ‘pub­li­cly pre­sentable’, but I think this leads to peo­ple tend­ing to keep con­ver­sa­tion in “safe, pri­vate” zones longer than nec­es­sary. And mean­while, it’s valuable to be able to see the gen­er­a­tion pro­cess among re­spected thinkers.

Public Dis­cus­sion vs Knowl­edge Building

Some peo­ple have a vi­sion of Less Wrong as a pub­lic dis­cus­sion. You put your idea out there. A con­ver­sa­tion hap­pens. Any­one is free to re­spond to that con­ver­sa­tion as long as they aren’t be­ing ac­tively abu­sive. The best ideas rise to the top.

And this is a fine model, that should (and does) ex­ist in some places. But:

  1. It’s never ac­tu­ally been the model or ethos LessWrong runs on. Eliezer wrote Well Kept Gar­dens Die By Paci­fism years ago, and has always em­ployed a Reign-of-Ter­ror-es­que mod­er­a­tion style. You may dis­agree with this ap­proach, but it’s not new.

  2. A pub­lic dis­cus­sion is not nec­es­sar­ily the same as the ethos Habryka is ori­ent­ing around, which is to make in­tel­lec­tual progress.

Th­ese might seem like the same goal. And I share an aes­thetic sense that in the ‘should’ world, where things are fair, pub­lic dis­cus­sion and knowl­edge-build­ing are some­how the same goal.

But we don’t live in the ‘should’ world.

We live in the world where you get what you in­cen­tivize.

Yes, there’s a chilling effect when au­thors are free to delete com­ments that an­noy them. But there is a differ­ent chilling effect when au­thors aren’t free to have the sort of con­ver­sa­tion they’re ac­tu­ally in­ter­ested in hav­ing. The con­ver­sa­tion won’t hap­pen at all, or it’ll hap­pen some­where else (where you can’t com­ment on their stuff any­way).

A space can­not be uni­ver­sally in­clu­sive. So the ques­tion is: is LessWrong one space, tai­lored for only the types of peo­ple who en­joy that space? Or do we give peo­ple tools to make their own spaces?

If the former, who is that space for, and what rules do we set? What level of knowl­edge do we as­sume peo­ple must have? We’ve long since agreed “if you show up ar­gu­ing for cre­ation­ism, this just isn’t the space for you.” We’ve gen­er­ally agreed that if you are miss­ing con­cepts in the se­quences, it’s your job to ed­u­cate your­self be­fore try­ing to de­bate (al­though vet­er­ans should po­litely point you in the right di­rec­tion).

What about posts writ­ten since the se­quences ended?

What skills and/​or re­spon­si­bil­ities do we as­sume peo­ple must have? Do we as­sume peo­ple have the abil­ity to no­tice and speak up about their needs a la Sarah Con­stantin’s Hier­ar­chy of Re­quests? Do we re­quire them to be able to ex­press those needs ‘po­litely’? Whose defi­ni­tion of po­lite do we use?

No mat­ter which an­swer you choose for any of these ques­tions, some peo­ple are go­ing to find the re­sult­ing space in­hos­pitable, and take their con­ver­sa­tion el­se­where.

I’d much rather sidestep the ques­tion en­tirely.

A Public Archipelago Solution

Last year I ex­plored ap­ply­ing Scott Alexan­der’s Archipelago idea to­wards man­ag­ing com­mu­nity norms. Another quick re­cap:

Imag­ine a bunch of fac­tions fight­ing for poli­ti­cal con­trol over a coun­try. They’ve agreed upon the strict prin­ci­ple of harm (no phys­i­cally hurt­ing or steal­ing from each other). But they still dis­agree on things like “does pornog­ra­phy harm peo­ple”, “do cigarette ads harm peo­ple”, “does ho­mo­sex­u­al­ity harm the in­sti­tu­tion of mar­riage which in turn harms peo­ple?“, “does soda harm peo­ple”, etc.
And this is bad not just be­cause ev­ery­one wastes all this time fight­ing over norms, but be­cause the na­ture of their dis­agree­ment in­cen­tivizes them to fight over what harm even is.
And this in turn in­cen­tivizes them to fight over both defi­ni­tions of words (dis­tract­ing and time-wast­ing) and what counts as ev­i­dence or good rea­son­ing through a poli­ti­cally mo­ti­vated lens. (Which makes it harder to ever use ev­i­dence and rea­son­ing to re­solve is­sues, even un­con­tro­ver­sial ones)

And then...

Imag­ine some­one dis­cov­ers an archipelago of empty is­lands. And in­stead of con­tin­u­ing to fight, the peo­ple who want to live in Science­topia go off to found an is­land-state based on ideal sci­en­tific pro­cesses, and the peo­ple who want to live in Liber­topia go off and found a so­ciety based on the strict prin­ci­ple of harm, and the peo­ple who want to live in Chris­tian­topia go found a fun­da­men­tal­ist Chris­tian com­mune.
This lets you test more interesting ideas. If a hundred people have to agree on something, you’ll only get to try things that you can get 50+ people on board with (due to crowd inertia, regardless of whether you have a formal democracy).
But maybe you can get 10 peo­ple to try a more ex­treme ex­per­i­ment. (And if you share knowl­edge, both about ex­per­i­ments that work and ones that don’t, you can build the over­all body of com­mu­nity-knowl­edge in your so­cial world)

Tak­ing this a step farther is the idea of Public Archipelago, with is­lands that over­lap.

Let peo­ple cre­ate their own spaces. Let the con­ver­sa­tions be re­stricted as need be, but cen­tral­ized and pub­lic, so that ev­ery­one at least has the op­por­tu­nity to fol­low along, learn, re­spond and build off of each other’s ideas, in­stead of hav­ing to net­work their way into var­i­ous so­cial/​in­ter­net cir­cles to keep up with ev­ery­thing.

This necessarily means that not all of LessWrong will be a comfortable place for any given person, but it at least means a wider variety of people will be able to use it, which means a wider variety of ideas can be seen, critiqued, and built off of.

Healthy Disagreement

Now, there’s an ob­vi­ous re­sponse to my ear­lier point about “it’s frus­trat­ing to have to ex­plain 101-level things to peo­ple all the time.”

Maybe you’re not ex­plain­ing 101-level things. Maybe you’re ac­tu­ally just wrong about the foun­da­tions of your ideas, and your lit­tle walled gar­den isn’t a 201 space, it’s an echo cham­ber built on sand.

This is, in­deed, quite a prob­lem.

It’s an even harder prob­lem than you might think at first glance. It’s difficult to offer an in­formed cri­tique of some­thing that’s ac­tu­ally use­ful. I’m re­minded of Holden Karnofsky’s Thoughts on Public Dis­course:

For nearly a decade now, we’ve been putting a huge amount of work into putting the de­tails of our rea­son­ing out in pub­lic, and yet I am hard-pressed to think of cases (es­pe­cially in more re­cent years) where a pub­lic com­ment from an un­ex­pected source raised novel im­por­tant con­sid­er­a­tions, lead­ing to a change in views.
This isn’t be­cause no­body has raised novel im­por­tant con­sid­er­a­tions, and it cer­tainly isn’t be­cause we haven’t changed our views. Rather, it seems to be the case that we get a large amount of valuable and im­por­tant crit­i­cism from a rel­a­tively small num­ber of highly en­gaged, highly in­formed peo­ple. Such peo­ple tend to spend a lot of time read­ing, think­ing and writ­ing about rele­vant top­ics, to fol­low our work closely, and to have a great deal of con­text. They also tend to be peo­ple who form re­la­tion­ships of some sort with us be­yond pub­lic dis­course.
The feed­back and ques­tions we get from out­side of this set of peo­ple are of­ten rea­son­able but fa­mil­iar, seem­ingly un­rea­son­able, or difficult for us to make sense of.

The ob­vi­ous crit­i­cisms of an idea may have ob­vi­ous solu­tions. If you in­ter­rupt a 301 dis­cus­sion to ask “but have you con­sid­ered that you might be wrong about ev­ery­thing?”… well, yes. They have prob­a­bly no­ticed the skulls. This of­ten feels like 2nd-year un­der­grads ask­ing post-docs to flesh out ev­ery­thing they’re say­ing, us­ing con­cepts only available to the un­der­grads.

Still, peer re­view is a cru­cial part of the knowl­edge-build­ing pro­cess. You need high qual­ity cri­tique (and counter-cri­tique, and counter-counter-cri­tique). How do you square that with giv­ing an au­thor con­trol over their con­ver­sa­tion?

I hope (and fairly con­fi­dently be­lieve) that most au­thors, even ones em­ploy­ing Reign-of-Ter­ror style mod­er­a­tion poli­cies, will not delete com­ments willy nilly – and the site ad­mins will be proac­tively hav­ing con­ver­sa­tions with au­thors who seem to be abus­ing the sys­tem. But we do need safe­guards in case this turns out to be worse than we ex­pect.

The an­swer is pretty straight­for­ward: it’s not at all ob­vi­ous that the pub­lic dis­cus­sion of a post has to be on that par­tic­u­lar post’s com­ment sec­tion.

(Among other things, this is not how most sci­ence works, AFAICT, al­though tra­di­tional sci­ence leaves sub­stan­tial room for im­prove­ment any­how).

If you dis­agree with a post, and the au­thor deletes or blocks you from com­ment­ing, you are wel­come to write an­other post about your in­tel­lec­tual dis­agree­ment.

Yes, this means that peo­ple read­ing the origi­nal post may come away with an im­pres­sion that a con­tro­ver­sial idea is more ac­cepted than it re­ally is. But if that per­son looks at the front page of the site, and the idea is con­tro­ver­sial, there will be both other posts and re­cent com­ments ar­gu­ing about its mer­its.

It also means that no, you don’t au­to­mat­i­cally get the en­gage­ment of ev­ery­one who read the origi­nal post. I see this as a fea­ture, not a bug.

If you want your crit­i­cism to be read, it has to be good and well writ­ten. It doesn’t have to fit within the over­all zeit­geist of what’s cur­rently pop­u­lar or what the lo­cally high-sta­tus peo­ple think. Holden’s crit­i­cal Thoughts on Sin­gu­lar­ity In­sti­tute is one of the most highly up­voted posts of all time. (If any­thing, I think LessWrong folk are too ea­ger to show off their will­ing­ness to dis­sent and up­vote peo­ple just for be­ing con­trar­ian).

It does suck that you must be good at writ­ing and know your au­di­ence (which isn’t nec­es­sar­ily the same as good at think­ing). But this ap­plies just as much to be­ing the origi­nal au­thor of an idea, as to be­ing a critic.

The au­thor of a post doesn’t owe you their rhetor­i­cal strength and au­di­ence and plat­form to give you space to write your coun­ter­claim. We don’t want to in­cen­tivize peo­ple to protest quickly and loudly to gain mind­share in a pop­u­lar au­thor’s com­ment sec­tion. We want peo­ple to write good cri­tiques.

Mean­while, if you’re mak­ing an effort to un­der­stand an au­thor’s goals and frame dis­agree­ment in a way that doesn’t feel like an at­tack, I don’t an­ti­ci­pate this com­ing up much in the first place.

ii. Ex­pec­ta­tions and Trust

I think a deep disagreement underlies a lot of the debate over moderation: what sort of trust is important to you?

This is a bit of a di­gres­sion – al­most an es­say unto it­self – but I think it’s im­por­tant.

Ele­ments of Trust

Defin­ing trust is tricky, but here’s a stab at it: “Trust is hav­ing ex­pec­ta­tions of other peo­ple, and not hav­ing to worry about whether those ex­pec­ta­tions will be met.”

This has a few com­po­nents:

  • Which ex­pec­ta­tions do you care about be­ing up­held?

  • How much do you trust peo­ple in your en­vi­ron­ment to up­hold them?

  • What strate­gies do you pre­fer to re­solve the cog­ni­tive load that comes when you can’t trust peo­ple (or, are not sure if you can)?

Which ex­pec­ta­tions?

You might trust peo­ple…

  • to keep their promises and/​or mean what they say.

  • to care about your needs.

  • to up­hold par­tic­u­lar prin­ci­ples (clear think­ing, trans­parency).

  • to be able (and willing) to perform a particular skill (including things like noticing when you’re not saying what you mean).

Trust is a mul­ti­ple-place func­tion. Maybe you trust Alice to re­li­ably provide all the rele­vant in­for­ma­tion even if it makes her look bad. You trust Bob to pay at­ten­tion to your emo­tional state and not say trig­ger­ing things. You can count on Carl to call you on your own bul­lshit (and listen thought­fully when you call him on his). Eve will re­li­ably en­force her rules even when it’s so­cially in­con­ve­nient to do so.

You may care about differ­ent kinds of trust in differ­ent con­texts.

How much do you trust a per­son or space?

For the ex­pec­ta­tions that mat­ter most to you, do you gen­er­ally ex­pect them to be fulfilled, or do you have to con­stantly mon­i­tor and take ac­tion to en­sure them?

With a given per­son, or a par­tic­u­lar place, is your guard always up?

In high trust en­vi­ron­ments, you ex­pect other peo­ple to care about the same ex­pec­ta­tions you do, and fol­low through on them. This might mean look­ing out for each other’s in­ter­ests. Or, merely that you’re fo­cused on the same goals such that “each other’s in­ter­ests” doesn’t come into play.

High trust en­vi­ron­ments re­quire you to ei­ther per­son­ally know ev­ery­one, or to have strong rea­son to be­lieve in the se­lec­tion effects on who is pre­sent.


  • A small group of friends by a campfire might trust each other to care about each other’s needs and try to en­sure they are met (but not nec­es­sar­ily to have par­tic­u­lar skills re­quired to do so).

  • A young ide­olog­i­cal startup might trust each other to have skills, and to care about the vi­sion of the com­pany (but, per­haps not to ‘have each other’s back’ as the com­pany grows and money/​power be­comes up for grabs)

  • A small town, where families have lived for generations and share a culture.

  • A larger mil­i­tary bat­tal­ion, where ev­ery­one knows that ev­ery­one knows that ev­ery­one went through the same in­tense train­ing. They clearly have par­tic­u­lar skills, and would suffer pun­ish­ment if they don’t fol­low the or­ders from high com­mand.

Low trust en­vi­ron­ments are where you have no illu­sions that peo­ple are look­ing out for the things you care about.

The bar­ri­ers to en­try are low. Peo­ple come and go of­ten. Peo­ple of­ten rep­re­sent them­selves as if they are al­igned with you, but this is poor ev­i­dence for whether they are in fact al­igned with you. You must con­stantly have your guard up.


  • A large cor­po­ra­tion where no sin­gle per­son knows everybody

  • A large com­mu­nity with no par­tic­u­lar bar­rier to en­try be­yond show­ing up and talk­ing as if you un­der­stand the culture

  • A big city, with many cul­tures and sub­cul­tures con­stantly in­ter­fac­ing.

Trans­par­ent Low Trust, Cu­rated High Trust

Having to watch your back all the time is exhausting, and there are at least two strategy-clusters I can think of to alleviate that.

In a trans­par­ent low trust en­vi­ron­ment, you don’t need to rely on any­one’s word or good in­ten­tions. In­stead, you rely upon trans­parency and safe­guards built into the sys­tem.

It’s your re­spon­si­bil­ity to make use of those safe­guards to check that things are okay.

A cu­rated high trust en­vi­ron­ment has some kind of strong bar­rier to en­try. The ad­van­tage is that things can move faster, be more pro­duc­tive, re­quire less effort and con­flict, and fo­cus only on things you care about.

It’s the responsibility of the space’s owner to kick people out if they aren’t able to live up to the norms in the space. It’s your responsibility to decide whether you trust the space, and leave if you don’t.

The cur­rent at­mo­sphere at LessWrong is some­thing like “trans­par­ent medium trust.” There are rough, site-level filters on what kind of par­ti­ci­pa­tion is ac­cept­able – much moreso than the av­er­age in­ter­net hang­out. But not much micro­manag­ing on what pre­cise ex­pec­ta­tions to up­hold.

I think some people are expecting the new moderation tools to mean “we took a functioning medium trust environment and made it more dangerous, or just weirdly tweaked it, for the sake of removing a few extra annoying comments or catering to some inexplicable whims.”

But part of the goal here is to cre­ate a fun­da­men­tal phase shift, where types of con­ver­sa­tions are pos­si­ble that just weren’t in a medium-trust world.

Why High Trust?

Why take the risk of high trust? Aren’t you just ex­pos­ing your­self to peo­ple who might take ad­van­tage of you?

I know some peo­ple who’ve been re­peat­edly hurt, by try­ing to trust, and then hav­ing peo­ple ei­ther tram­ple all over their needs, or ac­tively be­tray them. Hu­mans are poli­ti­cal mon­keys that make up con­ve­nient sto­ries to make them­selves look good all the time. If you aren’t ac­tu­ally al­igned with your col­leagues, you will prob­a­bly even­tu­ally get burned.

And high trust en­vi­ron­ments can’t scale – too many peo­ple show up with too many differ­ent goals, and many of them are good at pre­sent­ing them­selves as al­igned with you (they may even think they’re al­igned with you), but… they are not.

LessWrong (most likely) needs to scale, so it’s im­por­tant for there to be spaces here that are Func­tion­ing Low Trust, that don’t rely on load-bear­ing au­thor­ity figures.

I do not recom­mend this blindly to ev­ery­one.

But. To mis­quote Umesh – “If you’re not oc­ca­sion­ally get­ting back­stabbed, you’re prob­a­bly not trust­ing enough.”

If you can trust the peo­ple around you, all the at­ten­tion you put into watch­ing your back can go to other things. You can ex­pect other peo­ple to look out for your needs, or help you in re­li­able ways. Your en­tire body phys­iolog­i­cally changes, no longer poised for fight or flight. It’s phys­i­cally healthier. In some cases it’s bet­ter for your epistemics – you’re less defen­sive when you don’t feel un­der at­tack, mak­ing it eas­ier to con­sider op­pos­ing points of view.

I live most of my life in high trust en­vi­ron­ments these days, and… let me tell you holy shit when it works it is amaz­ing. I know a cou­ple dozen peo­ple who I trust to be hon­est about their per­sonal needs, to be rea­son­ably at­ten­tive to mine, who are al­igned with me on how to re­solve in­ter­per­sonal stuff as well as Big Pic­ture How the Uni­verse Should Look Some­day.

When we dis­agree (as we of­ten do), we have a shared un­der­stand­ing of how to re­solve that dis­agree­ment.

Con­ver­sa­tions with those peo­ple are smooth, pro­duc­tive, and in­sight­ful. When they are not smooth, the pro­cess for figur­ing out how to re­solve them is smooth or at least mu­tu­ally agreed upon.

So when I come to LessWrong, where the com­ments as­sume at-most-medium trust… where I’m not able to set a higher or differ­ent stan­dard for a dis­cus­sion be­yond the low­est com­mon de­nom­i­na­tor…

It’s re­ally frus­trat­ing and sad, to have to choose be­tween a pub­lic-un­trusted and pri­vate-but-high-trust con­ver­sa­tion.

It’s worth noting: I participate in multiple spaces that I trust differently. Maybe I wouldn’t recommend particular friends join Alice’s space because, while she’s good at stating clear reasons for things, evaluating evidence clearly, and making sure others do the same, she’s not good at noticing when you’re triggered and pausing to check in if you’re okay.

And maybe Eve re­ally needs that. That’s usu­ally okay, be­cause Eve can go to Bob’s space, or run her own.

Some­times, Bob’s space doesn’t ex­ist, and Eve lacks the skills to at­tract peo­ple to a new space. This is re­ally im­por­tant and sad. I per­son­ally ex­pect LessWrong to con­tain a wide dis­tri­bu­tion of prefer­ences that can sup­port many needs, but it prob­a­bly won’t con­tain some­thing for ev­ery­one.

Still, I think it’s an over­all bet­ter strat­egy to make it eas­ier to cre­ate new sub­spaces than to try to ac­com­mo­date ev­ery­one at once.

Get­ting Burned

I ex­pect to get hurt some­times.

I ex­pect some friends (or my­self) to not always be at our best. Not always self-aware enough to avoid fal­ling into so­ciopoli­ti­cal traps that pit us against each other.

I ex­pect that at least some of the peo­ple I’m cur­rently al­igned with, I may even­tu­ally turn out to be un­al­igned with, and to come into con­flict that can’t be eas­ily re­solved. I’ve had friend­ships that turned weirdly and badly ad­ver­sar­ial and I spent months stress­fully deal­ing with it.

But the benefits of high trust are so great that I don’t re­gret for a sec­ond hav­ing spent the first few years with those friends in a high-trust re­la­tion­ship.

I ac­knowl­edge that I am pretty priv­ileged in hav­ing a set of needs and in­ter­per­sonal prefer­ences that are eas­ier to fit into a high trust en­vi­ron­ment. There are peo­ple who just don’t in­ter­face well with the sort of spaces I thrive in, who may never get the benefits of high trust, and that… re­ally sucks.

But the benefit of the Public Archipelago model is that there can be mul­ti­ple sub­sec­tions of the site with differ­ent norms. You can par­ti­ci­pate in dis­cus­sions where you trust the space owner. Some au­thors may clearly spell out norms and take the time to clearly ex­plain why they mod­er­ate com­ments, and maybe you trust them the most.

Some au­thors may not be will­ing to take that time. Maybe you trust them less, or maybe you know them well enough that you trust them any­how.

In ei­ther case, you know what to ex­pect, and if you’re not okay with it, you ei­ther don’t par­ti­ci­pate, or re­spond el­se­where, or put effort into un­der­stand­ing the au­thor’s goals so that you are able to write cri­tiques that they find helpful.

iii. The Fine Details

Okay, but can’t we at least re­quire rea­sons?

I don’t think many peo­ple were re­sis­tant to delet­ing com­ments – the con­tro­ver­sial fea­ture was “delete with­out trace.”

First, spam bots and dedicated adversaries with armies of sockpuppets make it necessary for this tool to at least be available. (LW2.0 has had posts with hundreds of spam or troll comments that we quietly delete and IP ban.)

For non-ob­vi­ous spam…

I do hope delete with­out trace is used rarely (or that au­thors send the com­menter a pri­vate rea­son when do­ing so). We plan to im­ple­ment the mod­er­a­tion log Said Ach­miz recom­mended, so that if some­one is delet­ing a lot of com­ments with­out trace you can at least go and check, and no­tice pat­terns. (We may change the name to “delete and hide”, since some kind of trace will be available).

All things be­ing equal, clear rea­sons are bet­ter than none, and more trans­parency is bet­ter than less.

But all things are not equal.

Moder­a­tion is work.

And I don’t think ev­ery­one un­der­stands that the amount of work varies a lot, both by vol­ume, and by per­son­al­ity type.

Some peo­ple get en­er­gized and ex­cited by read­ing through con­fronta­tional com­ments and re­spond­ing.

Some peo­ple find it in­cred­ibly drain­ing.

Some peo­ple get maybe a dozen com­ments on their ar­ti­cles a day. Some get barely any at all. But some au­thors get hun­dreds, and even if you’re the sort of per­son who is en­er­gized by it, there are only so many hours in a day and there are other things worth do­ing.

Some com­ments are not just mean or dumb, but im­mensely hate­ful and trig­ger­ing to the au­thor, and sim­ply glanc­ing at a re­minder that it ex­isted is painful – enough to undo the per­sonal benefit they got from hav­ing writ­ten their ar­ti­cle in the first place.

For many peo­ple, figur­ing out how to word a mod­er­a­tion no­tice is stress­ful, and I’m not sure whether it’s more in­tense on av­er­age to have to say:

“Please stop being rude and obnoxiously derailing threads”

or

“I’m sorry, I know you’re trying your best, but you’re asking a lot of obvious questions and making subtly bad arguments in ways that soak up the other commenters’ time. The colleagues that I’m trying to attract to these discussion threads are tired of dealing with you.”

Not to men­tion that mod­er­a­tion of­ten in­volves peo­ple get­ting an­gry at you, so you don’t just have to come up with the ini­tial posted rea­son, but also deal with a bunch of fol­lowup that can wreck your week. Com­ments that leave a trace in­vite peo­ple to ar­gue.

Moder­a­tion can be te­dious. Moder­a­tion can be stress­ful. Moder­a­tion is gen­er­ally un­paid. Moder­a­tors can burn out or de­cide “you know what, this just isn’t worth the time and bul­lshit.”

And this is often the worst deal for the best authors, since the best authors attract more comments, and sometimes end up acquiring a sort of celebrity status where commenters no longer quite see them as people, and feel justified (or even obligated) to go out of their way to take them down a peg.

If none of this makes sense to you, if you can’t imag­ine mod­er­at­ing be­ing this big a deal… well… all I can say is it just re­ally is a god damn big deal. It re­ally re­ally is.

There is a trade­off we have to make, one way or an­other, on whether we want to force our best au­thors to fol­low clear, leg­ible pro­ce­dures, or to write and en­gage more.

Requiring the former can end up punishing the latter, and it has.

We prioritized building the delete-and-hide function because Eliezer asked for it and we wanted to get him posting again quickly. But he is not the only author to have asked for it or expressed appreciation for it.

In­cen­tiviz­ing Good Ideas and Good Criticism

I’ll make an even stronger claim here: pun­ish­ing idea gen­er­a­tion is worse than pun­ish­ing crit­i­cism.

You cer­tainly need both, but crit­i­cism is eas­ier. There might be en­vi­ron­ments where there isn’t enough quan­tity or qual­ity of crit­ics, but I don’t think LessWrong is one of them. In­so­far as we don’t have good enough crit­i­cism, it’s be­cause the cri­tiques are nit­picky and un­helpful in­stead of try­ing to deeply un­der­stand un­fa­mil­iar ideas and col­lab­o­ra­tively im­prove their load-bear­ing cruxes.

And mean­while, I think the best crit­ics also tend to be the best idea-gen­er­a­tors – the two skills are in fact tightly cou­pled – so mak­ing LessWrong a place they feel ex­cited to par­ti­ci­pate in seems very im­por­tant.

It’s possible to go too far in this direction. There are reasonable cases for making different tradeoffs, which different corners of the internet might employ. But our decision on LessWrong is that authors are not obligated to put in that work if it’s stressful.

Over­ton Win­dows, and Per­sonal Criticism

There’s a few styles of com­ments that re­li­ably make me go “ugh, this is go­ing to be­come a mess and I re­ally don’t want to deal with it.” Com­ments whose sub­stance is “this idea is bad, and should not be some­thing LessWrong talks about.”

In that mo­ment, the con­ver­sa­tion stops be­ing about what­ever the idea was, and starts be­ing about poli­tics.

A re­cent ex­am­ple is what I’d call “fuzzy sys­tem 1 stuff.” The Ken­sho and Cir­cling threads felt like they were mostly ar­gu­ing about “is it even okay to talk about fuzzy sys­tem 1 in­tu­itions in ra­tio­nal dis­course?”. If you wanted to talk about the core ideas and how to use them effec­tively, you had to wade through a gi­ant, sprawl­ing de­mon thread.

Now, it’s ac­tu­ally pretty im­por­tant whether fuzzy sys­tem 1 in­tu­itions have a place in ra­tio­nal dis­course. It’s a con­ver­sa­tion that needs to hap­pen, a ques­tion that prob­a­bly has a right an­swer that we can con­verge on (albeit a nu­anced one that de­pends on cir­cum­stances).

But right now, it seems like the only dis­cus­sion that’s pos­si­ble to have about them is “are these in the over­ton win­dow or not?”. There needs to be space to ex­plore ideas that aren’t cur­rently in the ac­cepted paradigm.

I’d even claim that do­ing that pro­duc­tively is one of the things ra­tio­nal­ity is for.

Similar issues abound with critiquing someone’s tone, or otherwise critiquing a person rather than an idea. Comments like that tend to quickly dominate the discussion and make it hard to talk about anything else. In many cases, if the comment were a private message, it could have been taken as constructive criticism instead of a personal attack that inflames people’s tribal instincts.

For per­sonal crit­i­cism, I think the solu­tion is to build tools that make pri­vate dis­cus­sion eas­ier.

For Over­ton Win­dow poli­ti­cal brawls, I think the brawl it­self is in­evitable (if some­one wants to talk about a con­tro­ver­sial thing, and other peo­ple don’t want them to talk about the con­tro­ver­sial thing, you can’t avoid the con­flict). But I think it’s rea­son­able for au­thors to say “if we’re go­ing to have the over­ton dis­cus­sion, can we have it some­where else? Right here, I’m try­ing to talk about the ram­ifi­ca­tions of X if Y is true.”

Mean­while, if you think X or Y are ac­tively dan­ger­ous, you can still down­vote their post. In­stead of ev­ery­one in­vest­ing end­less en­ergy in mul­ti­ple de­mon threads, the is­sue can be re­solved via a sin­gle thread, and the karma sys­tem.

I don’t think this would have helped with the most re­cent thread, but it’s an op­tion I’d want available if I ever ex­plored a con­tro­ver­sial topic in the fu­ture.

iv. Towards Public Archipelago

This is a complicated topic, and the decision is going to affect people. If you’re the sort of person for whom the status quo seemed just perfect, your experience is probably going to become worse.

I do think that is sad, and it’s im­por­tant to own it, and apol­o­gize – I think hav­ing a place that felt safe and home and right be­come a place that feels alienat­ing and wrong is in fact among the worst things that can hap­pen to a per­son.

But the con­se­quences of not mak­ing some ma­jor changes seem too great to ig­nore.

The previous iteration of LessWrong died. It depended on skilled writers continuously posting new content. It dried up as, one by one, they decided LessWrong wasn’t the best place for them to publish or brainstorm.

There’s a lot of rea­sons they made that choice. I don’t know that our cur­rent ap­proach will solve the prob­lem. But I strongly be­lieve that to avoid the same fate for LessWrong 2.0, it will need to be struc­turally differ­ent in some ways.

An At­mo­sphere of Experimentation

We have some par­tic­u­lar tools, and plans, to give au­thors the same con­trol they’d have over a pri­vate blog, to re­duce the rea­sons to move el­se­where. This may or may not help. But be­neath the mod­er­a­tion tools and Public Archipelago con­cept is an un­der­ly­ing ap­proach of ex­per­i­men­ta­tion.

At a high level, the LessWrong 2.0 team will be ex­per­i­ment­ing with the site de­sign. We want this to per­co­late through the site – we want au­thors to be able to ex­per­i­ment with modal­ities of dis­cus­sion. We want to provide use­ful, flex­ible tools to help them do so.

Eventually we’d like users to experiment both with their overall moderation policy and culture and with the norms for individual posts.

Ex­per­i­ments I’d per­son­ally like to see:

  • Posts where all com­menters are re­quired to fully jus­tify their claims, such that com­plete strangers with no pre­con­cep­tions can ver­ify them

  • Posts where all com­menters are re­quired to take a few ideas as given, to see if they have in­ter­est­ing im­pli­ca­tions in 201 or 301 con­cept space

  • Discussions where comments must follow particular formats and will be deleted otherwise, as on the r/AskHistorians subreddit or Stack Overflow.

  • Discussions where NVC (Nonviolent Communication) is required

  • Dis­cus­sions where NVC is banned

  • Per­sonal Blog­posts where all com­menters are only al­lowed to speak in po­etry.

  • Dis­cus­sions where you need to be fa­mil­iar with grad­u­ate level math to par­ti­ci­pate.

  • Dis­cus­sions where au­thors feel free to delete any com­ment that doesn’t seem like it’s pul­ling its at­ten­tional weight.

  • Dis­cus­sions where only col­leagues the au­thor per­son­ally knows and trusts get to par­ti­ci­pate.

Bub­bling Up and Peer Review

Ex­per­i­men­ta­tion doesn’t mean splin­ter­ing, or that LessWrong won’t have a cen­tral ethos con­nect­ing it. The rea­son we’re al­low­ing user mod­er­a­tion on Front­page posts is that we want good ideas to bub­ble up to the top, and we don’t want it to feel like a pun­ish­ment if a per­sonal blog­post gets pro­moted to Front­page or Cu­rated. If an idea (or dis­cus­sional ex­per­i­ment) is suc­cess­ful, we want peo­ple to see it, and build off it.

Still, what sort of ex­per­i­men­ta­tion and norms to ex­pect will vary de­pend­ing on how much ex­po­sure a given post has.

On per­sonal blog­posts, pretty much any­thing goes.

On Front­page posts, we will want to have some kind of stan­dard, which I’m not sure we can for­mally spec­ify. We’re re­strict­ing mod­er­a­tion tools to users with high karma, so that only peo­ple who’ve already in­ter­nal­ized what LessWrong is about have ac­cess to them. We want ex­per­i­men­ta­tion that pro­duc­tively ex­plores ra­tio­nal-dis­cus­sion-space. (If you’re go­ing to ask peo­ple to only com­ment in haiku on a front­page post, you should have a pretty good rea­son as to why you think this will foster in­tel­lec­tual progress).

If you’re delet­ing any­one who dis­agrees with you even slightly, or crit­i­ciz­ing other users with­out let­ting them re­spond, we’ll be hav­ing a talk with you. We may re­move your mod priv­ileges or re­strict them to your per­sonal blog­posts.

Curated posts will (as they already do) involve a lot of judgment calls by the sitewide moderation team.

At some point, we might ex­plore some kind of for­mal peer re­view pro­cess, for ideas that seem im­por­tant enough to in­clude in the LessWrong canon. But ex­plor­ing that in full is be­yond the scope of this post.

Norms for this com­ment section

With this post, I’m kinda in­ten­tion­ally sum­mon­ing a de­mon thread. That’s okay. This is the offi­cial “ar­gue about the mod­er­a­tion over­ton win­dow chang­ing” dis­cus­sion space.

Still, some types of ar­gu­ing seem more pro­duc­tive than oth­ers. It’s es­pe­cially im­por­tant for this par­tic­u­lar con­ver­sa­tion to be max­i­mally trans­par­ent, so I won’t be delet­ing any­thing ex­cept blatant trol­ling. Com­ments that are ex­cep­tion­ally hos­tile, I might com­ment-lock, but leave visi­ble with an ex­plicit rea­son why.

But, if you want your com­ments or con­cerns to be use­ful, some in­for­mal sug­ges­tions:

Failure modes to watch out for:

  • If the Public Archipelago di­rec­tion seems ac­tively dan­ger­ous or oth­er­wise awful, try to help solve the un­der­ly­ing prob­lem. Right now, one of the most com­mon con­cerns we’ve heard from peo­ple who we’d like to be par­ti­ci­pat­ing on LessWrong is that the com­ments feel nit­picky, an­noy­ing, fo­cused on un­helpful crit­i­cism, or un­safe. If you’re ar­gu­ing that the Archipelago ap­proach is fun­da­men­tally flawed, you’ll need to ad­dress this prob­lem in some fash­ion. Com­ments that don’t at least ac­knowl­edge the mag­ni­tude of the trade­off are un­likely to be per­sua­sive.

  • If other com­menters seem to have vastly differ­ent ex­pe­riences than you, try to proac­tively un­der­stand them – solu­tions that don’t take into ac­count di­ver­sity of ex­pe­rience are less use­ful.

Types of com­ments I ex­pect to be es­pe­cially use­ful:

  • Considerations we’ve missed. This is a fairly major experiment. We’ve tried to be pretty thorough about exploring the considerations here, but there are probably a lot we haven’t thought of.

  • Pareto Improvements. I expect there are a lot of opportunities to avoid making tradeoffs, instead finding third options that get as many different benefits as possible at once.

  • Spe­cific tools you’d like to see. Ideally, tools that would en­able a va­ri­ety of ex­per­i­ments while en­sur­ing that good con­tent still gets to bub­ble up.

Ok. That was a bit of a jour­ney. But I ap­pre­ci­ate you bear­ing with me, and am look­ing for­ward to hav­ing a thor­ough dis­cus­sion on this.