Partial summary of debate with Benquo and Jessicata [pt 1]

Note: I’ll be try­ing not to en­gage too much with the ob­ject level dis­cus­sion here – I think my marginal time on this topic is bet­ter spent think­ing and writ­ing longform thoughts. See this com­ment.

Over the past couple months there was some extended discussion including myself, Habryka, Ruby, Vaniver, Jim Babcock, Zvi, Ben Hoffman, Jessicata and Zack Davis. The discussion has covered many topics, including “what is reasonable to call ‘lying’”, “what are the best ways to discuss and/or deal with deceptive patterns in public discourse”, “what norms and/or principles should LessWrong aspire to”, and others.

This in­cluded com­ments on LessWrong, email, google-docs and in-per­son com­mu­ni­ca­tion. This post is in­tended as an eas­ier-to-read col­lec­tion of what seemed (to me) like key points, as well as in­clud­ing my cur­rent take­aways.

Part of the challenge here was that it seemed like Ben­quo and I had mostly similar mod­els, but many cri­tiques I made seemed to Ben to be in the wrong ab­strac­tion, and vice-versa. Some­times I would no­tice par­tic­u­lar differ­ences like “In my model, it’s im­por­tant that ac­cu­sa­tions be held to a high stan­dard”, whereas Ben felt “it’s im­por­tant that crit­i­cism not be held to higher stan­dards than praise.” But know­ing this didn’t seem to help much.

This post mostly sum­ma­rizes ex­ist­ing on­line con­ver­sa­tion. I’m hop­ing to do a fol­lowup where I make good on my promise to think more se­ri­ously through my cruxes-on-on­tol­ogy, but it’s slow go­ing.

Com­ment Highlights

This be­gins with some com­ment high­lights from LessWrong which seemed use­ful to gather in one place, fol­lowed by my take­aways af­ter the fact.

I at­tempt to pass Ben’s ITT

In the com­ments of “Ra­tion­al­iza­tion” and “Sit­ting Bolt Upright in Alarm”, a few things even­tu­ally clicked and I at­tempted to pass Ben’s Ide­olog­i­cal Tur­ing Test:

Let me know how this sounds as an ITT:

Think­ing and build­ing a life for yourself

  • Much of civ­i­liza­tion (and the ra­tio­nal­sphere as a sub­set of it and/​or meme­plex that’s in­fluenced and con­strained by it) is gen­er­ally pointed in the wrong di­rec­tion. This has many facets, many of which re­in­force each other. So­ciety tends to:

  • Schools sys­tem­at­i­cally teach peo­ple to as­so­ci­ate rea­son with listen­ing-to/​pleas­ing-teach­ers, or mov­ing-words-around un­con­nected from re­al­ity. [Order of the Soul]

  • Society systematically pushes people to live apart from each other, and to work until they need (or believe they need) palliatives, in a way that doesn’t give you space to think [Sabbath Hard and Go Home]

  • Re­lat­edly, so­ciety pro­vides struc­ture that in­cen­tivizes you to ad­vance in ar­bi­trary hi­er­ar­chy, or to tread wa­ter and barely stay afloat, with­out re­flec­tion of what you ac­tu­ally want.

  • By contrast, for much of history, there was a much more direct connection between what you did, how you thought, and how your own life was bettered. If you wanted a nicer home, you built a nicer home. This came with many overlapping incentive structures that reinforced something closer to living healthily and generating real value.

  • (I’m guess­ing a sig­nifi­cant con­fu­sion was me see­ing this whole sec­tion as only mod­er­ately con­nected rather than cen­tral to the other sec­tions)

We des­per­ately need clarity

  • There’s a col­lec­tion of pres­sures, in many-but-not-all situ­a­tions, to keep both facts and de­ci­sion-mak­ing prin­ci­ples obfus­cated, and to warp lan­guage in a way that en­ables that. This is of­ten part of an over­all strat­egy (some­times con­scious, some­times un­con­scious) to ma­neu­ver groups for per­sonal gain.

  • It’s important to be able to speak plainly about forces that obfuscate. It’s important to lean _fully_ into clarity and plainspeak, not just taking marginal steps towards it, both because clear language is very powerful intrinsically, and there’s a sharp dropoff as soon as ambiguity leaks in (moving the conversation to higher simulacrum levels, at which point it’s very hard to recover clarity).

[Least con­fi­dent] The best fo­cus is on your own de­vel­op­ment, rather than op­ti­miz­ing sys­tems or other people

  • Here I be­come a lot less con­fi­dent. This is my at­tempt to sum­ma­rize what­ever’s go­ing on in our dis­agree­ment about my “When co­or­di­nat­ing at scale, com­mu­ni­cat­ing has to re­duce grace­fully to about 5 words” thing. I had an im­pres­sion that this seemed deeply wrong, con­fus­ing, or threat­en­ing to you. I still don’t re­ally un­der­stand why. But my best guesses in­clude:

  • This is putting the lo­cus of con­trol in the group, at a mo­ment-in-his­tory where the most im­por­tant thing is re­assert­ing in­di­vi­d­ual agency and think­ing for your­self (be­cause many groups are do­ing the wrong-things listed above)

  • Insofar as group coordination is a lens to be looked through, it’s important that groups are working in a way that respects everyone’s agency and ability to think (to avoid falling into some of the failure modes associated with the first bullet point), and simplifying your message so that others can hear/act on it is part of an overall strategy that is causing harm

  • Possibly a simpler “people can and should read a lot and engage with more nuanced models, and most of the reason you might think that they can’t is because school and hierarchical companies warped your thinking about that?”

And then, in light of all that, something is off with my mood when I’m engaging with individual pieces of that, because I’m not properly oriented around the other pieces?

Does that sound right? Are there important things left out or gotten wrong?

Ben re­sponded:

This sounds re­ally, re­ally close. Thanks for putting in the work to pro­duce this sum­mary!

I think my ob­jec­tion to the 5 Words post fits a pat­tern where I’ve had difficulty ex­press­ing a class of ob­jec­tion. The literal con­tent of the post wasn’t the main prob­lem. The main prob­lem was the em­pha­sis of the post, in con­junc­tion with your other be­liefs and be­hav­ior.

It seemed like the hid­den sec­ond half of the core claim was “and there­fore we should co­or­di­nate around sim­pler slo­gans,” and not the ob­vi­ous al­ter­na­tive con­clu­sion “and there­fore we should scale up more care­fully, with an un­com­pro­mis­ing em­pha­sis on some as­pects of qual­ity con­trol.” (See On the Con­struc­tion of Bea­cons for the rele­vant ar­gu­ment.)

It seemed to me like there was some mo­ti­vated am­bi­guity on this point. The em­pha­sis seemed to con­sis­tently recom­mend pub­lic be­hav­ior that was about mo­bi­liza­tion rather than dis­course, and back-chan­nel dis­cus­sions among well-con­nected peo­ple (in­clud­ing me) that felt like they were more about es­tab­lish­ing com­pat­i­bil­ity than mak­ing in­tel­lec­tual progress. This, even though it seems like you ex­plic­itly agree with me that our cur­rent so­cial co­or­di­na­tion mechanisms are mas­sively in­ad­e­quate, in a way that (to me ob­vi­ously) im­plies that they can’t pos­si­bly solve FAI.

I felt like if I pointed this kind of thing out too explicitly, I’d just get scolded for being uncharitable. I didn’t expect, however, that this scolding would be accompanied by an explanation of what specific, anticipation-constraining, alternative belief you held. I’ve been getting better at _pointing out this pattern_ (e.g. my recent response to habryka) instead of just shutting down due to a preverbal recognition of it. It’s very hard to write a comment like this one clearly and without extraneous material, especially of a point-scoring or whining nature. (If it were easy I’d see more people writing things like this.)

Sum­mary of Pri­vate LessWrong Thread (Me/​Ben­quo/​Jes­sica)

One experiment we tried during the conversation was to hold a conversation on LessWrong, in a private draft (i.e. where we could respond to each other with nested threading, but only have to worry about responding to each other).

The thread was started by Ruby, with some proposals for LessWrong moderation style. At first the conversation was primarily Ruby and Zvi. At some point Ruby might make the full thread public, but for now I’m focusing on an exchange between Benquo, Jessica and me, which I found most helpful for clarifying our positions.


Benquo wrote:

It might help for me to also try to make a positive statement of what I think is at stake here. [...]

What I see as under threat is the ability to say in a way that’s actually heard, not only that opinion X is false, but that the process generating opinion X is untrustworthy, and perhaps actively optimizing in an objectionable direction. Frequently, attempts to say this are construed _primarily_ as moves to attack some person or institution, pushing them into the outgroup. Frequently, people suggest to me an “equivalent” wording with a softer tone, which in fact omits important substantive criticisms I mean to make, while claiming to understand what’s at issue.

I re­sponded:

My core claim is: “right now, this isn’t possible without a) it being heard by many people as an attack, and b) people having to worry that other people will see it as an attack, even if they don’t.”

It seems like you see this as something like “there’s a precious thing that might be destroyed” and I see it as “a precious thing does not exist and must be created, and the circumstances in which it can exist are fragile.” It might have existed in the very early days of LessWrong. But the landscape now is very different than it was then. With billions of dollars available and at stake, what worked then can’t be the same thing as what works now.

[in public. In private things are much easier. It’s *also* the case that private channels enable collusion – that was an update I’ve made over the course of the conversation.]

And, while I believe that you earnestly believe that the quoted paragraph is important, your individual statements often look too optimized-as-an-obfuscated-attack for me to trust that they are not. I assign substantial probability to a lot of your motives being basically traditional coalition-political and you are just in denial about it, with a complicated narrative to support them. If that’s not true, I realize it must be extremely infuriating to be treated that way. But the nature of the social landscape makes it a bad policy for me to take you at your word in many of the cases.

Wish­ing the game didn’t ex­ist doesn’t make the game not ex­ist. We could all agree to stop play­ing at once, but a) we’d need to cred­ibly be­lieve we were all ac­tu­ally go­ing to stop play­ing at once, b) have en­force­ment mechanisms to make sure it con­tinues not be­ing played, c) have a way to en­sure new­com­ers are also not play­ing.

And I think that’s all possibly achievable, incrementally. I think “how to achieve that” is a super important question. But attempting to not-play the game without putting in that effort looks to me basically like putting a sign that says “cold” on a broken refrigerator and expecting your food to stay fresh.


I spent a few min­utes try­ing to gen­er­ate cruxes. Get­ting to “real” cruxes here feels fairly hard and will prob­a­bly take me a cou­ple hours. (I think this con­ver­sa­tion is close to the point where I’d re­ally pre­fer us to each switch to the role of “Pass each other’s ITTs, and figure out what would make our­selves change our mind” rather than “figure out how to ex­plain why we’re right.” This may re­quire more model-shar­ing and trust-build­ing first, dunno)

But I think the closest proximate crux is: I would trust Ben’s world-model a lot more if I saw a lot more discussion of how the game theory plays out over multiple steps. I’m not that confident that my interpretation of the game theory and social landscape is right. But I can’t recall any explorations of it, and I think it should be at least 50% of the discussion here.

Jessica responds to me, quoting part of my comment:

“But the landscape now is very different than it was then. With billions of dollars available and at stake, what worked then can’t be the same thing as what works now.”

Is this a claim that peo­ple are al­most cer­tainly go­ing to be pro­tect­ing their rep­u­ta­tions (and also be­liefs re­lated to their rep­u­ta­tions) in anti-epistemic ways when large amounts of money are at stake, in a way they wouldn’t if they were just mem­bers of a philos­o­phy club who didn’t think much money was at stake?

This claim seems true to me. We might ac­tu­ally have a lot of agree­ment. And this matches my im­pres­sion of “EA/​ra­tio­nal­ity shift from ‘that which can be de­stroyed by the truth should be’ norms to­wards ‘pro­tect feel­ings’ norms as they have grown and want to play nicely with power play­ers while main­tain­ing their own power.”

If we agree on this point, the re­main­ing dis­agree­ment is likely about the game the­ory of break­ing the bad equil­ibrium as a small group, as you’re say­ing it is.

(Also, thanks for bring­ing up money/​power con­sid­er­a­tions where they’re rele­vant; this makes the dis­cus­sion much less obfus­cated and much more likely to reach cruxes)

[Note, my im­pres­sion is that the pre­cious thing already ex­ists among a small num­ber of peo­ple, who are try­ing to main­tain and grow the pre­cious thing and are run­ning into op­po­si­tion, and enough such op­po­si­tion can cause the pre­cious thing to go away, and the pre­cious thing is cur­rently be­ing main­tained largely through will­ing­ness to force­fully push through op­po­si­tion. Note also, if the pre­cious thing used to ex­ist (among peo­ple with strong stated will­ing­ness to main­tain it) and now doesn’t, that in­di­cates that forces against this pre­cious thing are strong, and have to be op­posed to main­tain the pre­cious thing.]

I re­sponded:

An im­por­tant thing I said ear­lier in an­other thread was that I saw roughly two choices for how to do the pre­cious thing, which is some­thing like:

  • If you want to do the pre­cious thing in pub­lic (in par­tic­u­lar when billions of dol­lars are at stake, al­though also when nar­ra­tive and com­mu­nity buy-in are at stake), it re­quires a lot of spe­cial effort, and is costly

  • You can totally do the precious thing in small private groups, and it’s much easier

  • And I think a big chunk of the disagreement comes from the ‘small private groups are also a way that powerful groups collude, are duplicitous, and do other things in that space.’

[There’s a sep­a­rate is­sue, which is that re­searchers might feel more pro­duc­tive, lo­cally, in pri­vate. But failure to write up their ideas pub­li­cly means other peo­ple can’t build on them, which is globally worse. So you also want some pres­sure on re­search groups to pub­lish more]

So the prob­lem-fram­ing as I cur­rently see it is:

  • What are the least costly ways you can have plain­spo­ken truth in pub­lic, with­out de­stroy­ing (or re­sult­ing in some­one else de­stroy­ing) the shared pub­lic space. Or, what col­lec­tion of pub­lic truth­seek­ing norms out­put the most use­ful true things per unit of effort in a sus­tain­able fashion

  • What are ways that we can cap­ture the benefits of pri­vate spaces (some­times re­cruit­ing new peo­ple into the pri­vate spaces), while hav­ing sys­tems/​norms/​coun­ter­fac­tual-threats in place to pre­vent col­lu­sion and du­plic­ity, and en­courage more fre­quent pub­lish­ing of re­search.

And the over­all strat­egy I cur­rently ex­pect to work best (but with weak con­fi­dence, haven’t thought it through) is:

  • Change the default of private conversations from “stay private forever” to “by default, start in private, but with an assumption that the conversation will usually go public unless there’s a good reason not to, with participants having veto* power if they think it’s important not to go public.”

  • An al­ter­nate take on “the con­ver­sa­tion goes pub­lic” is “the par­ti­ci­pants write up a dis­til­la­tion of the con­ver­sa­tion that’s more op­ti­mized for peo­ple to learn what hap­pened, which both par­ti­ci­pants en­dorse.” (i.e. while I’m fine with all my words in this pri­vate thread be­ing shared, I think try­ing to read the en­tire con­ver­sa­tion might be more con­fus­ing than it needs to be. It might not be worth any­one’s time to write up a dis­til­la­tion, but if some­one felt like it I think that’d be prefer­able all else be­ing equal)

  • Have this formally counterbalanced by “if people seem to be abusing their veto power for collusion or duplicitous purposes, have counterfactual threats to publicly harm each other’s reputation (possibly betraying the veto-process*), which hopefully doesn’t happen, but the threat of it happening keeps people honest.”

*Im­por­tantly, a for­mal part of the veto sys­tem is that if peo­ple get an­gry enough, or de­cide it’s im­por­tant enough, they can just ig­nore your veto. If the game is rigged, the cor­rect thing to do is kick over the game­board. But, ev­ery­one has a shared un­der­stand­ing that a game­board is bet­ter than no game­board, so in­stead, peo­ple are in­cen­tivized to not rig the game (or, if the game is cur­rently rigged, work to­gether to de-rig it)

Because everyone agrees that these are the rules of the metagame, betraying the confidence of the private space is seen as a valid action (i.e. if people didn’t agree that these were the meta-rules, I’d consider betraying someone’s confidence to be a deeply bad sign about a person’s trustworthiness. But if people _do_ agree to the meta-rules, then if someone betrays a veto it’s a sign that you should maybe be hesitant to collaborate with that person, but not as strong a sign about their overall trustworthiness)


Jessica responded:

I’m first going to summarize what I think you think:

  • $Billions are at stake.

  • Peo­ple/​or­ga­ni­za­tions are giv­ing pub­lic nar­ra­tives about what they’re do­ing, in­clud­ing ones that af­fect the $billions.

  • Peo­ple/​or­ga­ni­za­tions also have nar­ra­tives that func­tion for main­tain­ing a well-func­tion­ing, co­he­sive com­mu­nity.

  • Peo­ple crit­i­cize these nar­ra­tives some­times. Th­ese crit­i­cisms have con­se­quences.

  • Con­se­quences in­clude: Peo­ple feel the need to defend them­selves. Peo­ple might lose fund­ing for them­selves or their or­ga­ni­za­tion. Peo­ple might fall out of some “in­group” that is hav­ing the im­por­tant dis­cus­sions. Peo­ple might form coal­i­tions that tear apart the com­mu­nity. The over­all trust level in the com­mu­nity, in­clud­ing will­ing­ness to take the sen­si­ble ac­tions that would be im­plied by the com­mu­nity nar­ra­tive, goes down.

  • That doesn’t mean crit­i­cism of such nar­ra­tives is always bad. Some­times, it can be done well.

  • Crit­i­cisms are im­por­tant to make if the crit­i­cism is re­ally clear and im­por­tant (e.g. the crit­i­cism of ACE). Then, peo­ple can take ap­pro­pri­ate ac­tion, and it’s clear what to do. (See strong and clear ev­i­dence)

  • Crit­i­cisms are po­ten­tially de­struc­tive when they don’t set­tle the mat­ter. Th­ese can end up re­duc­ing co­he­sion/​trust, split­ting the com­mu­nity, tar­nish­ing rep­u­ta­tions of peo­ple who didn’t ac­tu­ally do some­thing wrong, etc.

  • Th­ese non-mat­ter-set­tling crit­i­cisms can still be im­por­tant to make. But, they should be done with sen­si­tivity to the poli­ti­cal dy­nam­ics in­volved.

  • Peo­ple mak­ing pub­lic crit­i­cisms willy-nilly would lead to a bunch of bad effects (already men­tioned). There are stan­dards for what makes a good crit­i­cism, where “it’s true/​well-ar­gued” is not the only stan­dard. (Other stan­dards are: is it clear, is it em­pa­thetic, did the critic try other chan­nels first, etc)

  • It’s still im­por­tant to get to the truth, in­clud­ing truths about ad­ver­sar­ial pat­terns. We should be do­ing this by think­ing about what norms get at these truths with min­i­mum harm caused along the way.

Here’s a sum­mary of what I think (writ­ten be­fore I sum­ma­rized what you thought):

  • The fact that $billions are at stake makes reach­ing the truth in pub­lic dis­cus­sions strictly more im­por­tant than for a philos­o­phy club. (After all, these pub­lic dis­cus­sions are af­fect­ing the back­ground facts that pri­vate dis­cus­sions, in­clud­ing ones that dis­tribute large amounts of money, as­sume)

  • The fact that $billions are at stake in­creases the like­li­hood of obfus­ca­tory ac­tion com­pared to in a philos­o­phy club.

  • The “level one” thing to do is to keep us­ing philos­o­phy club norms, like old-LessWrong. Give rea­sons for think­ing what you think. Don’t make ap­peals to con­se­quences or shut peo­ple up for say­ing in­con­ve­nient things; ar­gue at the ob­ject level. Don’t in­sult peo­ple. If you’re too sen­si­tive to hear the truth, that’s for the most part your prob­lem, with some ex­cep­tions (e.g. some per­sonal in­sults). Mostly don’t ar­gue about whether the other peo­ple are bi­ased/​ad­ver­sar­ial, and in­stead make good ob­ject-level ar­gu­ments (this could be stated some­what mis­lead­ingly as “as­sume good faith”). Have pub­lic de­bates, pos­si­bly with mod­er­a­tors.

  • A prob­lem with “level one” norms is that they rarely talk about obfus­ca­tory ac­tion. “As­sume good faith”, taken liter­ally, im­plies obfus­ca­tion isn’t hap­pen­ing, which is false given the cir­cum­stances (in­clud­ing mon­e­tary in­cen­tives). Philos­o­phy club norms have some se­cu­rity flaws.

  • The “level two” thing to do is to ex­tend philos­o­phy club norms to han­dle dis­cus­sion of ad­ver­sar­ial ac­tion. Courts don’t as­sume good faith; it would be trans­par­ently ridicu­lous to do so.

  • Courts blame and dis­pro­por­tionately pun­ish peo­ple. We don’t need to do this here, we need the truth to be re­vealed one way or an­other. Dispro­por­tionate pun­ish­ments make peo­ple re­ally defen­sive and obfus­ca­tory, un­der­stand­ably. (Law fought fraud, and fraud won)

  • So, “level two” should de­velop lan­guage for talk­ing about obfus­ca­tory/​de­struc­tive pat­terns of so­cial ac­tion that doesn’t dis­pro­por­tionately pun­ish peo­ple just for get­ting caught up in them. (Note, there are some “karmic” con­se­quences for get­ting caught up in these dy­nam­ics, like hav­ing the or­ga­ni­za­tion be less effec­tive and get­ting a rep­u­ta­tion for be­ing bad at re­sist­ing so­cial pres­sure, but these are very differ­ent from the dis­pro­por­tionate pun­ish­ments typ­i­cal of the le­gal sys­tem, which pun­ish dis­pro­por­tionately on the as­sump­tion that most crime isn’t caught)

  • I per­ceive a back­slide from “level one” norms, to­wards more diplo­matic norms, where cer­tain things are con­sid­ered “rude” to say and are “at­tack­ing peo­ple”, even if they’d be ac­cepted in philos­o­phy club. I think this is about main­tain­ing power ille­gi­t­i­mately.

Here are more points that I thought of af­ter sum­ma­riz­ing your po­si­tion:

  • I ac­tu­ally agree that in­di­vi­d­u­als should be us­ing their dis­cern­ment about how and when to be mak­ing crit­i­cisms, given the poli­ti­cal situ­a­tion.

  • I worry that say­ing cer­tain ways of mak­ing crit­i­cisms are good/​bad re­sults in peo­ple get­ting silenced/​blamed even when they’re say­ing true things, which is re­ally bad.

  • So I’m tempted to ar­gue that the norms for pub­lic dis­cus­sion should be ap­prox­i­mately “that which can be de­stroyed by the truth should be”, with some level of pri­vacy and po­lite­ness norms, the kind you’d have in a com­bi­na­tion of a philos­o­phy club and a court.

  • That said, there’s still a com­pli­cated ques­tion of “how do you make crit­i­cisms well”. I think ad­vice on this is im­por­tant. I think the cor­rect ad­vice usu­ally looks more like ad­vice to whistle­blow­ers than ad­vice for diplo­macy.

Note, my opinion of your opinions, and my opinions, are ex­pressed in pretty differ­ent on­tolo­gies. What are the cruxes?

Sup­pose fu­ture-me tells me that I’m pretty wrong, and ac­tu­ally I’m go­ing about do­ing crit­i­cisms the wrong way, and ad­vo­cat­ing bad norms for crit­i­cism, rel­a­tive to you. Here are the ex­pla­na­tions I come up with:

  • “Scis­sor state­ments” are ac­tu­ally a huge risk. Make sure to prove the thing pretty defini­tively, or there will be a bunch of com­mu­nity splits that make dis­cus­sion and co­op­er­a­tion harder. Yes, this means peo­ple are get­ting de­ceived in the mean­time, and you can’t stop that with­out caus­ing worse bad con­se­quences. Yes, this means group episte­mol­ogy is re­ally bad (re­sem­bling mob be­hav­ior), but you should try up­grad­ing that a differ­ent way.

  • You’re us­ing lan­guage that im­plies court norms, but courts dis­pro­por­tionately pun­ish peo­ple. This lan­guage is go­ing to in­crease obfus­ca­tory be­hav­ior way more than it’s worth, and pos­si­bly re­sult in dis­pro­por­tionate pun­ish­ments. You should try re­ally, re­ally hard to de­velop differ­ent lan­guage. (Yes, this means some sac­ri­fice in how clear things can be and how much mo­men­tum your re­form move­ment can sus­tain)

  • Peo­ple say­ing crit­i­cal things about each other in pub­lic (in­clud­ing not-very-blamey things like “I think there’s a dis­tor­tionary dy­namic you’re get­ting caught up in”) looks re­ally bad in a way that de­ter­minis­ti­cally makes pow­er­ful peo­ple, in­clud­ing just about ev­ery­one with money, stop listen­ing to you or giv­ing you money. Even if you get a true dis­course go­ing, the com­mu­nity’s rep­u­ta­tion will be tar­nished by the jus­tice pro­cess that led to that, in a way that locks the com­mu­nity out of power in­definitely. That’s prob­a­bly not worth it, you should try an­other ap­proach that lets peo­ple save face.

  • Ac­tu­ally, you don’t need to be do­ing pub­lic writ­ing/​crit­i­cism very much at all, peo­ple are perfectly will­ing to listen to you in pri­vate, you just have to use this strat­egy that you’re not already us­ing.

Th­ese are all pretty cruxy; none of them seem likely (though they’re all plau­si­ble), and if I were con­vinced of any of them, I’d change my other be­liefs and my over­all ap­proach.

There are a lot of sub­tleties here. I’m up for hav­ing in-per­son con­ver­sa­tions if you think that would help (recorded /​ writ­ten up or not).

My final response in that thread:

This is an awe­some com­ment on many di­men­sions, thanks. I both agree with your sum­mary of my po­si­tion, and I think your cruxes are pretty similar to my cruxes.

There are a few ad­di­tional con­sid­er­a­tions of mine which I’ll list, fol­lowed by at­tempt­ing to tease out some deeper cruxes of mine about “what facts would have to be true for me to want to back­prop­a­gate the level of fear it seems like you feel into my aes­thetic judg­ment.” [This is a par­tic­u­lar metaframe I’m cur­rently ex­plor­ing]

[Edit: turned out to be more than a few straight­for­ward as­sump­tions, and I haven’t got­ten to the aes­thetic or on­tol­ogy cruxes yet]

Ad­di­tional con­sid­er­a­tions from my own be­liefs:

  • I define clar­ity in terms of what gets un­der­stood, rather than what gets said. So, us­ing words with non-stan­dard con­no­ta­tions, with­out do­ing a lot of up-front work to re­define your terms, seems to me to be re­duc­ing clar­ity, and/​or mix­ing clar­ity, rather than im­prov­ing it.

  • I think it’s especially worthwhile to develop non-court language, for public discourse, if your intent is not to be punitive – repurposing court language for non-punitive action is particularly confusing. The first definition for “fraud” that comes up on Google is “wrongful or criminal deception intended to result in financial or personal gain”. The connotation I associate it with is “the kind of lying you pay fines or go to jail for or get identified as a criminal for”.

  • By de­fault, lan­guage-pro­cess­ing is a mix­ture of truth­seek­ing and politick­ing. The more poli­ti­cal a con­ver­sa­tion feels, the harder it will be for peo­ple to re­main in truth­seek­ing mode. I see the pri­mary goal of a ra­tio­nal­ist/​truth­seek­ing space to be to en­sure peo­ple re­main in truth­seek­ing mode. I don’t think this is com­pletely nec­es­sary but I do think it makes the space much more effec­tive (in terms of time spent get­ting points across).

  • I think it’s very important for language re: how-to-do-politics-while-truthseeking to be created separately from any live politics – otherwise, one of the first things that’ll happen is the language gets co-opted and distorted by the political process. People are right/just to fear you developing political language if you appear to be actively trying to wield political weapons against people while you develop it.

  • Fact that is (quite plau­si­bly) my true re­jec­tion – Highly tense con­ver­sa­tions that I get defen­sive at are among the most stress­ful things I ex­pe­rience, which crip­ple my abil­ity to sleep well while do­ing them. This is high enough cost that if I had to do it all the time, I would prob­a­bly just tune them out.

  • This is a selfish perspective, and I should perhaps be quite suspicious of the rest of my arguments in light of it. But it’s not obviously wrong to me in the first place – having stressful weeks of sleep wrecked is really bad. When I imagine a world where people are criticizing me all the time [in particular when they’re misunderstanding my frame, see below about deep model differences], it’s not at all obvious that the net benefit I or the community gets from people getting to express their criticism more easily outweighs the cost in productivity (which would, among other things, be spent on other truthseeking pursuits). When I imagine this multiplied across all orgs it’s not very surprising or unreasonable seeming for people to have learned to tune out criticism.

  • Single Most Important Belief that I endorse – I think trying to develop a language for truthseeking-politics (or politics-adjacent stuff) could potentially permanently destroy the ability for a given space to do politics sanely. It’s possible to do it right, but also very easy to fuck up, and instead of properly transmitting truthseeking-into-politics, politics backpropagates into truthseeking, causing people to view truthseeking norms as a political weapon. I think this is basically what happened with the American Right Wing and their view of science (and I think things like the March for Science are harmful because they exacerbate Science as Politics).

  • In the same way that it’s bad to tell a lie, to ac­com­plish some lo­cally good thing (be­cause the dam­age you do to the ecosys­tem is far worse than what­ever lo­cally good thing you ac­com­plished), I think it is bad to try to in­vent truth­seek­ing-poli­tics-on-the-fly with­out ex­plain­ing well what you are do­ing while also mak­ing claims that peo­ple are (rightly) wor­ried will cost them mil­lions of dol­lars. What­ever lo­cal truth you’re out­putting is much less valuable than the risks you are play­ing with re: the pub­lic com­mons of “abil­ity to ever dis­cuss poli­tics sanely.”

  • I re­ally wish we had de­vel­oped good tools to dis­cuss poli­tics sanely be­fore we got ac­cess to billions of dol­lars. That was an un­der­stand­able mis­take (I didn’t think about it un­til just this sec­ond), but it prob­a­bly cost us deeply. Given that we didn’t, I think cre­at­ing good norms re­quires much more costly sig­nal­ing of good faith (on ev­ery­one’s part) than it might have needed. [this para­graph is all weak con­fi­dence since I just thought of it but feels pretty true to me]

  • People have deep models, in which certain things seem obvious to them that are not obvious to others. I think I drastically disagree with you about what your prior should be that “Bob has a non-motivated deep model (or, not any more motivated than average) that you don’t understand”, rather than “Bob’s opinion or his model is different/frightening because he is motivated, deceptive and/or non-truth-tracking.”

  • My im­pres­sion is that ev­ery­one with a deep, weird model that I’ve en­coun­tered was overly bi­ased in fa­vor of their deep model (in­clud­ing you and Ben), but this seems suffi­ciently ex­plained by “when you fo­cus all your at­ten­tion on one par­tic­u­lar facet of re­al­ity, that facet looms much larger in your think­ing, and other facets loom less large”, with some amount of “their per­son­al­ity or cir­cum­stance bi­ased them to­wards their model” (but, not to a de­gree that seems par­tic­u­larly weird or alarm­ing).

  • See­ing “true re­al­ity” in­volves learn­ing lots of deep mod­els into nar­row do­mains and then let­ting them set­tle.

  • [For con­text/​frame, re­mem­ber that it took Eliezer 2 years of blog­ging ev­ery day to get ev­ery­one up to speed on how to think in his frame. That’s roughly the or­der-of-mag­ni­tude of effort that seems like you should ex­pect to ex­pend to ex­plain a coun­ter­in­tu­itive wor­ld­view to peo­ple]

  • In particular, a lot of the things that seem alarming to you (like, Givewell’s use of numbers that seem wrong) are pretty well (but not completely) explained by “it’s actually very counterintuitive to have the opinions you do about what reasonable numbers are.” I have updated more towards your view on the matter, but a) it took me a couple years, b) it still doesn’t seem very obvious to me. Drowning-Children-are-Rare is a plausible hypothesis but doesn’t seem so overdetermined that anyone who thinks otherwise must be deeply motivated or deceptive.

  • I’m not say­ing this ap­plies across the board. I can think of sev­eral peo­ple in EA or ra­tio­nal­ist space who seem mo­ti­vated in im­por­tant ways. My sense of deep mod­els speci­fi­cally comes from the com­bi­na­tion of “the deep model is pre­sented to me when I in­quire about it, and makes sense”, and “they have given enough costly sig­nals of trust­wor­thi­ness that I’m will­ing to give them the benefit of the doubt.”

  • I have up­dated over the past cou­ple years on how bad “PR man­age­ment” and diplo­macy are for your abil­ity to think, and I ap­pre­ci­ate the cost a bit more, but it still seems less than the penalties you get for truth­seek­ing when peo­ple feel un­safe.

  • I have (low confidence) models that seem fairly different from Ben’s (and I assume your) model of what exactly early LessWrong was like, and what happened to it. This is complicated and I think beyond scope for this comment.

  • Un­known Un­knowns, and model-un­cer­tainty. I’m not ac­tu­ally that wor­ried about scis­sor-at­tacks, and I’m not sure how con­fi­dent I am about many of the pre­vi­ous mod­els. But they are all wor­ri­some enough that I think cau­tion is war­ranted.

“Reg­u­lar” Cruxes

Many of the above bul­let-points are cruxy and sug­gest nat­u­ral crux-re­frames. I’m go­ing to go into some de­tail for a few:

  • I could imag­ine learn­ing that my pri­ors on “deep model di­ver­gence” vs “nope, they’re just re­ally de­cep­tive” are wrong. I don’t ac­tu­ally have all that many data points to have longterm con­fi­dence here. It’s just that so far, most of the smok­ing guns that have been pre­sented to me didn’t seem very defini­tive.

  • The con­crete ob­ser­va­tions that would shift this are “at least one of the peo­ple that I have trusted turns out to have a smok­ing gun that makes me think their deep model was highly mo­ti­vated” [I will try to think pri­vately about what con­crete ex­am­ples of this might be, to avoid a thing where I con­fab­u­late jus­tifi­ca­tions in re­al­time.]

  • It might be a lot eas­ier than I think to cre­ate a pub­lic truth­seek­ing space that re­mains sane in the face of money and poli­tics. Re­lat­edly, I might be overly wor­ried about the risk of de­stroy­ing longterm abil­ity to talk-about-poli­tics-sanely.

  • If I saw an ex­ist­ing com­mu­nity that op­er­ated on a pub­lic fo­rum and on­boarded new peo­ple all the time, which had the norms you are ad­vo­cat­ing, and in­ter­view­ing var­i­ous peo­ple in­volved seemed to sug­gest it was work­ing sanely, I’d up­date. I’m not sure if there are eas­ier bits of ev­i­dence to find.

  • The costs that come from diplo­macy might be higher than the costs of defen­sive­ness.

  • Habryka has described experiences where diplomacy/PR-concerns seemed bad-for-his-soul in various ways. [not 100% sure this is quite the right characterization but seems about right]. I think so far I haven’t really been “playing on hard mode” in this domain, and I think there’s a decent chance that I will be over the next few years. I could imagine updating about how badly diplomacy cripples thought after having that experience, and finding that its cost turns out to be greater than the cost of defensiveness.

  • I might be the only per­son that suffers from sleep loss or other stress-side-effects as badly as I do.

These were the easier ones. I’m trying to think through the “ontology doublecrux” thing and think about what sorts of things would change my ontology. That may take another while.

Crit­i­cism != Ac­cu­sa­tion of Wrongdoing

Later on, dur­ing an in-per­son con­ver­sa­tion with Jes­sica, some­one else (leav­ing them anony­mous) pointed out an ad­di­tional con­sid­er­a­tion, which is that crit­i­cism isn’t the same as ac­cu­sa­tions.

[I’m not sure I fully un­der­stood the origi­nal ver­sion of this point, so the fol­low­ing is just me speak­ing for my­self about things I be­lieve]

There’s an important social technology, which is to have norms that people roughly agree on. The costs of everyone having to figure out their own norms are enormous. So most communities have at least some basic things that you don’t do (such as blatantly lying).

Sev­eral im­por­tant prop­er­ties here are:

  • You can os­tra­cize peo­ple who con­tin­u­ously vi­o­late norms.

  • If some­one ac­cuses you of a norm vi­o­la­tion, you feel obli­gated to defend your­self. (Which is very differ­ent from get­ting crit­i­cized for some­thing that’s not a norm vi­o­la­tion)

  • If Alice makes an accusation of someone violating norms, and that accusation turns out to be exaggerated or ill-founded, then Alice loses points, and people are less quick to believe her or give her a platform to speak next time.

I think one as­pect of the deep dis­agree­ments go­ing on here is some­thing like “what ex­actly are the costs of ev­ery­one hav­ing to de­velop their own the­ory of good­ness”, and/​or what are the benefits of the “there are norms, that get en­forced and defended” model.

I un­der­stand Ben­quo and Jes­sica are ar­gu­ing that we do not in fact have such norms, we just have the illu­sion of such norms, and in fact what we have are weird poli­ti­cal games that benefit the pow­er­ful. And they see their ap­proach as helping to dis­pel that illu­sion.

Whereas I think we do in fact have those norms – there’s a degree of lying that would get you expelled from the rationalsphere and EAsphere, and this is important. And so insisting on being able to discuss, in public, whether Bob lied [a norm violation], while claiming that this is not an attack on Bob, just an earnest discussion of the truth or model-building of adversarial discourse… is degrading not only the specific norm of “don’t lie” but also “our general ability to have norms.”

My cur­rent state

I’m cur­rently in the pro­cess of mul­ling this all over. The high level ques­tions are some­thing like:

  • [Within my current ontology] What sorts of actions by EA leaders would shift my position from “right now we actually have a reasonably good foundation of trustworthiness” to “things are not okay, to the point where it makes more sense to kick the game board over rather than improve things”? Or, alternately, to “things are not okay, and I need to revise my ontology in order to account for it”?

  • How ex­actly would/​should I shift my on­tol­ogy if things were suffi­ciently bad?

I ex­pect this to be a fairly lengthy pro­cess, and re­quire a fair amount of back­ground pro­cess­ing.

There are other things I’m con­sid­er­ing here, and writ­ing them up turned out to take more time than I have at the mo­ment. Will hope­fully have a Pt 2 of this post.