# jessicata comments on Comment section from 05/19/2019

• I think my actual concern with this line of argumentation is: if you have a norm of “If ‘X’ and ‘X implies Y’ then ‘Y’, EXCEPT when it’s net bad to have concluded ‘Y’”, then the werewolves win.

The question of whether it’s net bad to have concluded ‘Y’ is much, much more complicated than the question of whether, logically, ‘Y’ is true under these assumptions (of course, it is). There are many, many more opportunities for werewolves to gum up the works of this process, making the calculation come out wrong.

If we’re having a discussion about X and Y, someone moves to propose ‘Y’ (because, as it has already been agreed, ‘X’ and ‘X implies Y’), and then someone else says “no, we can’t do that, that has negative consequences!”, that second person is probably playing a werewolf strategy, gumming up the works of the epistemic substrate.

If we are going to have the exception to the norm at all, then there has to be a pretty high standard of evidence to prove that adding ‘Y’ to the discourse, in fact, has bad consequences. And, to get the right answer, that discussion itself is going to have to be up to high epistemic standards. To be trustworthy, it’s going to have to make logical inferences much more complex than “if ‘X’ and ‘X implies Y’, then ‘Y’”. What if someone objects to those logical inference steps, on the basis that they would have negative consequences? Where does that discussion happen?

In practice, these questions aren’t actually answered. In practice, what happens is that social epistemology doesn’t happen, and instead everything becomes about coalitional politics. Saying ‘Y’ doesn’t mean ‘Y is literally true’; it means you’re part of the coalition of people who want consequences related to (but not even necessarily directly implied by!) the statement ‘Y’ to be put into effect, and that makes you blameworthy if those consequences hurt someone sympathetic, or that coalition is bad. Under such conditions, it is a major challenge to re-establish epistemic discourse, because everything is about violence, including attempts to talk about the “we don’t have epistemology and everything is about violence” problem.

We have something approaching epistemic discourse here on LessWrong, but we have to defend it, or it, too, becomes all about coalitional politics.

• If we are going to have the exception to the norm at all, then there has to be a pretty high standard of evidence to prove that adding ‘Y’ to the discourse, in fact, has bad consequences.

I want to note that LW definitely has exceptions to this norm, if only because of the boring, normal exceptions. (If we would get in trouble with law enforcement for hosting something you might put on LW, don’t put it on LW.) We’ve had in the works (for quite some time) a post explaining our position on less boring cases more clearly, but it runs into difficulty with the sort of issues that you discuss here; generally these questions are answered in private, in a way that connects to the judgment calls being made and the particulars of the case, as opposed to through transparent principles that can be clearly understood and predicted in advance (in part because, to extend the analogy, this empowers the werewolves as well).

• Another common werewolf move is to take advantage of strong norms like epistemic honesty, and use them to drive wedges in a community or push their agenda, while knowing they can’t be called out because doing so would be akin to attacking the community’s norms.

I’ve seen the meme elsewhere in the rationality community that strong and rigid epistemic norms are a good sociopath repellent, and it’s ALMOST right. The truth is that competent sociopaths (in the Venkat Rao sense) are actually great at using rigid norms for their own ends, and are great at using the truth for their own ends as well. The reason it might work well in the rationality community (besides the obvious fact that sociopaths are even better at using lies for their own ends than the truth) is that strong epistemics are very close to what we’re actually fighting for—and remembering and always orienting towards the mission is ACTUALLY an effective first line of defense against sociopaths (necessary but not sufficient IMO).

99 times out of 100, the correct way to remember what we’re fighting for is to push for stronger epistemics above other considerations. I knew that when I made the original post, and I made it knowing I would get pushback for attacking a core value of the community.

However, 1 time out of 100 the correct way to remember what you’re fighting for is to realize that you have to sacrifice a sacred value for the greater good. And when you see someone explicitly pushing the gray area by trying to get you to accept harmful situations by appealing to that sacred value, it’s important to make clear (mostly to other people in the community) that sacrificing that value is an option.

• And when you see someone explicitly pushing the gray area by trying to get you to accept harmful situations by appealing to that sacred value

Um, in context, this sounds to me like you’re arguing that by writing “Where to Draw the Boundaries?” and my secret (“secret”) blog, I’m trying to get people to accept harmful situations? Am I interpreting you correctly? If so, can you explain in detail what specific harm you think is being done?

• Sorry, I was trying to be really careful, as I was writing, not to accuse you specifically of bad intentions, but obviously it’s hard in a conversation like this where you’re jumping between the meta and the object level.

It’s important to distinguish a couple of things.

1. Jessica and I were talking about people with negative intentions in the last two posts. I’m not claiming that you’re one of those people who are deliberately using this type of argument to cause harm.

2. I’m not claiming that the writing of those two posts was harmful in the way we were talking about. I was claiming that the long post you wrote at the top of the thread, where you made several analogies about your response, was exactly the sort of gray-area situation where, depending on context, the community might decide to sacrifice its sacred value. At the same time, you were banking on the fact that it was a sacred value to say “even in this case, we would uphold the sacred value.” This has the same structure as the werewolf move mentioned above, and it was important for me to speak up, even if you’re not a werewolf.

• Thanks for clarifying!

people with negative intentions [...] deliberately

So, it’s actually not clear to me that deliberate negative intentions are particularly important, here or elsewhere? Almost no one thinks of themselves as deliberately causing avoidable harm, and yet avoidable harm gets done, probably by people following incentive gradients that predictably lead towards harm, against truth, &c., all while maintaining a perfectly sincere subjective conscious narrative about how they’re doing God’s work, on the right side of history, toiling for the greater good, doing what needs to be done, maximizing global utility, acting in accordance with the moral law, practicing a virtue which is nameless, &c.

it was important for me to speak up, even if you’re not a werewolf.

Agreed. If I’m causing harm, and you acquire evidence that I’m causing harm, then you should present that evidence in an appropriate venue in order to either persuade me to stop causing harm, or persuade other people to coördinate to stop me from causing harm.

I was claiming that the long post you wrote at the top of the thread, where you made several analogies about your response, was exactly the sort of gray-area situation where, depending on context, the community might decide to sacrifice its sacred value.

So, my current guess (which is only a guess, and which I would have strongly disagreed with ten years ago) is that this is a suicidally terrible idea that will literally destroy the world. Sound like an unreflective appeal to sacred values? Well, maybe!—you shouldn’t take my word for this (or anything else) except to the exact extent that you think my word is Bayesian evidence. Unfortunately, I’m going to need to defer supporting argumentation to future Less Wrong posts, because mental and financial health requirements force me to focus on my dayjob for at least the next few weeks. (Oh, and group theory.)

(End of thread for me.)

• So, it’s actually not clear to me that deliberate negative intentions are particularly important, here or elsewhere?

(Responding, and I don’t expect another response back, because you’re busy.)

I used to think this, but I’ve since realized that intentions STRONGLY matter. It seems like such a system is fractal: the goals of the subparts/subagents get reflected in the goal of the broader system. People with aligned intentions will tend to shift the incentive gradients, as will people with unaligned intentions (of course, this isn’t a one-way relationship; the incentive gradients will also shift the intentions).

• I deny that your approach ever has an advantage over recognizing that definitions are tools which have no truth values, and then digging into goals or desires.

• What specifically do you mean by “werewolf” here & how do you think it relates to the way Jessica was using it? I’m worried that we’re getting close to just redefining it as a generic term for “enemies of the community.”

• By “werewolf” I meant something like “someone who is pretending to be working for the community as a member, but is actually working for their own selfish ends.” I thought Jessica was using it in the same way.

• That’s not what I meant. I meant specifically someone who is trying to prevent common knowledge from being created (and more generally, to gum up the works of “social decisionmaking based on correct information”), as in the Werewolf party game.

• Worth noting: “werewolf” as a jargon term strikes me as something that is inevitably going to get collapsed into “generic bad actor” over time, if it gets used a lot. I’m assuming that you’re thinking of it sort of as in the “preformal” stage, where it doesn’t make sense to over-optimize the terminology. But if you’re going to keep using it, I think it’d make sense to come up with a term that’s somewhat more robust against getting interpreted that way.

(Random default suggestion: “obfuscator.” Other options I came up with required multiple words to get the point across and ended up too convoluted. There might be a fun shorthand for a type of animal or mythological figure that is a) a predator or parasite, and b) relies on making things cloudy. So far I could only come up with “squid,” due to ink jets, but it didn’t really have the right connotations.)

• That is a bit more specific than what I meant. In this case, though, the second, broader meaning of “someone who’s trying to gum up the works of social decisionmaking” still works in the context of the comment.