Speaking of Stag Hunts

This is an essay about the current state of the LessWrong community, and the broader EA/rationalist/longtermist communities that it overlaps and bridges, inspired mostly by the dynamics around these three posts. The concepts and claims laid out in Concentration of Force, which was originally written as part one of this essay, are important context for the thoughts below.


Summary/thesis, mostly cribbed from user anon03's comment below: In many high-importance and high-emotion discussions on LessWrong, the comments and vote distribution seem very soldier-mindset instead of scout-mindset, and the overall soundness and carefulness of reasoning and discourse seems to me to be much lower than baseline, which already felt a smidge too low. This seems to indicate a failure of the LW community further up the chain (i.e. is a result of a problem, not the problem itself) and I think we should put forth real effort to fix it, and I think the most likely target is something like a more-consistent embrace and enforcement of some very basic rationality discourse norms.


(And somewhere in the back of his mind was a small, small note of confusion, a sense of something wrong about that story; and it should have been a part of Harry’s art to notice that tiny note, but he was distracted. For it is a sad rule that whenever you are most in need of your art as a rationalist, that is when you are most likely to forget it.)


I claim that something has gone a little bit wrong.

And as readers of many of my other essays know, I claim that things going a little bit wrong is often actually quite a big problem.

I am not alone in thinking that the small scale matters. Tiny mental flinches, itty bitty little incentives, things thrown ever so slightly off course (and then never brought back). That small things often have outsized or cumulative effects is a popular view, either explicitly stated or discernible as an underlying assumption in the writings of Eliezer Yudkowsky, Nate Soares, Logan Brienne Strohl, Scott Alexander, Anna Salamon, and Andrew Critch, just to name a few.

Yet I nevertheless feel that I encounter resistance of various forms when attempting to point at small things as if they are important. Resistance rather than cooperative disagreement—impatience, dismissal, often condescension or sneering, sometimes projection and strawmanning.

This is absolutely at least in part due to my own clumsiness and confusion. A better version of me, more skilled at communication and empathy and bridging inferential gaps, would undoubtedly run into these problems less. Would better be able to recruit people’s general enthusiasm for even rather dull and tedious and unsexy work, on that split-second level.

But it seems to me that I can’t locate the problem entirely within myself. That there’s something out there that’s Actually Broken, and that it fights back, at least a little bit, when I try to point at it and fix it.

Here’s to taking another shot at it.


Below is a non-exhaustive list of things which my brain will tend to do, if I don’t put forth strategic effort to stop it:

  • Make no attempt to distinguish between what it feels is true and what is reasonable to believe.

  • Make no attempt to distinguish between what it feels is good and what is actually good.

  • Make wildly overconfident assertions that it doesn’t even believe (that it will e.g. abandon immediately if forced to make a bet).

  • Weaponize equivocation and maximize plausible deniability à la motte-and-bailey, squeezing the maximum amount of wiggle room out of words and phrases. Say things that it knows will be interpreted a certain way, while knowing that they can be defended as if they meant something more innocent.

  • Neglect the difference between what things look like and what they actually are; fail to retain any skepticism on behalf of the possibility that I might be deceived by surface resemblance.

  • Treat a 70% probability of innocence and a 30% probability of guilt as a 100% chance that the person is 30% guilty (i.e. kinda guilty).

  • Wantonly project or otherwise read into people’s actions and statements; evaluate those actions and statements by asking “what would have to be true inside my head, for me to output this behavior?” and then just assume that that’s what’s going on for them.

  • Pretend that it is speaking directly to a specific person while secretly spending the majority of its attention and optimization power on playing to some imagined larger audience.

  • Generate interventions that will make me feel better, regardless of whether or not they’ll solve the problem (and regardless of whether or not there even is a real problem to be solved, versus an ungrounded anxiety/imaginary injury).

I actually tend not to do these things. I do them fairly rarely, and ever more rarely as time goes on and I improve my cognitive software and add to my catalogue of mental checks and balances and discover more of my brain’s loopholes and close them up one by one.

But it’s work. It’s hard work. It takes up a nontrivial portion of my available spoons every single day.

And more—it requires me to leave value on the table. To not-win fights that I could have won, had I been more willing to kick below the belt. There’s a reason human brains slide toward those shortcuts, and it’s because those shortcuts tend to work.

But the cost—

My brain does not understand the costs, most of which are distant and abstract. My brain was not evolved to understand, on an intuitive, reflexive level, things like:

  • The staggeringly high potential value of the gradual accumulation of truth

  • The staggeringly high potential value of expanding my circle of cooperation

  • The staggeringly high potential value of a world in which people are not constantly subject to having their thoughts and emotions yanked around and manipulated against their will and without their knowledge or consent

All it sees is a chance to win.

“Harry,” whispered Dumbledore, “phoenixes do not understand how winning a battle can lose a war.” Tears were streaming down the old wizard’s cheeks, dripping into his silver beard. “The battle is all they know. They are good, but not wise. That is why they choose wizards to be their masters.”

My brain is a phoenix. It sees ways to win the immediate, local confrontation, and does not understand what it would be sacrificing to secure that victory.


I spend a lot of time around people who are not as smart as me.

(This is a rude statement, but it’s one I’m willing to spend the points to make. It’s right that we generally dock people points for rudeness; rudeness tracks a real and important set of things and our heuristics for dealing with it are actually pretty decent. I hereby acknowledge and accept the regrettable cost of my action.)

I spend a lot of time around people who are not as smart as me, and I also spend a lot of time around people who are as smart as me (or smarter), but who are not as conscientious, and I also spend a lot of time around people who are as smart or smarter and as conscientious or conscientiouser, but who do not have my particular pseudo-autistic special interest and have therefore not spent the better part of the past two decades enthusiastically gathering observations and spinning up models of what happens when you collide a bunch of monkey brains under various conditions.

(Repetition is a hell of a drug.)

All of which is to say that I spend a decent chunk of the time being the guy in the room who is most aware of the fuckery swirling around me, and therefore the guy who is most bothered by it. It’s like being a native French speaker and dropping in on a high school French class in a South Carolina public school, or like being someone who just learned how to tell good kerning from bad keming. I spend a lot of time wincing, and I spend a lot of time not being able to fix The Thing That’s Happening because the inferential gaps are so large that I’d have to lay down an hour’s worth of context just to give the other people the capacity to notice that something is going sideways.

(Note: often, what it feels like from the inside when you are incapable of parsing some particular distinction is that the other person has a baffling and nonsensical preference between two things that are essentially indistinguishable. To someone with colorblindness, there’s just no difference between those two shades. Sometimes, when you think someone is making a mountain out of a molehill, they are in fact making a mountain out of a molehill. But sometimes there’s a mountain there, and it’s kind of wild that you can’t see it. It’s wise to keep this possibility in mind.)


I don’t like the fact that my brain undermines my ability to see and think clearly, if I lose focus for a minute.

I don’t like the fact that my brain undermines other people’s ability to see and think clearly, if I lose focus for a minute.

I don’t like the fact that, much of the time, I’m all on my own to maintain focus, and keep my eye on these problems, and notice them and nip them in the bud.

I’d really like it if I were embedded in a supportive ecosystem. If there were clear, immediate, and reliable incentives for doing it right, and clear, immediate, and reliable disincentives for doing it wrong. If there were actual norms (as opposed to nominal ones, norms-in-name-only) that gave me hints and guidance and encouragement. If there were dozens or even hundreds of people around, such that I could be confident that, when I lose focus for a minute, someone else will catch me.

Catch me, and set me straight.

Because I want to be set straight.

Because I actually care about what’s real, and what’s true, and what’s justified, and what’s rational, even though my brain is only kinda-sorta halfway on board, and keeps thinking that the right thing to do is Win.

Sometimes, when people catch me, I wince, and sometimes, I get grumpy, because I’m working with a pretty crappy OS, here. But I try to get past the wince as quickly as possible, and I try to say “thank you,” and I try to make it clear that I mean it, because honestly, the people that catch me are on my side. They are helping me live up to a value that I hold in my own heart, even though I don’t always succeed in embodying it.

I like it when people save me from the mistakes I listed above. I genuinely like it, even if sometimes it takes my brain a moment to catch up.


I’ve got a handful of metaphors that are trying to triangulate something important.

One of them is “herd immunity.” In particular, those nifty side-by-side time lapses that show the progression of virulent illness in populations with different rates of vaccination or immunity. The way that the badness will spread and spread and spread when only half the population is inoculated, but fizzle almost instantly when 90+% is.

If it’s safe to assume that most people’s brains are throwing up the bad stuff at least as often as mine does, then it seems to matter a lot how infect-able the people around you are. How quickly their immune systems kick in, before the falsehoods take root and replicate and spread.

And speaking of immune systems, another metaphor is “epistemic hygiene.” There’s a reason that phrase exists. It exists because washing your hands and wearing a mask and using disinfectant and coughing into your elbow makes a difference. Cleaner people get sick less, and propagate sickness less, and cleanliness is made up of a bunch of tiny, pre-emptive actions.

I have no doubt that you would be bored senseless by therapy, the same way I’m bored when I brush my teeth and wipe my ass, because the thing about repairing, maintaining, and cleaning is: it’s not an adventure. There’s no way to do it so wrong you might die. It’s just work, and the bottom line is, some people are okay going to work, and some people—

Well, some people would rather die. Each of us gets to choose.

(There was a decent chance that there was going to be someone in the comments using the fact that this essay contains a Rick & Morty quote to delegitimize me and the point that I’m making, but then I wrote this sentence and that became a harder trick to pull off. Not impossible, though.)

Another metaphor is that of a garden.

You know what makes a garden?

Weeding.

Gardens aren’t just about the thriving of the desired plants. They’re also about the non-thriving of the non-desired plants.

And weeding is hard work, and it’s boring, and it’s tedious, and it’s unsexy.


What I’m getting out of LessWrong these days is readership. It’s a great place to come and share my thoughts, and have them be seen by people—smart and perceptive people, for the most part, who will take those thoughts seriously, and supply me with new thoughts in return, many of which I honestly wouldn’t have ever come to on my own.

That’s valuable.

But it’s not what I really want from LessWrong.

What I really want from LessWrong is to make my own thinking better, moment to moment. To be embedded in a context that evokes clearer thinking, the way being in a library evokes whispers. To be embedded in a context that anti-evokes all those things my brain keeps trying to do, the way being in a church anti-evokes coarse language.

I’d like an environment that takes seriously the fact that the little things matter, and that understands that standards and principles that are only enforced 90% of the time aren’t actually enforced.

I think LessWrong actually does a pretty good job of toeing the rationality line, and following its own advice, if you take the sum total of all of its conversations.

But if you look at the conversations that matter—the times when a dose of discipline is most sorely needed, and when its absence will do the most damage—

In the big, important conversations, the ones with big stakes, the ones where emotions run high—

I don’t think LessWrong, as a community, does very well in those conversations at all. When the going gets tough, the number of people who are steadfastly unwilling to let their brains do the things, and steadfastly insistent that others not get away with it either feels like it dwindles to almost nothing, and as a result, the entirely predictable thing happens: people start using symmetric weapons, and they work.

(I set aside a few minutes to go grab some examples—not an exhaustive search, just a quick skim. There’s the total vote count on this comment compared to these two, and the fact that it took nearly three weeks for a comment like this one to appear, and the fact that this is in negative territory, and this comment chain which I discussed in detail in another recent post, and this and its child being positive while this and this hover around zero, and this still not having incorporated the extremely relevant context provided in this, and therefore still being misleading to anyone who doesn’t get around to the comments, and the lack of concrete substantiation of the most radioactive parts of this, and so on and so forth.)

To be clear: there are also many examples of the thing going well. If you count up from nothing, and just note all the places where LessWrong handled these conversations better than genpop, there are many! More, even, than what I’m highlighting as the bad stuff.

But gardens aren’t just about the thriving of the desired plants. They’re also about the non-thriving of the non-desired plants.

There’s a difference between “there are many black ravens” and “we’ve successfully built an environment with no white ravens.” There’s a difference between “this place substantially rewards black ravens” and “this place does not reward white ravens; it imposes costs upon them.” It should be possible—no, it should be easy to have a conversation about whether the incidence of white ravens has been sufficiently reduced, separate from the question of the total incidence of black ravens, and to debate what the ratio of white ravens to black ravens needs to be, and how long a white raven should hang out before being chased away, and what it would cost to do things differently, and whether that’s worth it, and I notice that this very sentence is becoming pretty defensive, and is emerging in response to past experiences, and a strong expectation that my attempt at nuance and specificity is likely to fail, because the culture does not sufficiently disincentivize projection and strawmanning and misrepresentation, and so attempts-to-be-clear cannot simply be offhand but must be preemptively fortified and made proof against adversarial interpretation and geez, this kind of sucks, no?

In Concentration of Force, which was originally part one of this essay, I mention the process of evaporative cooling, and I want to ask: who is being evaporatively cooled out of LessWrong these days, and is that the feedback loop we want to set up?

I think it isn’t. I think that a certain kind of person—

(one who buys that it’s important to stick to the rationality 101 basics even when it’s inconvenient, and that even a small percentage of slips along this axis is a pretty big deal)

—is becoming less prevalent on LessWrong, and a certain other kind of person—

(one who doesn’t buy the claim that consistency-on-the-small-stuff matters a lot, and/​or thinks that there are other higher goals that supersede approximately-never-letting-the-standards-slip)

—is becoming more prevalent, and while I have nothing against the latter in general, I really thought LessWrong was for the former.


Here’s my vision of LessWrong:

LessWrong should be a place where rationality has reliable concentration of force.

Where rhetorical trickery does not work. Where supposition does not get mistaken for fact. Where people’s words are treated as if they mean what they say, and if there seems to be another layer of implication or inference, that is immediately surfaced and made explicit so the hypothesis can be checked, rather than the assumption run with. Where we are both capable of distinguishing, and careful to distinguish, our interpretations from our observations, and our plausible hypotheses from our justified conclusions. Where we hold each other to that standard, and receive others holding us to that standard as prosocial and cooperative, because we want help holding the line. Where bad commentary is not highly upvoted just because our monkey brains are cheering, and good commentary is not downvoted or ignored just because our monkey brains boo or are bored.

Perhaps most importantly, where none of the above is left on the level of “c’mon, we all know.” Where bad stuff doesn’t go unmentioned because it’s just assumed that everyone knows it’s bad. That just results in newcomers not knowing the deal, and ultimately means the standards erode over time.

(A standard people are hesitant or embarrassed or tentative about supporting, or that isn’t seen as cool or sophisticated to underline, is not one that endures for very long.)

“Professor Quirrell,” said Harry gravely, “all the Muggle-raised students in Hogwarts need a safety lecture in which they are told the things so ridiculously obvious that no wizardborn would ever think to mention them. Don’t cast curses if you don’t know what they do, if you discover something dangerous don’t tell the world about it, don’t brew high-level potions without supervision in a bathroom, the reason why there are underage magic laws, all the basics.”

I spend a decent chunk of my time doing stuff like upvoting comments that are mostly good, but noting in reply to them specific places in which I think they were bad or confused or norm-violating. I do this so that I don’t accidentally create a social motte-and-bailey, and erode the median user’s ability to tell good from bad.

This is effortful work. I wish more people pitched in, more of the time, the way this user did here and here and here and here.

In my opinion, the archetype of the Most Dangerous Comment is something like this one:

One of the things that can feel like gaslighting in a community that attracts highly scrupulous people is when posting about your interpretation of your experience is treated as a contractual obligation to defend the claims and discuss any possible misinterpretations or consequences of what is a challenging thing to write in the first place.

This is a bad comment (in context, given what it’s replying to). It’s the kind of thing my brain produces, when I lose focus for a minute.

But it sounds good. It makes you Feel Like You’re On The Right Team as you read it, so long as you’re willing to overlook the textbook strawmanning it does, of the comment it’s replying to.

It’s a Trojan horse. It’s just such good-thoughts-wrapped-in-bad-forms that people give a pass to, which has the net effect of normalizing bad forms.

It’s when the people we agree with are doing it wrong that we are most in need of standards, firmly held.

(I have a few theories about why people are abandoning or dismissing or undermining the standards, in each of a few categories. Some people, I think, believe that it’s okay to take up banned weapons as long as the person you’re striking at is in the outgroup. Some people seem to think that suffering provides a justification for otherwise unacceptable behavior. Some people seem to think that you can skip steps as long as you’re obviously a good guy, and others seem to think that nuance and detail are themselves signs of some kind of anti-epistemic persuadery. These hypotheses do not exhaust the space of possibility.)

It is an almost trivial claim that there are not enough reasonable people in the world. There literally never will be, from the position of a group that’s pushing for sanity—if the quality of thought and discourse in the general population suddenly rose to match the best of LessWrong, the best of the LessWrongers would immediately set their sights on the next high-water mark, because this sure ain’t enough.

What that means is that, out there in the broader society, rationality will approximately always lose the local confrontations. Battles must be chosen with great care, and the forces of reason meticulously prepared—there will be occasional moments of serendipity when things go well, and the rare hero that successfully speaks a word of sanity and escapes unscathed, but for the most part those victories won’t come by accident.

Here, though—here, within the walls of the garden—

A part of me wants to ask “what’s the garden for, if not that? What precisely are the walls trying to keep out?”

In my post on moderating LessWrong, I set forth the following principle:

In no small part, the duty of the moderation team is to ensure that no LessWronger who’s trying to adhere to the site’s principles is ever alone, when standing their ground against another user (or a mob of users) who isn’t.

I no longer think that’s sufficient. There aren’t enough moderators for reliable concentration of force. I think it’s the case that LessWrongers trying to adhere to the site’s principles are often alone—and furthermore, they have no real reason, given the current state of affairs, to expect not to be alone.

Sometimes, people show up. Often, if you look on a timescale of days or weeks. But not always. Not quickly. Not reliably.

(And damage done in the meantime is rarely fully repaired. If someone has broadcast falsehoods for a week and they’ve been strongly upvoted, it’s not enough to just say “Oops, sorry, I was wrong.” That comes nowhere close to fixing what they broke.)

Looking at the commentary in the threads of the last month—looking at the upvotes and the downvotes—looking at what was said, and by whom, and when—

It’s not promising.

It’s not promising in the sense that the people in the parking lot consider it their responsibility to stand in defense of the parent who left their kids in the car.

That they do so reliably, and enthusiastically. That they show up in force. That they show up in force because they expect to be backed up, the way that people in a city expect to be backed up if they push back against someone shouting racist slurs. That they consider themselves obligated to push back, rather than considering it not-their-problem.

It is a defining characteristic of stag hunts that when everybody actually buys in, the payoff is pretty huge.

It is also a defining characteristic of stag hunts that when critical mass fails to cohere, those who chose stag get burned, and feel cheated, and lose big.
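
(A minimal sketch of that payoff structure, with illustrative numbers of my own choosing rather than anything canonical:)

    # Illustrative stag-hunt payoffs; the specific numbers are mine, not canonical.
    STAG_IF_COHERE = 10   # everyone committed: the big payoff
    STAG_IF_ALONE = 0     # you hunted stag, critical mass didn't: you get burned
    RABBIT = 4            # the safe solo option, whatever anyone else does

    def expected_stag(p_cohere):
        """Expected payoff of choosing stag, given P(critical mass coheres)."""
        return p_cohere * STAG_IF_COHERE + (1 - p_cohere) * STAG_IF_ALONE

    for p in (0.2, 0.4, 0.6, 0.8):
        choice = "stag" if expected_stag(p) > RABBIT else "rabbit"
        print(f"P(cohere)={p:.0%}: E[stag]={expected_stag(p):.1f} vs rabbit={RABBIT} -> {choice}")

Under these numbers, stag only beats rabbit once you put better than 40% odds on everyone else actually committing; below that threshold, choosing stag just means losing big.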

This post is nowhere close to being a sufficient coordination mechanism to cohere a stag hunt. No one should change their behavior in response to this alone.

But it’s a call for a stag hunt.


The elephant in the room, which deserves its own full section but which I wasn’t able to pull together:

Standards are not really popular. Most people don’t like them. Or rather, most people like them in the abstract, but chafe when they get in the way, and it’s pretty rare for someone to not think that their personal exception to the standard is more justified than others’ violations. Half the people here, I think, don’t even see the problem that I’m trying to point at. Or they see it, but they don’t see it as a problem.

I think a good chunk of LW’s current membership would leave or go quiet if we actually succeeded at ratcheting the standards up.

I don’t think that’s a bad thing. I’d like to be surrounded by people who are actually trying. And if LW isn’t going to be that place, and it knows that it isn’t, I’d like to know that, so I can go off and found it (or just give up).


Terrible Ideas

… because I don’t have better ones, yet.

The target is “rationality has reliable concentration of force.”

The current assessment is “rationality does not have reliable concentration of force.”

The vector, then, is things which either increase the number of people showing up in true rationalist style, relative to those who are not, or things which increase the power of people adhering to rationalist norms, relative to those who are not.

More of the good thing, and/​or less of the bad thing.

Here are some terrible ideas for moving in that direction—for either increasing or empowering the people who are interested in the idea of a rationality subculture, and decreasing or depowering those who are just here to cargo cult a little.

  • Publish a set of absolute user guidelines (not suggestions) and enforce them without exception instead of enforcing them like speed limits. e.g. any violation from a new user, or any violation from an established user not retracted immediately upon pushback = automatic three-day ban. If there are Special Cool People™ who are above the law, be explicit about that fact in a way that could be made clear to an autistic ten-year-old.

  • Create a pledge similar to (but better and more comprehensive than) this pledge, and require all users to sign in order to be able to post, comment, and vote. Alternately, make such a pledge optional, but allow users who have pledged and are living up to it some kind of greater power—larger vote strength, or the ability to flag comments as yellow or orange (or blue or green!) regardless of their popularity.

  • Hire a team of well-paid moderators for a three-month high-effort experiment of responding to every bad comment with a fixed version of what a good comment making the same point would have looked like. Flood the site with training data.

  • Make a fork of LessWrong run by me, or some other hopeless idealist that still thinks that there might be something actually good that we can get if we actually do the thing (but not if we don’t).

  • Create an anonymous account with special powers called TheCultureCurators or something, and secretly give the login credentials to a small cadre of 3-12 people with good judgment and mutual faith in one another’s good judgment. Give TheCultureCurators the ability to make upvotes and downvotes of arbitrary strength, or to add notes to any comment or post à la Google Docs, or to put a number on any comment or post that indicates what karma TheCultureCurators believe that post should have.

  • Give up, and admit that we’re kinda sorta nominally about clear thinking and good discourse, but not actually/only to the extent that it’s convenient and easy, because either “the community” or “some model of effectiveness” takes priority, and put that admission somewhere that an autistic ten-year-old would see it before getting the wrong idea.

These are all terrible ideas.

These are all

terrible

ideas.

I’m going to say it a third time, because LessWrong is not yet a place where I can rely on my reputation for saying what I actually mean and then expect to be treated as if I meant the thing that I actually said: I recognize that these are terrible ideas.

But you have to start somewhere, if you’re going to get anywhere, and I would like LessWrong to get somewhere other than where it was over the past month. To be the sort of place where doing the depressingly usual human thing doesn’t pay off. Where it’s more costly to do it wrong than to do it right.

Clearly, it’s not going to “just happen.” Clearly, we need something to riff off of.

The guiding light in front of all of those terrible ideas—the thing that each of them is a clumsy and doomed attempt to reach for—is making the thing that makes LessWrong different be “LessWrong is a place where rationality has reliable concentration of force.”

Where rationality is the-thing-that-has-local-superiority-in-most-conflicts. Where the people wielding good discourse norms and good reasoning norms always outnumber the people who aren’t—or, if they can’t outnumber them, we at least equip them well enough that they always outgun them.

Not some crazy high-tower thing. Just the basics, consistently done.

Distinguish inference from observation.

Distinguish feeling from fact.

Expose cruxes, or acknowledge up front that you haven’t found them yet, and that this is kind of a shame.

Don’t weaponize motte-and-bailey equivocation.

Start from a position of charity and good faith, or explain why you can’t in concrete and legible detail. Cooperate past the first apparent “defect” from your interlocutor, because people have bad days and the typical mind fallacy is a hell of a drug, as is the double illusion of transparency.

Don’t respond to someone’s assertion of [A] with “But [B] is abhorrent!” Don’t gloss over the part where your argument depends on the assumption that [A→B].

And most importantly of all: don’t actively resist the attempts of others to do these things, or to remind others to do them. Don’t sneer, don’t belittle, don’t dismiss, don’t take-it-as-an-attack. Act in the fashion of someone who wants to be reminded of such things, even when it’s inconvenient or triggers a negative emotional reaction.

Until users doing the above (and similar) consistently win against users who aren’t, LessWrong is going to miss out on a thing that clearly a lot of us kind of want, and kind of think might actually matter to some other pretty important goals.

Maybe that’s fine. Maybe all we really need is the low-effort rough draft. Maybe the 8020 is actually the right balance. In fact, we’re honestly well past the 80/​20—LessWrong is at least an 8537 by this point.

But it’s not actually doing the thing, and as far as I can tell it’s not really trying to do the thing, either—not on the level of “approximately every individual feels called to put forth a little extra effort, and approximately every individual feels some personal stake when they see the standards being degraded.”

Instead, as a collective, we’ve got one foot on the gas and the other on the brake, and that probably isn’t the best strategy for any worthwhile goal. One way or another, I think we should actually make up our minds, here, and either go out and hunt stag, or split up and catch rabbits.


Author’s note: this essay is not as good as I wished it would be. In particular, it’s falling somewhat short of the very standard it’s pulling for, in a way that I silver-line as “reaching downward across the inferential gap” but which is actually just the result of me not having the spoons to do this or this kind of analysis on each of a dozen different examples. Having spent the past six days improving it, “as good as I wish it would be” is starting to look like an asymptote, so I chose now over never.