I’ve only skimmed the post, but I’ve strong-upvoted it for the civil tone. Some EY-critical posts here are written in such an inflammatory manner that they put me off reading them, and even make me suspicious of the epistemics that produced the criticism. In contrast, I really appreciate the ability to write about strong factual disagreements without devolving into name-calling.
Yikes, despite Duncan’s best attempts at disclaimers and clarity and ruling out what he doesn’t mean, he apparently still didn’t manage to communicate the thing he was gesturing at. That’s unfortunate. (And it also makes me worry whether I have understood him correctly.)
I will try to explain some of how I understand Duncan.
I have not read the first Leverage post and so cannot comment on those examples, but I have read jessicata’s MIRI post.
and this still not having incorporated the extremely relevant context provided in this, and therefore still being misleading to anyone who doesn’t get around to the comments, and the lack of concrete substantiation of the most radioactive parts of this, and so on and so forth.
As I understand it: This post criticized MIRI and CFAR by drawing parallels to Zoe Curzi’s experience of Leverage. Having read the former but not the latter, I found the former… not very substantive? Making vague parallels rather than object-level arguments? Merely mirroring the structure of the other post? In any case, there’s a reason why the post sits at 61 karma with 171 votes and 925 comments, and that’s not because it was considered uncontroversially true. Similarly, there’s a reason why Scott Alexander’s comment in response has 362 karma (6x that of the original post; I don’t recall ever seeing anything remotely like that on the site): the information in the original post is incomplete or misleading without this clarification.
The problem at this point is that this ultra-controversial post on LW does not have something like a disclaimer at the top, nor would a casual reader notice that it has lots of downvotes. All the nuance is in the impenetrable comments. So anyone who just reads that post without wading into the comments will get misinformed.
As for the third link in Duncan’s quote, it’s pointing at an anonymous comment supposedly by a former CFAR employee, which was strongly critical of CFAR. But multiple CFAR employees replied and did not share its impressions of their employer. Which would have been a chance for dialogue and truthseeking, except… that anonymous commenter never followed up to reply, so we ended up with a thread of 41 comments which started with those anonymous and unsubstantiated claims and never got a proper resolution (and yet that original comment is strongly upvoted).
Does that make things a bit clearer? In all those cases Duncan (as I understand him) is pointing at things where the LW culture fell far short of optimal; he expects us to do better. (EDIT: Specifically, and to circle back on the Leverage stuff: He expects us to be truthseeking period, to have the same standards of rigor both for critics and defenders, etc. I think he worries that the culture here is currently too happy to upvote anything that’s critical (e.g. to encourage the brave act of speaking out), without extending the same courtesy to those who would speak out in defense of the thing being criticized. Solve for the equilibrium, and the consequences are not good.)
Personally I’m not so sure to what extent “better culture” is the solution (as I am skeptical of the feasibility of anything which requires time and energy and willpower), but I have posted several suggestions for how “better software” could help in specific situations (e.g. mods being able to put a separate disclaimer above sufficiently controversial / disputed posts).
Have you considered putting these on Youtube, just to see what happens?
What a quote:
At the end of Simmons’s unpublished post, he writes, “An influential portion of our literature is effectively a made-up story of human-like creatures who are so malleable that virtually any intervention administered at one point in time can drastically change their behavior.” He adds that a “field cannot reward truth if it does not or cannot decipher it, so it rewards other things instead. Interestingness. Novelty. Speed. Impact. Fantasy. And it effectively punishes the opposite. Intuitive Findings. Incremental Progress. Care. Curiosity. Reality.”
I agree to some extent, but I think it would’ve been much better if you’d posted this on the original post, not on the reply. The current phrasing of “I would like to see these sorts of posts receive substantially less attention.” really doesn’t work well when it’s in a response post, rather than the original post. The current setup makes it sound (unintentionally) like accusation posts are fine, and only the responses should receive less attention, which I doubt anyone endorses.
Also, it’s absolutely silly that this is the top comment on this post. Imagine being on the receiving end of drama, responding to it, and then having the top comment be yours, rather than one which engages with the object-level claims.
So I suggest your comment might be better-suited as a top-level meta post of the form “People spend too much time on community drama” or something, where you’d probably get some interesting back and forth and pushback on that claim, and without taking up oxygen in a post where someone is trying to defend their reputation. If you did make it a full post, you could also do a Fermi estimate of the advantages and disadvantages of engaging in community drama; I’m particularly interested in your estimate, not of participating in the drama, but of posting the drama in the first place.
1) I agree with the spirit of this. To re-quote my comment on Elizabeth’s Butterfly Ideas post:
Another avenue to something related to this concept is Babble and Prune (and a third one is de Bono’s Six Thinking Hats): we have different algorithms for creating vs. criticizing ideas. These algorithms don’t mix well, so if you want to come up with new ideas, it’s better to first generate ideas and only later criticize them. IIRC this is also the advice for group brainstorming.
2) That said, I really don’t like the Socrates analogy here. The loose analogy between execution and moderation seems entirely unnecessary, and sounds like a call for violence. I think that makes the discussion about moderation more emotionally charged, and I don’t see how that helps anyone.
Furthermore, Wikipedia is unclear on whether Socrates’ execution was for political or religious reasons. Insofar as it was for religious reasons, that would make him the victim of a religious dispute, or even a martyr; I think this interpretation works against your thesis.
In 399 BC, Socrates went on trial for corrupting the minds of the youth of Athens, and for impiety… The official charges were: (1) corrupting youth; (2) worshipping false gods; and (3) not worshipping the state religion.
...
The question of what motivated Athenians to convict Socrates remains controversial among scholars. There are two theories. The first is that Socrates was convicted on religious grounds; the second, that he was accused and convicted for political reasons. Another, more recent, interpretation synthesizes the religious and political theories, arguing that religion and state were not separate in ancient Athens.
The argument for religious persecution is supported by the fact that Plato’s and Xenophon’s accounts of the trial mostly focus on the charges of impiety. In those accounts, Socrates is portrayed as making no effort to dispute the fact that he did not believe in the Athenian gods. Against this argument stands the fact that many skeptics and atheist philosophers during this time were not prosecuted. According to the argument for political persecution, Socrates was targeted because he was perceived as a threat to democracy. It was true that Socrates did not stand for democracy during the reign of the Thirty Tyrants and that most of his pupils were against the democrats.
3) Finally, I would like to suggest a norm whereby, if you criticize specific active LW users, you mention that they’re banned from commenting on your posts. (I guess you could alternatively mention that in your commenting guidelines.) And that’s coming from someone who has seen some of the exchanges these users are involved in, and so very much understands why they’re banned.
A few parts of this OP seem in bad faith:
Here he dunks on Metaculus predictors as “excruciatingly predictable” about a weak-AGI question
No, the original Yudkowsky quote is:
To be a slightly better Bayesian is to spend your entire life watching others slowly update in excruciatingly predictable directions that you jumped ahead of 6 years earlier so that your remaining life could be a random epistemic walk like a sane person with self-respect.
I wonder if a Metaculus forecast of “what this forecast will look like in 3 more years” would be saner. Is Metaculus reflective, does it know what it’s doing wrong?
This quote does not insult Metaculus predictors as “excruciatingly predictable”.
It doesn’t call out individual Metaculus predictors at all.
And regarding this:
But Eliezer seems willing to format his message as blatant fearmongering like this. For years he’s been telling people they are doomed, and often suggests they are intellectually flawed if they don’t agree. To me, he doesn’t come across like he’s sparing me an upsetting truth. To me he sounds like he’s catastrophizing, which isn’t what I expect to see in a message tailored for mental health.
If OP had extended the same rigor to their own post that they demand from others, there’s no way this paragraph would have remained in this post. If Yudkowsky’s tame comments are considered “dunking”, what is this supposed to be? Insinuation? Libel?
Has OP considered for the briefest moment that Yudkowsky may have had a different motivation in mind than “blatant fearmongering” when writing the referenced post?
Yeah, I stumbled over the price-gouging example for similar reasons. After two background examples of inconceivable worlds, the world of that story sounded similarly incoherent to me—I could not have written it in 2021.
Mainly, a world where lawmakers frequently ban price-gouging is a world where it’s probably in their interest to do so. So to posit that they ban it because they’re somehow mistaken about the consequences sounds wrong to me.
Rather than the options in the story, in my model they follow Asymmetric Justice, social reality, dysfunctional incentives in bureaucracies, taboo tradeoffs, etc.: voters see an action they don’t like (price-gouging) and respond with outrage, and then lawmakers respond to this outrage by banning the action and getting rewarded by positive press or something. (Whereas if they instead argue against banning the bad action, they’re accused of supporting it.) From the perspective of the lawmakers, it doesn’t matter one bit what happens as a consequence of the ban, because these consequences are in some sense invisible.
For instance, institutions like the FDA provide constant real-life examples of this dynamic, and Zvi’s Covid posts feature multiple such stories every month.
Documents like these seem among the most important ones to get right.
If this is among the first essays a new user is going to see, then remember that they might have little buy-in to the site’s philosophy and don’t know any of the jargon. Moreover, not all users will be native English speakers.
So my recommendations and feedback come from this perspective.
Regarding the writing:
Be more concise. Most LW essays are way way way too long, and an essay as important as a site introduction should strive to be exemplary in this regard. It should value its readers’ time more than the median essay on this site does. (To be clear, this comment of mine does not satisfy this standard either.)
Use simpler language. XKCD made the Simple Writer at one time, which IIRC only uses the 1000 most common English words. That’s overkill, but aim for that end of the spectrum, rather than the jargon end.
Aim for a tone that’s enjoyable to read, rather than sounding dry or technical. Reconsider the title for the same reason; it sounds like a manual.
To make the essay more enjoyable to read, consider writing it with personality and character and your quirks as writers and individuals, and signing it with “By the LW Moderation team: X, Y, Z” or some such.
Regarding the content:
I have the overall impression that this document reads like “Here’s how you can audition for a spot in our prestigious club”. But new users assess the site at the same time as the site assesses them. So a better goal for such a document is, in my opinion, to be more invitational. More like courtship, or advertisement. A more reciprocal relationship. “Here’s what’s cool and unique about this site. If you share these goals or values, then here are some tips so we’ll better get along with each other.”
Also, the initial section makes it seem like LW’s rationality discourse is unique, when it’s merely rare. How about referencing some other communities which also do this well, communities which the new user might already be familiar with, so they know what to expect? E.g. other Internet communities which aim more in the direction of collaborative and truth-seeking discourse like reddit’s ELI5 or Change My View; adjacent communities like Astral Codex Ten; discourse in technical communities like engineers or academics; etc. Also stress that all this stuff is merely aspirational: These standards of discourse are just goals we strive towards, and almost everyone falls short sometimes.
Re: the section “How to get started”: There must be some way for new users to actively participate that does not require hours or days of prep work.
Re: the section “How to ensure your first post or comment is approved”: This currently starts “in medias res”, without properly explaining the context of content moderation or why new users would be subject to extra scrutiny. I would begin with something like a brief reference to the concepts from Well-Kept Gardens Die By Pacifism: LW is aiming for a certain standard of discourse, and standards degrade over time unless they’re intentionally maintained. So the site requires moderation. And just like a new user might be unfamiliar with LW, so LW is unfamiliar with the new user and whether they’re here to participate or to troll or spam (potentially even with AI assistance). Hence the extra scrutiny. “We’re genuinely sorry that we have to put new users through hoops and wish it wasn’t necessary (moderation takes time and effort which we would rather put somewhere else).” Here’s how to get through that initial period of getting to know each other as quickly as possible.
Missing stuff:
Explain the karma system, and what it means for a post to have a lot or a little karma. Explain agreement karma. Explain that votes by long-time users have more karma power. Explain that highly upvoted posts can still be controversial; I wish we had some <controversial> flag for posts that have tons of upvotes and downvotes. Explain the meaning of downvotes, and how (not) to act when one of your posts or comments has received lots of downvotes.
Try to have Multiple Hypotheses
This section is begging for a reference to Duncan’s post on Split and Commit.
IIRC Duncan has also written lots of other stuff on topics like how to assess accusations, community health, etc. Though I’m somewhat skeptical about the extent to which his recommendations can be implemented by fallible humans with limited time and energy.
A Cached Belief
I find this Wired article an important exploration of an enormous wrong cached belief in the medical establishment: namely that based on its size, Covid would be transmitted exclusively via droplets (which quickly fall to the ground), rather than aerosols (which hang in the air). This justified a bunch of extremely costly Covid policy decisions and recommendations: like the endless exhortations to disinfect everything and to wash hands all the time. Or the misguided attempt to protect people from Covid by closing public parks and playgrounds, which pushed people to socialize indoors instead.[1]
According to the medical canon, nearly all respiratory infections transmit through coughs or sneezes: Whenever a sick person hacks, bacteria and viruses spray out like bullets from a gun, quickly falling and sticking to any surface within a blast radius of 3 to 6 feet. If these droplets alight on a nose or mouth (or on a hand that then touches the face), they can cause an infection. Only a few diseases were thought to break this droplet rule. Measles and tuberculosis transmit a different way; they’re described as “airborne.” Those pathogens travel inside aerosols, microscopic particles that can stay suspended for hours and travel longer distances. They can spread when contagious people simply breathe.
The distinction between droplet and airborne transmission has enormous consequences. To combat droplets, a leading precaution is to wash hands frequently with soap and water. To fight infectious aerosols, the air itself is the enemy. In hospitals, that means expensive isolation wards and N95 masks for all medical staff.
Finally, here’s a 2006 paper by Lidia Morawska, who features prominently in the article, on droplet transmission:
This paper reviews the state of knowledge regarding mechanisms of droplet spread and solutions available to minimize the spread and prevent infections.
Practical implications: Every day tens of millions of people worldwide suffer from viral infections of different severity at immense economic cost. There is, however, only minimal understanding of the dynamics of virus-laden aerosols, and so the ability to control and prevent virus spread is severely reduced, as was clearly demonstrated during the recent severe acute respiratory syndrome epidemic. This paper proposes the direction to significantly advance fundamental and applied knowledge of the pathways of viral infection spread in indoor atmospheric systems, through a comprehensive multidisciplinary approach and application of state-of-the-art scientific methods. Knowledge gained will have the potential to bring unprecedented economical gains worldwide by minimizing/reducing the spread of disease.
This potential proved harder to realize than expected.
On Trusting the Experts
This story is one of the lessons from the Covid years which I come back to most often. The screw-up informs how I think about questions of expertise, like to what extent I can trust experts and whether experiences from the Covid pandemic should reduce that trust.[2]
And what does it even mean to “trust the experts”, when there are multiple factions which claim expertise on a topic?
From the article, about a Zoom meeting on April 3, 2020:
[The] new coronavirus looked as if it could hang in the air, infecting anyone who breathed in enough of it… But the WHO didn’t seem to have caught on. Just days before, the organization had tweeted “FACT: #COVID19 is NOT airborne.” That’s why … [36 aerosol scientists were] trying to warn the WHO it was making a big mistake.
Over Zoom, they laid out the case. They ticked through a growing list of superspreading events in restaurants, call centers, cruise ships, and a choir rehearsal, instances where people got sick even when they were across the room from a contagious person. The incidents contradicted the WHO’s main safety guidelines of keeping 3 to 6 feet of distance between people and frequent handwashing. If SARS-CoV-2 traveled only in large droplets that immediately fell to the ground, as the WHO was saying, then wouldn’t the distancing and the handwashing have prevented such outbreaks? Infectious air was the more likely culprit, they argued. But the WHO’s experts appeared to be unmoved. If they were going to call Covid-19 airborne, they wanted more direct evidence—proof, which could take months to gather, that the virus was abundant in the air. Meanwhile, thousands of people were falling ill every day.
On the video call, tensions rose. At one point, Lidia Morawska, a revered atmospheric physicist who had arranged the meeting, tried to explain how far infectious particles of different sizes could potentially travel. One of the WHO experts abruptly cut her off, telling her she was wrong, Marr recalls. His rudeness shocked her...
Morawska had spent more than two decades advising a different branch of the WHO on the impacts of air pollution. When it came to flecks of soot and ash belched out by smokestacks and tailpipes, the organization readily accepted the physics she was describing—that particles of many sizes can hang aloft, travel far, and be inhaled. Now, though, the WHO’s advisers seemed to be saying those same laws didn’t apply to virus-laced respiratory particles. To them, the word airborne only applied to particles smaller than 5 microns. Trapped in their group-specific jargon, the two camps on Zoom literally couldn’t understand one another.
So the article obviously takes the side of the aerosol scientists, including calling one “revered”, and calling a WHO expert “rude”. And given that the WHO eventually relented on this issue[3], that makes sense. But an article which takes sides more than a year after the fact doesn’t help us much in deciding which experts to trust in the moment.
That being said, when a big organization like the WHO makes factual claims with far-reaching economic consequences, and is then very slow to change its mind, and to my knowledge neither apologizes for the mistakes nor fires or even just reprimands anyone responsible, that certainly makes me trust it a lot less.
Conversely, I studied physics, so I can just follow my own tribal instincts and decide to trust the physicists over the doctors.
Maybe it’s easy to decide which experts to trust, after all!
[1] I wonder how a what-if scenario would’ve worked out where everything about Covid stayed the same, except that this cached belief had been corrected before 2020.
[2] Unfortunately I don’t have any great answers here. Experiences like these mostly push me towards more skepticism and epistemic learned helplessness.
[3] This article in Nature has a timeline of the slooowly evolving WHO statements:
From 9 July 2020:
… short-range aerosol transmission, particularly in specific indoor locations, such as crowded and inadequately ventilated spaces over a prolonged period of time with infected persons cannot be ruled out.
From 20 October 2020:
“Current evidence suggests that the main way the virus spreads is by respiratory droplets among people who are in close contact with each other. Aerosol transmission can occur in specific settings, particularly in indoor, crowded and inadequately ventilated spaces, where infected person(s) spend long periods of time with others, such as restaurants, choir practices, fitness classes, nightclubs, offices and/or places of worship.”
From 23 December 2021:
“Current evidence suggests that the virus spreads mainly between people who are in close contact with each other, for example at a conversational distance … The virus can also spread in poorly ventilated and/or crowded indoor settings, where people tend to spend longer periods of time. This is because aerosols can remain suspended in the air or travel farther than conversational distance (this is often called long-range aerosol or long-range airborne transmission).”
This is fantastic! Are you still collecting feature requests for Karma 3.0? I propose adjusting the default font of each comment based on some combination of karma, upvote ratio, and whether an ML algorithm considers it insightful.
The possibility space of this new feature is endless! To give just one example, if a comment is figuratively incomprehensible, Karma 3.0 could make it literally so, by changing its default font to Wingdings.
Fuck you every doctor who told me my digestive problems were in my head or my fault for being a bad patient and you couldn’t help me until I solved the problem that drove me to you. You were factually incorrect and you should feel terrible.
I sympathize so much with this and other sections of this post.
Here’s a somewhat related story of my own.
Part 1
I developed sudden strong stomach cramps in 2017, and while I did get a relatively quick appointment for an endoscopy, it was still a few weeks of suffering. In the meantime I was told that my problem was likely work-related stress or something. And it ultimately turned out to have been not work-related stress but helicobacter pylori, a stomach bacterium for which there is a well-known treatment (taking two different antibiotics and a proton pump inhibitor) which worked quickly and completely.
Side note: Supposedly a significant chunk of the developing world has this bacterium. (Wikipedia: “In 2015, it was estimated that over 50% of the world’s population had H. pylori in their upper gastrointestinal tracts[6] with this infection (or colonization) being more common in developing countries.”) But if that’s true, the vast majority of cases must be asymptomatic or mild; the world presumably doesn’t look like debilitating stomach issues are anywhere near that common.
Part 2
Around two years later, I again experienced stomach cramps. I figured that I was well-prepared this time, knew exactly what the problem was, and how to get rid of it. Unfortunately, this time I tested negative for helicobacter and other obvious problems, so I had no idea what to do. (Helpful diagnoses included stuff like irritable bowel syndrome, which essentially means “we don’t know what’s wrong with you, but we still needed a label to bill health insurance”.) My stomach issues lasted for months and got worse until a strategy eventually worked (probably probiotics, maybe assisted by removing lactose from my diet; I never entirely found out).
Also unfortunately, the beginning of my new stomach issues coincided with visiting some new mental health professionals for unrelated reasons. Several of them were all but convinced that my problems were psychosomatic. Again, infuriating and unhelpful. But not surprising—if all you have is a hammer (i.e. a therapist can’t diagnose stomach problems), then everything looks like a nail (i.e. psychosomatic).
And none of those doctors will have or could have learned anything from this episode. After all, it’s not like there’s any feedback channel that would have informed them that their hypothesis was wrong.
When someone walking by you casually suggests you can do something you might find more pleasant, that is ‘imposing their beliefs on others.’ I think this is a lot of why such folks don’t see a problem actually imposing their beliefs on others and forcing them to engage in physical actions. They do not see a difference between ‘hey man you’d be better off if you did X’ and ‘do X or else.’
I am curious if there is a way to make such people notice this difference.
This is an interesting point. As a teenager I was invited to parties a few times, and there were very strong social expectations and peer pressure to drink alcohol. Having to consistently resist that stuff was utterly exhausting, and I very quickly lost all interest in the whole concept of parties. A similar dynamic occurs during group meals when some people have dietary restrictions, and these restrictions then become the topic of discussion.
On the one hand, I would absolutely call people in those situations “imposing their beliefs on others”. I utterly detested these experiences, and I responded by becoming more contrarian and disagreeable.
On the other hand, I wonder what would have happened if the peer-pressure positions (“must drink alcohol at this party”, “must eat meat at this restaurant”, “must not eat meat at this family gathering”, whatever) had been not only socially mandated, but also legally mandated. I guess I would’ve had to resist at an earlier point, when being invited to a party, rather than when I was already there?
Anyway, my point here is that what seems like an innocuous comment like ‘hey man you’d be better off if you did X’ for one person, can feel like an attack to another. As if it’s part of a relentless barrage of attempts to make you conform to whatever is the current social consensus. (To give another example, I imagine this may also be a part of how women experience catcalling.)
As in previous years, thanks a lot to the Lightcone team for taking the time to organize this yearly review!
Meta: Here is a link to the crosspost on the EA Forum.
Epistemic Virtue
Taking a stab at the crux of this post:
The two sides have different ideas of what it means to be epistemically virtuous.
Yudkowsky wants people to be good Bayesians, which e.g. means not over-updating on a single piece of evidence; or calibrating to the point that whatever news of new AI capabilities appears is already part of your model, so you don’t have to update again (see the sketch after this list). It’s not so important to make publicly legible forecasts; the important part is making decisions based on an accurate model of the world. See the LW Sequences, his career, etc.
The OP is part of the Metaculus community and expects people to be good… Metaculeans? That is, they must fulfill the requirements for “forecaster prestige” mentioned in the OP. Their forecasts must be pre-registered, unambiguous, numeric, and numerous.
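To make the “over-updating” point concrete, here is a minimal sketch of the odds form of Bayes’ rule (the function and numbers are my own illustration, not from either post):

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior P(H|E) via the odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# News that your model already anticipated is nearly equally likely under
# both hypotheses, so the likelihood ratio is ~1 and the posterior barely moves:
print(bayes_update(prior=0.6, p_e_given_h=0.9, p_e_given_not_h=0.85))  # ~0.614

# The same news, to someone whose model did not anticipate it, forces a large update:
print(bayes_update(prior=0.6, p_e_given_h=0.9, p_e_given_not_h=0.3))   # ~0.818
```

Under these made-up numbers, the forecaster who already priced in the news moves from 0.6 to about 0.61, while the one who is surprised by it jumps to about 0.82.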
So it both makes perfect sense for Yudkowsky to criticize Metaculus forecasts for being insufficiently Bayesian (it made little sense that a forecast would be this susceptible to a single piece of news; compare with the LW discussion here), and for OP to criticize Yudkowsky for being insufficiently Metaculean (he doesn’t have a huge public catalog of Metaculean predictions).
So far, so good. However, this post fails to make the case that being Metaculean is more epistemically virtuous than being Bayesian. All it does is illustrate that these are different ways to measure this virtue. And if someone else doesn’t accept your standards, it makes little sense to criticize them for not adhering to yours. It’s not like you’re adhering to theirs!
Metaculus
I do think the Metaculean ideal deserves some criticism of its own.
Metaculus rewards good public forecasts with play points rather than real-world outcomes, and (last I heard, at least) has some issues like demanding that users update their predictions all the time or lose points; or granting people extra points for making predictions, irrespective of the outcome.
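To illustrate why that last issue matters, here’s a toy sketch; this is my own construction, not Metaculus’s actual scoring formula. A log scoring rule is “proper” (honest reporting maximizes expected score per question), but adding a flat bonus per prediction means total points mostly track volume rather than skill:

```python
import math

def log_score(p: float, outcome: bool) -> float:
    """Proper scoring rule: the log of the probability assigned to what happened.
    Reporting your true belief maximizes the expected score."""
    return math.log(p if outcome else 1 - p)

def platform_points(p: float, outcome: bool, participation_bonus: float = 2.0) -> float:
    """Toy platform variant: log score plus a flat bonus for every prediction made.
    Each question still rewards honesty, but totals now grow with sheer volume."""
    return log_score(p, outcome) + participation_bonus

# An uninformed user predicting 50% on 100 coin-flip-like questions:
uninformed = sum(platform_points(0.5, outcome=True) for _ in range(100))

# A careful user making 10 well-calibrated 90% predictions, 9 of which resolve Yes:
careful = 9 * platform_points(0.9, outcome=True) + platform_points(0.9, outcome=False)

print(round(uninformed, 1), round(careful, 1))  # ~130.7 vs ~16.7
```

With these made-up numbers, sheer volume beats calibration by almost an order of magnitude, which is exactly the incentive problem.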
And that’s without mentioning that what we actually care about (e.g. when will we get AGI) is sometimes highly dependent on fiddly resolution criteria, to the point that you can’t even interpret some Metaculus forecasts, or make meaningful predictions about them, without reading walls of text.
(And from personal experience with a related platform: I tried the play money prediction market Manifold Markets for a bit. I earned some extra play money via some easy predictions, but then lost interest once it appeared that I couldn’t cash out of my biggest bet due to lack of liquidity. So now all my play money is frozen for a year and I don’t use the platform anymore.)
All in all, making predictions on play-point sites like Metaculus sounds like a ton of work for little reward. I guess OP is an attempt to make people use it via social shaming, but I doubt the efficacy of this strategy. If it’s so important that Yudkowsky make Metaculean predictions, have you considered offering bags of money for good forecasts instead?
Final Thoughts
Finally, I find it a bit weird that OP complains about criticism of Metaculus forecasts. The whole point of public forecasting, or so it seemed to me, is for there to be a public record which people can use as a resource, to learn from or to critique. And why would it be necessary for those critics to be Metaculus users themselves? Metaculus is a tiny community; most critics of their forecasts will not be community members; and to disregard those critics would be a loss for Metaculus, not for the critics themselves.
It’s bad that comments which are good along three different axes, and bad along none as far as I can see, are ranked way below comments that are much worse along those three axes and also have other flaws
I have an alternative and almost orthogonal interpretation for why the karma scores are the way they are.
Both in your orthonormal-Matt example, and now in this meta-example, the shorter original comments require less context to understand and got more upvotes, while the long meandering detail-oriented high-context responses were hardly even read by anyone.
This makes perfect sense to me—there’s a maximum comment length after which I get a strong urge to just ignore / skim a comment (which I initially did with your response here; and I never took the time to read Matt’s comments, though I also didn’t vote on orthonormal’s comment one way or another, nor vote in the jessicata post much at all), and I would be astonished if that only happened to me.
Also think about how people see these comments in the first place. Probably a significant chunk comes from people browsing the comment feed on the LW front page, and it makes perfect sense to scroll past a long sub-sub-sub-comment that might not even be relevant, and that you can’t understand without context, anyway.
So from my perspective, high-effort, high-context, lengthy sub-comments intrinsically incur a large attention / visibility (and therefore karma) penalty. Things like conciseness are also virtues, and if you don’t consider them in your model of “good along three different axes, and bad along none as far as I can see”, then that model is incomplete.
(Also consider things like: How much time do you think the average reader spends on LW; what would be a good amount of time, relative to their other options; would you prefer a culture where hundreds of people take the opportunity cost to read sub-sub-sub-comments over one where they don’t; also people vary enormously in their reading speed; etc.)
Somewhat related: My post in this thread on some of the effects of the existing LW karma system: If we grant the above, one remaining problem is that the original orthonormal comment was highly upvoted but looked worse over time:
What if a comment looks correct and receives lots of upvotes, but over time new info indicates that it’s substantially incorrect? Past readers might no longer endorse their upvote, but you can’t exactly ask them to rescind their upvotes, when they might have long since moved on from the discussion.
Suggestions for LW features that could shape its culture by (dis)incentivizing certain behavior (without thinking about how hard they would be to implement):
On How LW Appears to Outside Readers
There’s a certain kind of controversial post that inevitably generates meta-discussion of whether people should be allowed to post it here (most recently in this book review). Crucially, the arguments I see there are not “I don’t like this” but usually “I’m afraid of what will happen when people who don’t like this see it, and associate LW with it”. (I found this really tedious, and would prefer a culture where we stick our own heads out and stick to “I don’t like this” rather than appealing to third parties.) Also, I wish there were a way to preempt this objection, so as to not fight the same battle over and over.
Other posts are in dispute (as in, have an unusually high fraction of downvotes), like jessicata’s post, but a casual reader might only see the post (and maybe its positive karma score), with all the controversy and nuance happening in the utterly impenetrable comments section.
So what might one do about that?
Some Reddit-style sites compute a controversy score, which is high only when a post’s votes are both numerous and evenly split (a minimal sketch follows after this list). Then one could sort by controversy, or by default filter controversial posts from the frontpage, or visibly flag controversial posts in a way people not familiar with LW would understand, or tag them as “controversial”, or something.
If controversial posts are rare in number, this problem can also be tackled by mod intervention. For instance, mods could have the power to put a disclaimer (or disclaimer template) above a post, or disclaimers could be triggered automatically by specific metrics. Some (bad) examples:
“This post has a notable fraction of downvotes, and a high number of nested comment threads. This indicates that it’s in dispute. Check the comments for details.”
Or: a manual or automatic disclaimer on high-traffic non-frontpaged personal blog posts (like the book review) to indicate to readers who come from elsewhere: This post is not frontpaged. LW has less strict moderation standards for personal blog posts. The karma score (upvotes) of a post reflects positive sentiment towards the contribution by the poster (e.g. taking the time to write a review); it is not automatically an endorsement of <the reviewed item>. And so on.
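Here’s the minimal controversy-score sketch promised above. The formula is, to the best of my knowledge, the one from Reddit’s open-sourced “controversial” sort; the disclaimer trigger and its threshold are hypothetical:

```python
def controversy_score(upvotes: int, downvotes: int) -> float:
    """High only when the vote count is both large and evenly split;
    zero for uncontested posts."""
    if upvotes <= 0 or downvotes <= 0:
        return 0.0
    magnitude = upvotes + downvotes
    balance = min(upvotes, downvotes) / max(upvotes, downvotes)
    return magnitude ** balance

def needs_disclaimer(upvotes: int, downvotes: int, threshold: float = 20.0) -> bool:
    """Hypothetical trigger for the automatic-disclaimer idea above:
    flag a post for a mod disclaimer once it is controversial enough."""
    return controversy_score(upvotes, downvotes) >= threshold

print(controversy_score(200, 5))    # popular but uncontested: ~1.1
print(controversy_score(120, 100))  # heavily disputed: ~90
print(needs_disclaimer(120, 100))   # True
```

Sorting or flagging by such a score would distinguish “highly upvoted” from “highly disputed” at a glance, which the raw karma number currently hides.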
Handling High-Stakes Controversies
Duncan’s post notes various ways in which recent controversies were not handled optimally. Some (probably bad) suggestions for site features to help handle such situations better:
The disclaimer thing, so new readers know that a post is controversial before they read it and take its claims at face value. This might also be warranted for some high-karma controversial comments.
Moderation
Mods could try to “turn down the heat” by setting stricter commenting guidelines, or temporarily prevent users below some karma threshold from posting, or something.
Flagging important clarifying comment threads so they appear higher in the comment order, or with extra highlighting.
Mark comments by moderators-acting-as-moderators with a flair or other highlighting.
On-site low-friction ways to post anonymously, with the option of leaking some non-identifying information like “I’ve been a LW user for >3 years with >1k karma”, or to ping specific LW users with “user X can vouch for my identity”, which user X could then confirm with a single click. Though I would not want this anonymity feature available on non-controversial posts.
A feature for a user to “request moderation for this post”, or “request stricter commenting guidelines” or to indicate “this post seems controversial to me” or something.
Of course such features don’t solve a problem by themselves, but they can help, and I’m more optimistic about attempts to improve a culture if the site infrastructure supports those attempts and incentivizes that improved culture.
Rewarding Exceptional Content
As noted in the LW book review bounty program, exceptional content on LW is potentially very valuable, so it makes sense to incentivize it. The karma system helps, but it’s not enough—as can be seen when extra incentives come into play, e.g. the extra reviews generated by the bounty program.
Some features beyond the karma system that could help here:
Reddit nowadays has a separate “awards” system which users can use to reward exceptional content. I don’t like the specific implementation at all—it’s full of one-upmanship of progressively more expensive awards, and posts with lots of awards just look cluttered—but one could imagine an implementation that would work here.
For instance, there are a number of active bounties on LW, but one could imagine a smaller-scale version of setting and rewarding bounties that would work better if built into the site, e.g. for Question posts (to reward the best answer), or just to gift someone money for writing a particularly important post or comment.
That said, this is the kind of thing that, if implemented suboptimally, could easily incentivize detrimental behavior, instead.
Or (as mentioned in the Controversies section), what about a system for users or mods to flag particularly high-quality or high-importance comments so they get some extra highlighting or something?
Related:
Because comments are much less discoverable than posts, lots of high-effort high-value comments, even if they’re very-high-karma, get lost in the masses of LW comments and are hard to find or refer to later on.
What could be done about that?
For instance, if users or mods see a comment thread of exceptional and enduring value, they could flag it as such (I already occasionally see follow-up comments of the form “This is good enough for a top-level post!”), and then others (volunteers or paid contributors) could turn the best ones of those into top-level posts, with karma going to the original posters.
(To end on a meta comment: I spent >2.5h on three high-effort comments in this thread, and would be disappointed if they got lost in the shuffle. Conversely, I’m more likely to make the effort in the future if I have a sense that it paid off in some way.)
Donated 40€. I was going to donate to MIRI or CFAR, and chose CFAR due to this Facebook discussion.