Improving on the Karma System

Raelifin, 14 Nov 2021 18:01 UTC

108 points

Site Meta Community LW Moderation Mechanism Design

TL;DR: I think it would improve LW to switch from the current karma system to one that distinguishes [social reward] from [quality of writing/thought] by adding a 5-star scale to posts and comments. I also explore other options.

Karma

Back in 2008, LessWrong was created by forking the codebase for Reddit. This was pretty successful, and served the community for a few years, but ultimately it became clear that Reddit’s design wasn’t a great fit for our community. In 2017 Habryka and others at (what is now) Lightcone rebuilt the site from scratch, improving not only the code and aesthetic, but also the basic design. The sequences were prioritized, short-form-posts and question-posts were invented, and the front-page was made into a curated collection of (ideally) high-quality writing.

But the primary mechanism that Reddit uses to promote content was kept: Karma.

Karma is a fairly easy system to understand: A post (or a comment, which I’ll be lumping in with top-level posts for brevity) has a score that starts around zero. Users can upvote a post to increase its score, or downvote it to decrease the score. Posts with a high score are promoted to the attention of other users, both by algorithms that decide the order of posts (often taking into account other factors, such as the time it was posted), and by auto-minimizing low-scoring posts or simply by the social-proof of big-number-means-good drawing the eye.

The simplicity/familiarity of karma is one of its major selling points. As Raemon points out, it takes work to evaluate a new system, and the weirder that system is, the more suspicious you should be that it’s a bad scheme. Voting is familiar enough that new users (especially in 2021!) will have no difficulty in understanding how to engage with the content on the site.

Giving positive/negative karma is also very low-effort, which is good in many circumstances! Low-effort engagement is a way to draw passive readers into participating in the community and feeling some investment in what they’re reading. I may not be able or willing to write a five-page rebuttal to someone’s sloppy argument, but I can sure as heck downvote it. Similarly, the opportunity to simply hit a button to approve of quality content can cut down on the noise of people patting each other on the back.

It feels good to get karma, despite the fake-internet-points thing. This encourages people to post and Make Number Get Big.

And perhaps most of all, karma serves as a quantitative way of tracking what’s good and what’s not, both to create common knowledge across the site/community and to help the site promote good content and suppress bad content.

These are the primary reasons that Habryka had when carrying over the karma system onto the new LessWrong.

We can do better.

The Problems of Karma

Ben Pace recently said to me that karma is usually seen as pretty good, but not perfect. And “in a world where everyone agrees karma isn’t perfect, they’re less upset when it’s wrong.”

Karma is a popularity contest. It’s essentially democratic. LessWrongers understand that democracy isn’t perfect, but (in principle) the glaring imperfection of ranking things purely on popularity means that when a user sees a karma score they’ll understand it as only a very rough measure of quality.

But I don’t buy that this is the best we can do.

And I certainly don’t buy that we shouldn’t try to improve things.

The problems with karma, as far as I can see, are:

The popularity of a post is not the same as its quality.
- A movie star might be naturally more popular than a curmudgeonly economist, but that doesn’t mean they’ll be better at running a country.
Some people are better at voting thanks to knowledge/wisdom/etc, which an egalitarian system suppresses.
- An investment fund democratically managed by eight random people and Peter Thiel will vastly under-perform compared to one managed solely by Thiel. Sometimes a crowd is less wise than an expert.
Karma is one-dimensional.
- If a post is informative but poorly written, should I upvote it? If it’s adding to the discourse, but also spreading misinformation, should I downvote it?
Karma is opaque.
- Something being highly up-voted doesn’t actually tell me much about it. It could be informative, funny, or simply be preaching to the choir.
- The guidelines about what to vote for (i.e. “vote for what you want promoted”) are very generic.
Karma score treats controversy as equivalent to ambivalence.
- This could be seen as a specific case of opacity being bad, but I think it’s worth calling out how a karma score doesn’t cleanly highlight controversy.
It can feel bad to be downvoted, discouraging users from posting.
- This may seem like a non-issue to frequent users of the site, especially those who frequently post. But I hope it’s easy to see the selection pressure, there. LessWrong has a reputation as a place where people are quite harsh, and I think this repels some (good) people, and the karma system contributes to this reputation.
Voting’s value is external/communal, with very little value captured directly by the voter.
- I think this problem isn’t very big in our case (for a variety of reasons), but it still seems worth listing.
Karma is prone to bandwagoning.
- Humans are biased to want to support the most popular/powerful/influential people in the tribe, and the karma signal reinforces this. I suspect that highly-upvoted posts become more highly upvoted by the nature of having lots of upvotes, even independent of all the other confounders (e.g. being prioritized by the algorithms).
- I’m less confident here when I think about how many contrarians we have, but on net I think I still believe this.
The same number means very different things depending on readership.
- Joe Biden received more votes during 2020 than any American politician ever. Does this mean he’s the most popular politician in history? No! He got more votes because there were more voters and high turnout. The raw number is far less informative than one might assume.
- Sudden influxes of readers can distort things even on a day-to-day basis.
Karma rewards fast-thoughts, and punishes challenging ideas.
- In a long video (relevant section), Vihart makes the point that if there are two types of readers on a site—one that skims and makes snap votes on everything, and one that carefully considers each point while only handing out votes after careful consideration—then because the skimmer can vote orders of magnitude more, they have (by default) orders of magnitude more influence on the reaction around a post. Things which immediately appeal to the reader and offer no challenge get a quick upvote, while subtle points that require deep thought can get skipped for being too hard.
- This phenomenon is why, I think, most comment sections on the web are trash. But it goes beyond just having fast thinkers be the lion’s share of respondents, because it also pressures creators to appeal to existing biases and surface-level thoughts. I’ve certainly skimmed more than a few egregiously long LW comments, and when I write in public some part of my mind keeps saying “No. You need to make it shorter and snappier. People are just going to skim this if it’s a wall of text, even if that wall of text is more nuanced and correct.”

If there are other problems that I’ve missed, please let me know in a comment.

Other Options

Many people have written about other systems for doing the things that karma is meant to do with fewer (or better) problems.

Eigenkarma

Habryka wrote in 2017:

I am currently experimenting with a karma system based on the concept of eigendemocracy by Scott Aaronson, which you can read about here, but which basically boils down to applying Google’s PageRank algorithm to karma allocation. How trusted you are as a user (your karma) is based on how much trusted users upvote you, and the circularity of this definition is solved using linear algebra.

And then in a 2019 update:

There are some problems with this. The first one is whether to assign any voting power to new users. If you don’t you remove a large part of the value of having a low-effort way of engaging with your site.
It also forces you to separate the points that you get on your content, from your total karma score, from your “karma-trust score” which introduces some complexity into the system. It also makes it so that increases in the points of your content, no longer neatly correspond to voting events, because the underlying reputation graph is constantly shifting and changing, making the social reward signal a lot weaker.
In exchange for this, you likely get a system that is better at filtering content, and probably has better judgement about what should be made common-knowledge or not.

Predicting/Modeling the Reader

Also from Habryka (2019):

[...]The basic idea is to just have a system that tries to do its best to predict what rating you are likely to give to a post, based on your voting record, the post, and other people’s votes.
In some sense this is what Youtube and Facebook are doing in their systems[...]

And problems:

The biggest sacrifice I see in creating this system, is the loss in the ability to create common knowledge, since now all votes are ultimately private, and the ability for karma to establish social norms, or just common knowledge about foundational facts that the community is built around, is greatly diminished.
I also think it diminishes the degree to which votes can serve as a social reward signal, since there is no obvious thing to inform the user of when their content got votes on. No number that went up or down, just a few thousand weights in some distant predictive matrix, or neural net.

These systems can also be frustrating to users because it’s unclear why something is being recommended, regardless of the common-knowledge issues.

FB/Discord Style Reacts

Both Habryka and Raemon have expressed interest in augmenting the karma system with a secondary tier of low-effort responses like angry, sad, heart, etc. These could be drawn from a small pool, like on Facebook, or a very broad pool, like Discord.

A major motivator here would be the idea that disentangling approval from all sorts of other nuanced responses would allow people to flag things like “I disagree with this, but am upvoting it because I think it’s bringing an important perspective” without adding the noise or effort of commenting.

A “This changed my mind” button or the like would also allow more common-knowledge about what’s important in LW culture.

Major issues with this proposal include:

Added complexity, both for the UI and the learning curve for newbies
Rounding off reactions to an oversimplified system of labels
It doesn’t actually solve many of the problems with karma, since it’d need to be in addition to a karma system

My Proposal

All the above stuff seems pretty good in some ways, but ultimately I think those solutions aren’t where we should go. Instead, let me introduce how I think LessWrong should change.

(To start, this could be implemented on a select-few posts, and gradually rolled out. I’ll talk more about this and other details in the last section, but I wanted to frame this as a proposed experiment. There’s no need to change everything all at once!)

Here’s the picture:

Specific design is flexible, of course. If you have a better design sense than me, speak up!

Each post (including comments) would have a “quality rating” instead of a karma score. This rating would be a 5-star scale, going between 0.5 and 5 stars, in half-star increments. All users can rate posts just like they’d rate a product on Amazon or whatever. On the front page, and other places where posts are listed but not intended to be voted on/rated, only the star rating is displayed.

In addition to the stars, on the post itself (where rating happens) would be a plus (+) button and a “gratitude number”. Pushing that plus button makes the number go up and is a way of simply saying “thank you for writing this”. On hover/longpress, there’s a tooltip that says “Gratitude from: …” and then a list of people who pressed the (+) button.

A hover/longpress on the star rating would give a barchart showing the distribution of ratings, a link with the text “(Quality Guidelines)”, and either text that says “This comment has not been reviewed.” or a link saying “This comment was reviewed by [Moderator]” that goes to a special comment by a moderator discussing the quality of the post.

To assist with clarity/transparency/objectivity, there would be a set of guidelines (linked to on the hover) as to how people should assess quality. I’m not going to presume to be able to name the exact guidelines in this post, but my suggestion is something like picking 5 main “targets” and assessing posts on each target. For example: Clarity (easy to read, not verbose, etc), Interestingness (pointing in novel and useful directions), Validity/Correctness (possessing truth, good logic, and being free from bias), Informativeness (contributing meaningfully, citing sources, etc), and Friendliness (being anti-inflammatory, charitable, fun, kind, etc). Then, each post can score up to one star for each, and the total rating is the sum of the score for each target.
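As a rough illustration of the "one star per target" idea, here is a minimal Python sketch. The target names come from the example list above; the function name, the rounding to the nearest half-star, and the clamp to the 0.5 to 5 range are assumptions made for the sketch, not part of the proposal.

```python
# Target names taken from the example list in the paragraph above.
TARGETS = ("clarity", "interestingness", "validity", "informativeness", "friendliness")


def star_rating(scores: dict) -> float:
    """Sum per-target scores (each between 0 and 1) into a single star rating."""
    total = 0.0
    for target in TARGETS:
        score = scores.get(target, 0.0)  # a missing target counts as 0 (assumption)
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{target} score must be between 0 and 1")
        total += score
    # Assumption: snap to half-star steps and clamp to the 0.5..5 scale,
    # since the proposal does not allow a 0-star rating.
    half_stars = round(total * 2) / 2
    return min(5.0, max(0.5, half_stars))


example = star_rating({"clarity": 1.0, "validity": 0.5})
```

A rater using the guide would still enter a single star value; the breakdown is only a way of arriving at it.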

My inner Ben Pace is worried that trying to make an explicit guide for measuring quality will be inflammatory and divisive. I suspect that it’s better to have a guide than no guide, and that this will decrease flame wars by creating common knowledge about what’s desired. I’m curious what others think (including the real Ben).

Now, the last major aspect of the proposal is that users don’t have equal weight in contributing to the quality rating. Instead, each user has a “quality judge reputation” (QJR) number (better names are welcome), which gives their ratings more weight, akin to how on the current LW a user’s votes are weighted by their karma. Notably, however, QJR does not go up from posting, but only increases as the user gives good ratings.

A rating is good if it’s close to what a moderator would give, if that moderator was tasked with carefully evaluating the quality of a post. In this way, my proposal is to move towards a system of augmented experts, where the users of the site are trying to predict what a moderator would judge. As long as the mods are fair and stick to the guidelines, this anchors the system towards the desired criteria for quality, and de-anchors popularity.

I’ll cover the details in the final section, but the basic idea for allocating QJR is that moderators have a button that says “give me a post to evaluate” which samples randomly across the site, with a weighting towards a-couple-days-old and highly-controversial. Once given that post, the moderator then evaluates its quality for themselves, writing up an explanation of their rating for the public. Then, the site looks at all the ratings given by users, and moves QJR from those whose ratings didn’t match the mod, to those whose ratings did. The nitty-gritty dynamics of this movement are specified in detail at the end of the post.

The Benefits and Costs

If the details of QJR and how it moves still don’t quite make sense, let me compare it with something that I think is very analogous: a prediction market. In a way, I am suggesting a system similar to that proposed by Vitalik Buterin and Robin Hanson. When a user rates a post, they are essentially making a bet as to how a reasonable person (the moderator) would assess the quality of that post. The star rating that a post currently has is something like a market price for futures on the assessment. If I think a post is being undervalued, I can make expected profit (in QJR, anyway) by rating it higher.

As a result, we can use the collective predictions/bets of users of the site to promote content, even though moderators will only be able to carefully assess a small minority of posts.

But, if all that sounds too complicated and abstract: I’m basically just saying that people should judge post quality on a 5-star scale and have their rating weighted by (roughly) how much time they’ve actively spent on the site and how similar their views are to the moderators.

The primary reason not to make a change like this, I think, is that UI clutter and new systems are really expensive to a userbase. Websites, in my experience, tend to rot as companies jam in feature after feature until they’re bloated with junk. But I expect this change to be pretty simple. Everyone knows how to rate something on a 5-star scale, and hitting a big plus symbol when you like something seems just as intuitive as upvoting.

There is some concern that making a quality assessment stops being low-effort, as users would need to think about the guidelines and the judgment of the average moderator. But I think in practice most people won’t bend over backwards to do the social modeling, and will instead gain an intuition for “LW quality” that they can then use to spot posts that are over/undervalued. (Again, I suggest simply running the experiment and seeing.)

Ok, so how well does this handle the primary goals of karma?

Serves as a good content filter? Yes!
Social reward? Yes!
- I recommend making the gratitude number the most prominent number attached to a person’s profile, followed by their post count (with average post quality in parens), comment count (again with parenthetical quality), and then QJR and number of wiki edits. I think the gratitude increase should be what a user sees on the top-bar of the site when they log back in, etc.
- Gratitude only goes up, and I think even if someone writes something that is judged to be low-quality, they’ll still get some warm-fuzzies from seeing a list of people who liked their post, and seeing Number Go Up.
Easy to understand? Yep!
- 5-star ratings and “Like” buttons are everywhere nowadays.
Low effort? I think so.
- It takes two clicks to both rate the quality and give thanks, rather than just one click for voting up/down, but mostly I expect to be able to give ratings and thanks without much effort.
Common Knowledge? Yes.
- By disentangling quality from appreciation we move towards a better signal of what ideal LessWrong content looks like.

Now let’s return to the ten problems identified earlier and see how this system fares.

Popularity =/= Quality? Yes!
- The gratitude button lets me say “Yay! You rock! I like you!” in a quantitative way, without giving the false impression that my endorsement means that the post was well-created.
Differential voting power? Yes!
- Thanks to QJR weighting ratings, veteran users who consistently make good bets will have a great deal of sway on which posts are prioritized.
Multi-dimensionality? Better than Karma, but still not great.
- Pulling out gratitude into its own number is a good start.
Transparency/opacity? Better than Karma, but still not great.
- Having a set of guidelines and an explanation of moderator decisions seems good for having a sense of why things have the quality rating that they do.
- Having gratitude be public also probably matters here, but I don’t have a good story for it.
Controversy =/= Ambivalence? Just as bad as Karma.
Risk of bad feels? Different than Karma, and potentially worse.
- I expect users, including new users, to enjoy seeing their gratitude number go up. But I expect many people to hate the idea of having a low-rated post.
- Ultimately I fear that having high community standards is just naturally linked to people fearing the judgment of the crowd. Gratitude is an attempt to soften that.
Externalities are internalized? Yes.
- “Voters” gain a resource by making good bets, so the same capturing-externalities dynamics as Futarchy apply.
Risk of bandwagoning? Likely about as bad as Karma.
Post score is timeless? Yes! Or at least much better than Karma.
- Because things are rated on a consistent scale, 5-star posts can theoretically be compared across time more consistently than high-karma posts.
- Gratitude will be apples-and-oranges, however.
Rewarding deep thoughts? Yes. Or at least much better than Karma.
- Because users are punished for giving bad ratings to posts, there is a natural pressure to keep people from skimming-and-rating quickly. Those people will naturally lose their QJR.
- The system still encourages people to write posts that get them a lot of gratitude, but low-quality + high-gratitude posts will be naturally de-prioritized by the site in ways I think are healthy.

I somewhat worry that the system will put too much pressure on posts to be high-quality. Throwaway comments on obscure posts become much more expensive when they lower an average score. Not sure what to do about that; suggestions are welcome.

In general, however, it seems to me that a system like this could be a massive improvement to how things are done now. In particular, I think it could make our community much better at concentrating our mental force towards Scout-Mindset stances over Soldier-Mindset voting when things get hard, thanks to the disproportionate way in which individuals can stake their reputation on a post being high/low-quality, and how the incentives should significantly reduce Soldier-ish pressures on writers and voters.

I recommend you skip/skim the next section if you’re not interested in the nitty-gritty. Thanks for reading! Tell me what you think in the comments, and upvote me if you either want to signal appreciation for the brainstorming or you approve of the idea. Surely nothing will go wrong, common-knowledge wise. 😛

Monotonous Details

To start, users with over 100 karma who have opted into experimental features on their settings page gain a checkbox when they create a new post. This checkbox defaults to empty and has a label that says “Use quality rating system instead of karma”. Those posts use stars+gratitude instead of karma, including for the comments. Once a post is published, the checkbox is locked.

All users gain a box on the settings page labeled as “Make my reactions anonymous” that defaults to unchecked. When checked, hitting the plus button adds to the gratitude score, but doesn’t reveal the user who pressed it.

Old posts don’t get ported over, and my proposal is simply to leave things heterogeneous for now. Down the line we could try estimating the quality of posts based on their karma, and move karma score into gratitude to make a full migration, but I also just don’t see that as necessary.

I think it’s important that the UI continue to communicate the rating of the broader community even after a user has rated a post. That’s why in my mockup I chose full star color for community rating and outlines for the user rating.

New users start with no QJR. This prevents people from creating new accounts for the purpose of abusing the rating system. Users accumulate QJR by rating posts (keep reading for details on that). Existing users are given QJR retroactively based on the logarithm of how many posts they’ve up/downvoted.

Comments are by default ordered by quality, but can be reordered by time or by gratitude. I’m not sure exactly how the site should compare karma scores with quality ratings for ordering top-level posts “by Magic”. Possibly this would require creating an estimator for karma-given-quality or vice-versa.

When a post is made, the site automatically submits two “fake” ratings of 1.75 and 3.75 with a weight of 1 QJR each (note: this averages 2.75, which is rounded to 3 and is the average on a 5-star scale when 0-stars isn’t allowed). These fake ratings help to prevent the first rating from having undue influence, and also make things nice for the edge-cases. These fake ratings are not shown on the rating histogram, but otherwise count as ratings for computing the weighted average for the community.

The community rating for a post is the weighted arithmetic mean, i.e. the sum of all ratings multiplied by those users’ current QJR divided by the total QJR of all participating users. Community ratings are rounded to the nearest half-star for visual purposes.
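A minimal Python sketch of the weighted mean, including the two seed ratings of 1.75 and 3.75 at 1 QJR each. The function names are hypothetical, and rounding ties upward is an assumption chosen so that 2.75 displays as 3 stars, as stated above.

```python
import math

# The two automatic "fake" ratings, each weighted at 1 QJR.
SEED_RATINGS = [(1.75, 1.0), (3.75, 1.0)]


def community_rating(user_ratings):
    """Weighted arithmetic mean over (stars, qjr) pairs, seed ratings included."""
    ratings = SEED_RATINGS + list(user_ratings)
    total_weight = sum(weight for _, weight in ratings)
    return sum(stars * weight for stars, weight in ratings) / total_weight


def displayed_stars(mean):
    """Round to the nearest half star, with ties going up (assumption)."""
    return math.floor(mean * 2 + 0.5) / 2


fresh_post = community_rating([])               # only the seed ratings
after_one_vote = community_rating([(4.0, 3.0)])  # one 4-star rating from a 3-QJR user
```

With no real ratings the mean is 2.75, shown as 3 stars; a single 4-star rating from a user holding 3 QJR moves it to 3.5.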

EDIT: GuySrinivasan points out in the comments that many people are tempted to put down as extreme a rating as they can in order to sway the average more, even if their rating is more extreme than they truly believe. In light of this, I now think that the system would be best as a weighted median, rather than a weighted mean. The consequence of this would be that 1-star ratings are exactly as strong as 3-star ratings at moving the average, assuming the current community rating is above 3-stars.
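For comparison, a weighted median could look like the sketch below. It assumes the seed ratings are kept and takes the lowest rating at which the cumulative QJR reaches half of the total; the tie-breaking rule and the function name are assumptions, since the edit above does not specify them.

```python
def weighted_median(ratings):
    """Lowest rating at which cumulative weight reaches half of the total weight."""
    total = sum(weight for _, weight in ratings)
    if total <= 0:
        raise ValueError("at least one rating must carry positive weight")
    cumulative = 0.0
    for stars, weight in sorted(ratings):
        cumulative += weight
        if cumulative >= total / 2:
            return stars


# Seed ratings plus three high ratings: the community sits above 3 stars.
base = [(1.75, 1.0), (3.75, 1.0), (4.0, 2.0), (4.5, 2.0), (5.0, 2.0)]

before = weighted_median(base)
with_one_star = weighted_median(base + [(1.0, 5.0)])
with_three_stars = weighted_median(base + [(3.0, 5.0)])
```

In this example the median drops from 4.0 to 3.75 whether the new 5-QJR rating is 1 star or 3 stars, which is the "exactly as strong" property described above.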

QJR Gains/Losses

Users do not gain or lose QJR by rating posts unless a post they’ve rated is reviewed by a moderator. This prevents users who rate lots of posts, especially obscure/old posts, from accumulating QJR.

When a moderator submits a review score, the site computes a 1:1 bet with each user who rated the post. To do so, it models two normal distributions based on the ratings that the post had gotten when the user rated it. One normal distribution $C(x)$ has the mean and variance of the community rating not including the user’s vote, and the other Gaussian $D(x)$ has the mean and variance including the user’s vote. The bet is on whether the moderator review is above or below the intersection $y$ of $C$ and $D$. (Technically there are two intersections; it’s the one in between the means.)

$$C(x) = \frac{1}{\sqrt{v_{c}} \sqrt{2π}} e^{-\frac{1}{2} \left(\frac{x - μ_{c}}{\sqrt{v_{c}}}\right)^{2}}; \quad D(x) = \frac{1}{\sqrt{v_{d}} \sqrt{2π}} e^{-\frac{1}{2} \left(\frac{x - μ_{d}}{\sqrt{v_{d}}}\right)^{2}}$$

$$e^{\log\left(\frac{1}{\sqrt{v_{c}}}\right) - \frac{1}{2} \left(\frac{y - μ_{c}}{\sqrt{v_{c}}}\right)^{2}} = e^{\log\left(\frac{1}{\sqrt{v_{d}}}\right) - \frac{1}{2} \left(\frac{y - μ_{d}}{\sqrt{v_{d}}}\right)^{2}}$$

$$\log\left(\frac{1}{v_{c}}\right) - \left(\frac{y - μ_{c}}{\sqrt{v_{c}}}\right)^{2} = \log\left(\frac{1}{v_{d}}\right) - \left(\frac{y - μ_{d}}{\sqrt{v_{d}}}\right)^{2}$$

$$\frac{(y - μ_{d})^{2}}{v_{d}} - \frac{(y - μ_{c})^{2}}{v_{c}} = \log\left(\frac{v_{c}}{v_{d}}\right)$$

And in standard quadratic form:

$$\frac{y^{2}}{v_{d}} - \frac{2 y μ_{d}}{v_{d}} + \frac{μ_{d}^{2}}{v_{d}} - \left(\frac{y^{2}}{v_{c}} - \frac{2 y μ_{c}}{v_{c}} + \frac{μ_{c}^{2}}{v_{c}}\right) - \log\left(\frac{v_{c}}{v_{d}}\right) = 0$$

$$\left(\frac{1}{v_{d}} - \frac{1}{v_{c}}\right) y^{2} + 2 \left(\frac{μ_{c}}{v_{c}} - \frac{μ_{d}}{v_{d}}\right) y + \left(\frac{μ_{d}^{2}}{v_{d}} - \frac{μ_{c}^{2}}{v_{c}} - \log\left(\frac{v_{c}}{v_{d}}\right)\right) = 0$$

For example, let’s say that a user with 3 QJR makes a post and offers a 4-star review as the first real review of their post. The community rating for the post prior to the user’s rating was $μ_{c} = (1.75 * 1 + 3.75 * 1) / (1 + 1) = 5.5 / 2 = 2.75$ and it had a variance of $v_{c} = ((1.75 - 2.75)^{2} * 1 + (3.75 - 2.75)^{2} * 1) / (1 + 1) = 1$ . After the user’s rating, the weighted mean has shifted to $μ_{d} = (5.5 + 4 * 3) / (2 + 3) = 17.5 / 5 = 3.5$ and the variance has become $v_{d} = ((1.75 - 3.5)^{2} * 1 + (3.75 - 3.5)^{2} * 1 + (4 - 3.5)^{2} * 3) / (1 + 1 + 3) = 0.775$ . We can then calculate $y \approx 3$ . Thus the site makes a bet with the user that says “I bet you that the moderator gives this post a rating of 3-stars or less” and the user takes them up on it.
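The worked example can be checked with a short Python sketch that solves the quadratic above and keeps the root between the two means. The function names are hypothetical, and the equal-variance shortcut and the error raised when no root lies between the means are assumptions added for the sketch.

```python
import math


def weighted_mean_var(ratings):
    """QJR-weighted mean and variance over (stars, qjr) pairs."""
    total = sum(weight for _, weight in ratings)
    mean = sum(stars * weight for stars, weight in ratings) / total
    var = sum(weight * (stars - mean) ** 2 for stars, weight in ratings) / total
    return mean, var


def intersection_between_means(mu_c, v_c, mu_d, v_d):
    """Solve the quadratic for y and return the root lying between the means."""
    a = 1 / v_d - 1 / v_c
    b = 2 * (mu_c / v_c - mu_d / v_d)
    c = mu_d ** 2 / v_d - mu_c ** 2 / v_c - math.log(v_c / v_d)
    if abs(a) < 1e-12:
        # Equal variances: the quadratic becomes linear and the curves
        # cross at the midpoint of the two means.
        return (mu_c + mu_d) / 2
    root_term = math.sqrt(b * b - 4 * a * c)
    low, high = sorted((mu_c, mu_d))
    for y in ((-b - root_term) / (2 * a), (-b + root_term) / (2 * a)):
        if low <= y <= high:
            return y
    # Not covered by the post: the sketch simply refuses to place a bet here.
    raise ValueError("no intersection lies between the two means")


seed = [(1.75, 1.0), (3.75, 1.0)]
mu_c, v_c = weighted_mean_var(seed)                  # 2.75 and 1.0
mu_d, v_d = weighted_mean_var(seed + [(4.0, 3.0)])   # 3.5 and 0.775
y = intersection_between_means(mu_c, v_c, mu_d, v_d)  # roughly 3.0
```

The second root of the quadratic sits far above the 5-star ceiling in this example, so only the one between the means is relevant.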

The site then calculates how much of the user’s QJR to wager based on the Kelly criterion, where we estimate the odds of (user) success by approximating the fraction of area of the Gaussian that “is a win”. (In our example, $y < μ_{d}$ , so we find the probability of winning by taking $\int_{y}^{\infty} D (x) d x$ . If $y > μ_{d}$ , then we instead approximate $\int_{- \infty}^{y} D (x) d x$ .) The details of taking approximations here and getting a probability out are messy and dumb, so I’m skipping them. In our example the modeled probability turns out to be approximately 71.5%. Thus the Kelly criterion says the optimal wager is $2 * 71.5 % - 100 % = 43 %$ , which would be 1.29 QJR, since the user currently has 3 QJR.

The minimum a user can wager is 1 QJR; if they don’t have enough, they are gifted however much QJR is needed to make the bet. In this way, a user with 0 QJR should make as many ratings as they can, because even though they have no weight, and thus don’t impact the community rating, they only have upside in expected winnings. (With the exception that if they win some QJR, then poor bets that have already been made will expose them to risk.)
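The wager rule from the last two paragraphs can be sketched as follows. The win probability uses the exact Gaussian CDF via `math.erf` rather than whatever approximation the site would use, and the function names and the way the gift is returned are assumptions.

```python
import math


def win_probability(y, mu_d, v_d):
    """Area under D on the same side of y as the mean mu_d."""
    z = (y - mu_d) / math.sqrt(v_d)
    cdf_at_y = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return 1 - cdf_at_y if y < mu_d else cdf_at_y


def kelly_wager(p_win, qjr, minimum=1.0):
    """Return (wager, gifted) for an even-money bet with a 1 QJR minimum."""
    fraction = 2 * p_win - 1  # Kelly fraction for a 1:1 bet
    wager = max(minimum, fraction * qjr)
    gifted = max(0.0, wager - qjr)  # QJR granted so the user can cover the bet
    return wager, gifted


p = win_probability(3.0, 3.5, 0.775)   # about 0.715 in the running example
wager, gifted = kelly_wager(0.715, 3.0)  # about 1.29 QJR, nothing gifted
newcomer = kelly_wager(0.715, 0.0)       # a 0-QJR user is gifted the full minimum
```

Because the threshold always falls on the community side of the user-inclusive mean, the modeled win probability is above 50% and the Kelly fraction stays positive.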

QJR is not conserved. The site has an unlimited supply, and because many comments will obviously be higher quality than 3-stars, users will be able to earn easy QJR by making posts and rating them. This is a feature meant to encourage participation on the site, and reward long-term judges. There’s also a bias where the people who make posts and rate them highly get more QJR the higher their quality is, which is obviously good. The flip side is that a person can theoretically game the system by making low quality posts and immediately rating them low, but this has a self-sabotaging protective effect where the site will de-prioritize such posts and make others unlikely to read them, thus unlikely to rate them, thus unlikely to have those posts selected for moderator review.

If a user changes their rating before another user has rated a post, only the second rating is used. If a user changes their rating on a post after others have also rated it, the user that changed their mind makes both bets, but their net QJR gain/loss is capped at the maximum gain/loss of any particular bet. This is to prevent gaming the system by re-rating the same post many times. Only the most recent rating is counted for the purposes of calculating the (current) community rating, however.

Posts can be rated by (different) moderators multiple times. When two moderators review a post, both of their review comments are linked to in the star-rating hover/longpress tooltip. When a new moderator reviews a post that has already been reviewed by a mod, only the reviews since the last mod review are turned into bets. This means that even after a moderator has reviewed a post, there is still incentive for users to rate it.

I picked the above system of making bets so as to incentivize users to rate a post with their true expectation of what a moderator would say. I think if a user expects moderator judgments to be non-Gaussian, they can rate things strategically and gain some QJR over the desired strategy, but I personally expect moderator decisions to be approximately Gaussian and don’t think this is much of an issue.

I also don’t think that one needs to clip the distributions around 0.5 and 5 in order for things to work right or to model the way that moderators will only rate in half-star increments, but I haven’t run exhaustive proofs that things actually work as they should. If someone wants to try and prove things about the system, I’d be grateful.

Moderator Interface

Moderators are just users, most of the time. They have QJR and can rate posts, as normal.

Moderators are also encouraged to interface with a specific part of the site that lets them do quality reviews. In this interface they are automatically served a post on LessWrong, sampled randomly with a bias towards posts that have been reviewed, where the reviews differ from one another, with a mild bias towards posts made a few days ago. I think it is important that moderators don’t choose which posts to review themselves, to reduce bias.

As part of a review, a moderator has to give a star-rating to a post, and explain their judgment in a special comment. If a moderator has already given a quality rating to a post, the moderator only needs to write the comment. When giving a review, the moderator’s QJR doesn’t change.

By default I think moderator quality-comments should be minimized, with the moderator able to check a box when they make their comment that says something like “Don’t minimize this quality review; it contributes to the discussion.” Moderator comments explaining their quality review should be understood to be an objective evaluation of the post, rather than an object-level response to the post’s content.

Moderator quality reviews don’t disproportionately influence the rating of a post—the overall star rating of a post is purely a measure of what the community as a whole has said. (Moderators still have QJR, and their review still contributes, as weighted by their QJR.)

The sampling system ignores posts where the moderator has a conflict of interest. This defaults to only posts written by the mod, but can be extended to blacklisting the moderator from reviewing posts by a set of other users (e.g. romantic partners, bosses, etc).
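One way the sampling described in this section might be sketched in Python. Everything specific here is an assumption made for illustration: the weight formula, the 3-day recency peak, the use of rating variance as the measure of disagreement, and the dictionary field names are all hypothetical.

```python
import random


def sampling_weight(age_days, rating_variance):
    """Hypothetical weight: favors disagreement, mildly favors posts a few days old."""
    recency = 1.0 / (1.0 + abs(age_days - 3.0))  # assumed peak at 3 days old
    return (1.0 + rating_variance) * (1.0 + 0.5 * recency)


def pick_post(posts, moderator, blocked_authors=frozenset(), rng=random):
    """Sample one post, skipping the moderator's own and any blacklisted authors."""
    eligible = [
        post for post in posts
        if post["author"] != moderator and post["author"] not in blocked_authors
    ]
    if not eligible:
        return None
    weights = [sampling_weight(p["age_days"], p["rating_variance"]) for p in eligible]
    return rng.choices(eligible, weights=weights, k=1)[0]


posts = [
    {"id": 1, "author": "mod_a", "age_days": 2, "rating_variance": 1.2},
    {"id": 2, "author": "user_b", "age_days": 3, "rating_variance": 0.4},
    {"id": 3, "author": "user_c", "age_days": 40, "rating_variance": 0.1},
]
choice = pick_post(posts, moderator="mod_a", blocked_authors={"user_c"})
```

The conflict-of-interest filter runs before any weighting, so an excluded post can never be served regardless of how controversial it is.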

That’s All, Folks

If there are details that I’ve omitted that you are curious about, please leave a comment. I also welcome criticism of all kinds, but don’t forget that the perfect can be the enemy of the good! My stance is that this proposed system isn’t perfect, but it’s superior to where we currently are.

Oh, also, I’m a fairly competent web developer/software engineer, and I potentially have room for some side-work. If we’re bottlenecked on developer time, I can probably fix that. ¯\_(ツ)_/¯
