Boo votes, Yay NPS

TL;DR

Many votes on LW are “boos” and “yays”, and consequently they aren’t very useful for determining what is worth reading. A modified version of a Net Promoter Score (NPS) on each post may provide a better metric for determining read worthiness.

Motivation

It’s come up a couple of times in my recent comments that I hold a theory that votes on LW, AF, and EAF are “boos” and “yays”. I have an idea about how we could do better, assuming the purpose of votes is not to jeer and cheer but to provide information about the post, specifically how much the post is worth reading. So I’m finally writing it up so others can, yes, boo or applaud my effort, but more importantly so we might discuss ways to improve the system. If you don’t like my proposal but agree we could do better than votes, I encourage you to write up your ideas and share them.

So, there are many things votes could be for, but I view votes as a solution to a problem, so what’s the problem votes are trying to solve? The number one question I want answered about every post is some version of “should I read this?”. There are subtly different ways to phrase this question: “is this worth engaging with?”, “should I read this carefully or just skim it?”, “is this worth my time and energy?”, etc.

I want a solution to this problem because when I come to LW/AF/EAF every day I want a reliable signal about what is worth spending my energy engaging with (I generally don’t want to just read, but also comment, discuss, understand, grow). Right now votes don’t provide this to me, as I’ll explain below, but they do provide other things. So keep in mind that my goal in this proposal is primarily to solve the particular problem of “should I read this?” and not the many other problems votes might be solutions to, like “how to deliver simple positive/negative feedback?”, “how can I express my pleasure or displeasure with a post?”, “how do we determine status within the forum?”, or “how do we increase platform engagement?”. I don’t ignore these other purposes, but I take them as secondary, and maybe there are other purposes I forgot to list and so forgot to take into account! The point is that I’m making a proposal that tries to solve a particular problem, and if you complain “but wait, it doesn’t solve this other problem” my response will be “yep, sure doesn’t”, so any discussion of this sort should be sure to explain why we should care about this other thing.

Okay, all that out of the way, let’s talk about votes, and then NPS.

Boo Votes

Up/​down voting is very simple and has a long history on LW, thanks to its presence on Reddit (from which, if I recall correctly, the original forum’s codebase was forked). It has a number of nice features, and LW has made them nicer:

  • everyone knows how it works

  • it lets you express yourself in two ways (unlike on Twitter where the only option is to vote up something, and a “downvote” requires writing your own tweet expressing dislike)

  • the aggregate votes on a post can be used to generate a user score (karma)

  • the user score can be used to meter access to various site features

  • votes are proportional to the status of users, as measured by karma

And of course lots of popular forums of all sorts use votes: Facebook, Twitter, Reddit, Tumblr. Even when votes aren’t present, something like voting is, in the form of “reacts”, where a person can choose from a list of named images/sounds/etc. to express something, and that something generally can include a simple vote (usually using a universally recognized vote react, like thumbs up/down); cf. Slack, Discord, most massively multiplayer games, Twitch. So it would seem that people like votes a lot and they are used to some effect in lots of places.

Unfortunately for our purposes of trying to figure out “should I read this?”, most of what votes are doing is only indirectly engaged with this question. Votes, especially if we think of them as a degenerate case of reacts, are more used to express an opinion on the content than to determine whether or not the content is worth reading, and when there are two voting options they tend to be rounded off to down = boo and up = yay. If you have any doubts about this, just spend more time on social media and let me know if you still disagree in general, i.e. you disagree that most people do this, not that you don’t do this or your small group of friends don’t do this.

On that point of using votes for something else, it’s tempting to think “hey, this is LW; we’re rational AF; we know better than to use votes as boos and yays”. To which I say “please, tell us more about how you’ve managed to create a community of perfectly rational agents”.

Joking aside, my point is that I’ve been on the receiving end of all kinds of voting patterns, so I’ve gotten a chance to see how people use votes on LW. Further, I’ve talked to people about my posts (either in comments or elsewhere) and in some cases explicitly learned how they voted on my posts and why, and it’s led me to a few conclusions about how people use votes here.

  • Sometimes votes are attempts to increase or decrease visibility of something, regardless of how someone feels about what’s in a post or comment.

  • Sometimes votes are a genuine expression of “you should/​shouldn’t read this”.

  • Most often votes say “yay, I like this” or “boo, I don’t like this” in response to one of several things:

    • like/​dislike the author

    • like/​dislike the subject matter

    • like/​dislike the content

    • like/​dislike the presentation

The result is what I consider a lot of voting anomalies from the perspective of trying to answer the question “should I read this?”. Some claims of things I’ve seen (I won’t link specific posts because I don’t want to risk applying shame to anyone for what happened to their post in the votes, and also it’s a lot of work to dig up all the examples that caused me to form these beliefs):

  • Low content/​quality posts voted highly because people like the author

  • High content/​quality posts voted lowly because people dislike the author

  • Posts voted down for heresy, regardless of quality

  • Posts voted up for applause lights, regardless of quality

My personal experience is mainly with writing heretical posts of good quality, such that I get more upvotes than downvotes but still a lot of downvotes (maybe 13 down and 23 up). That caused me to pay more attention to voting patterns, engage more with low score posts, and try to figure out just what was going on when posts I upvoted got low scores. What I learned led me to surmise what I’ve presented above.

So votes seem to be largely used to signal approval and disapproval of posts, which I suggest is only weakly correlated with telling me whether or not I should read a post. As a result I basically ignore votes and have to skim everything to figure out where the good stuff is. But what if we could do something better...?

Yay NPS

Net Promoter Score (NPS) is a simple metric many companies use to evaluate questions of customer satisfaction. To calculate it, people are asked “how likely are you to recommend our product or service to a friend or colleague?” and asked for a number from 0 to 10, 0 meaning “not likely at all” and 10 meaning “already have”. I really like NPS because it asks people to imagine recommending something and then asks them for something like a probability of how likely they are to do it, although I’ve never seen a version that did this explicitly.

Responses are then converted into a score by first segmenting respondents into detractors, passives, and promoters, and then taking percent promoters minus percent detractors. I find this metric to be of limited value, and prefer to engage directly with the full distribution of responses, but if you really needed a single scalar this is one way to get it.
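To make the arithmetic concrete, here is a minimal sketch of the calculation in Python. The segment cutoffs are not given above; the sketch assumes the ones conventionally used for NPS (0 to 6 detractor, 7 to 8 passive, 9 to 10 promoter). The function names are just illustrative.

```python
from collections import Counter


def segment(score: int) -> str:
    """Classify one 0-10 response using the conventional NPS cutoffs (assumed)."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score <= 6:
        return "detractor"
    if score <= 8:
        return "passive"
    return "promoter"


def net_promoter_score(scores: list[int]) -> float:
    """Percent promoters minus percent detractors, from -100 to 100."""
    if not scores:
        raise ValueError("need at least one response")
    counts = Counter(segment(s) for s in scores)
    return 100.0 * (counts["promoter"] - counts["detractor"]) / len(scores)


# Two promoters, two passives, one detractor out of five responses.
print(net_promoter_score([10, 9, 8, 7, 3]))
```

Note that passives only affect the score by enlarging the denominator, which is part of why the single number throws away so much of the distribution.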

What I imagine doing is asking people to score posts like this:

How likely are you to recommend a friend or colleague read this post?

|--0%-------------50%-------------100%--|

So they are asked the question and given a slider to mark their likelihood, which includes 100% because they may have already shared it (but there’s probably some UI work here to make it clear that 100% and 99% are drastically different responses).

Does this answer our question “should I read this?”? I think it may do a better job than votes, to be sure. Rather than an ambiguous vote, people are now at least being asked to respond directly to a question and give their response to it. Also, we could better use the distribution of responses to make reading decisions. For example, heretical posts might get bimodal distributions of scores, with clusters of strong detractors and strong promoters, and maybe you choose to read a post when it has at least n promoters, regardless of detractors. Maybe you choose to filter out posts with more than n detractors because you don’t like controversy or low quality content. Maybe you filter on NPS or mean or median or something else, or sort based on it. And rather than showing a simple number for its score like we do now, every post could show a box plot or some other suitable visualization of the distribution of responses.
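As a rough sketch of what those reader-side filters might look like, here is one way to summarize slider responses and apply the "at least n promoters" and "more than n detractors" rules. Everything here is hypothetical: the 90% and 60% cutoffs are my assumption (chosen to loosely mirror the usual NPS segments), and `PostSummary`, `summarize`, `worth_reading`, and `too_controversial` are made-up names, not anything that exists on LW.

```python
from dataclasses import dataclass
from statistics import median

# Assumed cutoffs on the 0-100% slider; not specified in the proposal.
PROMOTER_MIN = 90.0
DETRACTOR_MAX = 60.0


@dataclass
class PostSummary:
    promoters: int
    detractors: int
    median: float
    nps: float


def summarize(responses: list[float]) -> PostSummary:
    """Reduce a post's slider responses (0-100) to a few filterable numbers."""
    if not responses:
        raise ValueError("need at least one response")
    promoters = sum(1 for r in responses if r >= PROMOTER_MIN)
    detractors = sum(1 for r in responses if r <= DETRACTOR_MAX)
    nps = 100.0 * (promoters - detractors) / len(responses)
    return PostSummary(promoters, detractors, median(responses), nps)


def worth_reading(summary: PostSummary, min_promoters: int) -> bool:
    """Read when enough people would recommend it, regardless of detractors."""
    return summary.promoters >= min_promoters


def too_controversial(summary: PostSummary, max_detractors: int) -> bool:
    """Filter out when more than max_detractors people would not recommend it."""
    return summary.detractors > max_detractors


# A bimodal, "heretical" post: three strong detractors, three strong promoters.
heretical = summarize([0, 5, 10, 95, 98, 100])
print(heretical)
print(worth_reading(heretical, min_promoters=3))
print(too_controversial(heretical, max_detractors=2))
```

In this example the NPS comes out to zero and the median sits in the middle, so a single scalar makes the post look lukewarm, while the promoter and detractor counts show that it is actually polarizing. The same post passes one reader's filter and fails another's, which is the point of exposing the distribution.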

Now unfortunately NPS is more complicated than votes, so it may work against other problems people are trying to solve with votes. How does NPS help us deal with the problems addressed by karma? How do we prevent NPS from devolving into a binary where people always vote 100% to upvote and everything else is a downvote (the eBay/​Uber/​Lyft voting problem, where anything less than 5-stars is considered a downvote)? And do we measure comment quality with NPS, or keep votes there, or do something else?

I also don’t really expect the LW team to drop everything and implement NPS. Heck, if I were working on LW I probably wouldn’t jump all over this. My goal in writing this, maybe more than anything, is to get us thinking about how to better answer the question “should I read this?” and I wanted to provide at least one solution I’ve thought of and think could be better in some ways. I mostly think we could do more to give better signals of quality on LW and make them less distorted by and engaged with other signals people try to send with votes.

So, what do you think of the current state of votes? What problems do you want to solve on LW that votes or something else may be solutions to? And how would you improve votes or something else to solve those problems?