# Good Heart Week: Extending the Experiment

Yesterday we launched Good Heart Tokens, and said they could be exchanged for 1 USD each.

Today I’m here to tell you: this is actually happening and it will last a week. You will get a payout if you give us a PayPal/​ETH address or name a charity of your choosing.

Note that voting rings and fundraising are now out of scope; we will be removing and banning users who do that kind of thing starting now. More on this at the end of the post.

Also, we’re tentatively changing posts to be worth 4x the Good Heart Tokens of comments (Update: we decided on 3x instead, and it just went live, at around 4:20PM PT on April 2nd).

## Why is this experiment continuing?

Let me state the obvious: if this new system were to last for many months or years, I expect these financial rewards would change the site culture for the worse. It would select on pretty different motives for being here, and importantly select on different people who are doing the voting, and then the game would be up.

(Also I would spend a lot of my life catching people explicitly trying to game the system.)

However, while granting this, I suspect that in the short run giving LessWrong members and lurkers a stronger incentive than usual to write well-received stuff has the potential to be great for the site.

For instance, I think the effect yesterday on site regulars was pretty good. I’ll quote AprilSR who said:

I am not very good at directing my monkey brain, so it helped a lot that my System 1 really anticipated getting money from spending time on LessWrong today.

...There’s probably better systems than “literally give out $1/karma” but it’s surprisingly effective at motivating me in particular in ways that other things which have been tried very much aren’t.

I think lots of people wrote good stuff, much more than on a normal day. Personally my favorite thing that happened due to this yesterday was when people published a bunch of their drafts that had been sitting around, some of which I thought were excellent. I hope this will be a kick for many people to actually sit down and write that post they’ve had in their heads for a while. (I certainly don’t think money will be a motivator for all people, but I suspect it is true for enough that it will be worth it for us given the Lightcone Infrastructure team’s value of money.)

I’m really interested to find out what happens over a week. I have a hope it will be pretty good, and the Lightcone Infrastructure team has the resources that make the price worth it to us. So I invite you into this experiment with us :)

## Info and Rules

Here’s the basic info and rules:

• Date: Good Heart Tokens will continue to be accrued until EOD Thursday April 7th (Pacific Time). I do not expect to extend it beyond then.

• Scope: We are no longer continuing with “fun” uses of the karma system. Voting rings, fundraising posts, etc, are no longer within scope. Things like John Wentworth’s and Aphyer’s voting ring, and G Gordon Worley III’s Donation Lottery were both playful and fine uses of the system on April 1st, but from now on I’d like to ask these to stop.

• Moderation: We’ll bring mod powers against accounts that are abusing the system. We’ll also do a pass over the votes at the end of the week to check for any suspicious behavior (while aiming to minimize any deanonymization).

• Eligibility: LW mods and employees of the Center for Applied Rationality are not eligible for prizes.

• Votes: Reminder that only votes from pre-existing accounts are turned into Good Heart Tokens. (But new accounts can still earn tokens!) And of course self-votes are not counted.

• Cap Change: We’re lifting the 600 token cap to 1000. (If people start getting to 1000, we will consider raising it further, but no promises.)

• Weight Change: We’re tentatively changing it so that votes on posts are now worth 4x votes on comments. (Update: we decided on 3x instead, and it just went live, at around 4:20PM PT on April 2nd.)

• Getting Money: To receive your funds, please log in and enter your payment info at lesswrong.com/payments/account. The minimum amount you can get is $25.

• Catch-all: From some perspectives this is a crazy experiment, so I want to acknowledge up-front that if something pretty bad and unexpected happens we’ll do what seems best to us. Nothing is certain, and there are some worlds where we don’t end up paying out things that some of you had hoped. We’ll adapt as we go.
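The rules above can be summarized in a toy model (a sketch only: the 3x post multiplier, the 1000-token cap, and the $25 minimum are taken from this post, but the function name and the exact combination logic are my assumptions, not the site’s actual code):

```python
def good_heart_payout(post_karma: int, comment_karma: int,
                      post_multiplier: int = 3,
                      cap: int = 1000,
                      minimum: int = 25) -> int:
    """Toy model of the Good Heart payout rules described above.

    Inputs are karma from eligible votes only (votes from pre-existing
    accounts, self-votes excluded). Returns the dollar payout.
    """
    tokens = post_multiplier * post_karma + comment_karma
    tokens = min(tokens, cap)  # cap was lifted from 600 to 1000
    # Amounts under the $25 minimum are not paid out.
    return tokens if tokens >= minimum else 0
```

For example, under this model 100 post karma and 50 comment karma come to 3·100 + 50 = 350 tokens, i.e. $350, while 5 and 5 would fall below the $25 minimum and pay nothing.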

Go forth and write excellent posts and comments!

• Add me to the list of people for whom the incentives go the exact other way: I see this obviously-April-fool thing being done with LW karma, I see it’s even called “Goodhart”, and it all makes me feel rather ugh about posting or commenting. (This isn’t about the money for me, which I see it is for a couple of other people on Team Disincentive; I thought yesterday that more likely than not the bit about paying out money was part of the joke, and that didn’t stop me feeling ugh about the whole thing.)

To be clear, this doesn’t mean I think it was a bad April fool, and it doesn’t mean I think continuing it a few days would have bad outcomes overall. Almost all of what I do is commenting rather than posting, and while I think my comments are generally pretty good they’re never game-changing; if the tradeoff is that I comment a lot less and someone else posts a bit more then it’s probably a good tradeoff.

• Yeah, to be clear, I would also feel pretty terrible about the site if this is the kind of thing that would happen normally. But I feel like April Fool’s is a good time to gather data on some more out-there ideas, and my guess is a short-lived experiment is unlikely to destroy the whole site, or leave too much lasting damage.

• In case there’s any doubt, I repeat: I’m not saying this was a bad idea, either as an April fool or as a thing to continue for a few more days.

• Since it turns out that there is a bit of real money on the line here, I am consciously attempting to go against that ugh-field and be reasonably active on LW while the experiment is ongoing. It is possible that the net effect is that I am commenting more. (I don’t think I have anything I’ve contemplated posting that’s little enough effort to get done during the week.)

• I cross-posted two posts from the EA forum that were written and published before the start of the experiment. Let me know if that’s out of scope and I can take them down or you can make them not count.

• I think this is great—the monetary incentive yesterday inspired me to write a new post and post a draft from a year ago, and tonight inspired me to write a post I’ve had in my head for months. (In the spirit of Goodhart’s Law, I’m waiting till 2:01am CST to post it, so that it will show up in the list of Sunday posts). Cheers!

• I am here from the future with an urgent message: this incentivized a lot of posts that were kind of shit.

• If it was urgent, why would you come back and say that at the end of the week rather than the beginning? There’s not much we can do about it now.

• I mean, he couldn’t really have been from the future then, could he?

• snort

• I do agree with you. What would have been a better incentive, or do you think the prior system was better?

Personally, it actually motivated me to be a bit more active and finish my post. But I have also noticed a bit of “farming” for points (which was very much a consideration I’m sure, hence “good heart token”).

I think the reason it appealed to me was that the feedback mechanism was tangible and (somewhat) immediate. Contrast that with, say, pure upvotes, which feel non-impactful to me.

I think an incentive is good, but one that is less than pure dollar values and more than ego-filling-warm-fuzzy-feeling upvotes.

• Is there a way I can completely opt out of this, such that I do not have to concern myself as to what precisely is or is not considered a voting ring /​ etc?

To be clear:

Moderation: We’ll bring mod powers against accounts that are abusing the system. We’ll also do a pass over the votes at the end of the week to check for any suspicious behavior (while aiming to minimize any deanonymization).

This is a strong negative for me, to the point I am considering leaving the site as a result.

• I appreciated your comment on my post earlier today! Don’t leave!

• (Responding to your “To be clear” edit.)

I see.

Insofar as your point is about deanonymization, I’ll say that so far in the LW team we’ve tried hard to not share data about identities within the team and definitely not outside of it. When I wrote that sentence I was primarily meaning “while aiming to minimize any deanonymization within the team” e.g. if we have to, one person checks an identity and doesn’t tell it to the other team members. I almost didn’t even think about public deanonymization as a possible outcome. When I said “pass over the votes” I’m primarily expecting us to look at userIDs which are long number/​digit strings and avoid looking at actual usernames.

I only expect us to look into identities if something obvious and egregious is happening. I’m not sure what to say other than this is how it’s kind of been most of the time (e.g. this has happened in the past when we’ve looked into sockpuppets) and I don’t think you need to worry about this if you’re not trying to game the system and you’re not actively trying to exchange votes.

Same goes for other mod powers (e.g. bans), I think it’s very unlikely you’ll be affected by these things this week unless you’re consciously engaging in vote trading.

(I’ve also opened a PM chat with you if there’s things you’d like to say/​ask privately.)

• This is a strong negative for me, to the point I am considering leaving the site as a result.

Just for the record, we generally monitor anonymized voting activity and do very rare deanonymizing spot-checks if things look suspicious, since LessWrong has a long history of having to deal with people abusing the voting system in one way or another. I don’t think the above really implies anything out of the ordinary, so if this is an issue, maybe it’s been an issue for a long time?

• If I had to summarize: Good Heart normalizes using this for ‘trivial’ purposes to an extent that I am uncomfortable with.

• Does it? I don’t think it’s a trivial concern if someone is trying to just take hundreds or potentially thousands of dollars. That’s the sort of concern that I think is pretty reasonable for a LW mod to check anonymized (and if it’s looking pretty bad some deanonymized) voting records. As a datapoint I expect financial companies like Stripe would do this stuff for much lower levels of fraud.

A nearby argument you could make is “having a week where you substantially increase the reward for attacking the site is not worth it because the measures you’ll need to use to defend the site will infringe upon users’ privacy”. To which I say this is a cost-benefit calculation, I think the value is quite high, and I think it’s quite likely that nobody will try any funny business at all and nobody’s privacy will be infringed upon by the mod team.

(Your discomfort is noted, and I’ll certainly weigh it and others’ discomfort when (a) reflecting on whether this experiment was net valuable and (b) when thinking about running other experiments in the future.)

• Ah. Here might be some of the issue.

Given that this was introduced on April 1st, I have a strong prior that it was an April Fool’s joke.

If it was, you’re sending a signal that you’re willing to dox people for an April Fool’s joke.

If it wasn’t, you picked a very unfortunate time to do something serious.

• I think it’s pretty reasonable to choose to do something a little out-there /​ funny on April Fool’s, even if there are additional more serious reasons to do it.

• I’d argue the exact opposite.

Bayesian reasoning means that April Fool’s is by far the worst day of the year to do something that you wish to be taken as not purely a joke, especially something that is also a little out there /​ funny.

• I think having a Schelling day for trying weird stuff is good, and April Fool’s day seems fine. I don’t have nearly as strong a feeling as you seem to that April Fool’s jokes are never partially serious.

• When you say that “Bayesian reasoning means that April Fool’s is by far the worst day of the year [for an experiment like this]”, what do you mean? I expect you mean something relating to reasoning about intentions during April Fool’s (and that this lack of clarity amongst commenters is a negative), but the specifics are unclear to me. Your more expansive post above details some of the problems you have with this experiment, but doesn’t relate back to Bayes in any way I can identify.

• The probability that it’s just a joke is higher on April Fool’s. The ratio

P(fake | posted on April 1st) / P(fake | posted on any other day)

is pretty large (note that “fake” means “not(real)”, which makes this a Bayes ratio).
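The size of such a ratio can be made concrete with toy numbers (both probabilities below are invented for illustration; they are not the commenter’s actual estimates):

```python
# Toy illustration of the Bayes-ratio argument: an out-there announcement
# made on April 1st is far more likely to be a joke than the same
# announcement made on any other day, so observing the April 1st date is
# a strong update toward "this is fake".
p_fake_given_april_1 = 0.90    # invented: most such April 1st posts are jokes
p_fake_given_other_day = 0.05  # invented: on a normal day, few are

likelihood_ratio = p_fake_given_april_1 / p_fake_given_other_day
print(likelihood_ratio)  # roughly 18 with these toy numbers
```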

• I see! You thought it was just a joke. That makes way more sense.

Still I don’t quite get this. I don’t get why serious and playful are supposed to be separate magisteria, never to interact. Like, if someone cheats a lot when we play board games, then I think of them as more likely to be a cheater in real life too. You could say “why would you connect the playful and the serious” and I’d be like “they’re the same person, this is how they think, their character comes across when they play”. Similarly, I think there’s something silly/funny about making good heart tokens and paying for them on April First. And yet, if someone tries to steal them, I will think of that as stealing.

But yeah, noted that you assumed it was a joke. Not the first time this has happened to me.

• You could say “why would you connect the playful and the serious” and I’d be like “they’re the same person, this is how they think, their character comes across when they play”.

This feels close to a crux to me. Compare: if you were in a theater troupe, and someone preferred to play malicious characters, would you make the same judgment?

So, it’s not a question of “playful” versus “serious” attitudes, but of “bounded by fiction” versus “executed in reality”. The former is allowed to leak into the latter in ways that are firmly on the side of nondestructive, so optional money handouts in themselves don’t result in recoil. But when that unipolar filter is breached, such as when flip-side consequences like increased moderator scrutiny also arrive in reality, not having a clear barrier where you’ve applied the same serious consideration that the real action would receive feels like introducing something adverse under false pretenses. (There is some exception made here for psychological consequences of e.g. satire.)

The modern April Fools’ tradition as I have usually interpreted it implies that otherwise egregious-seeming things done on April Fools’ Day are expected to be primarily fiction, with something like the aforementioned unipolar liminality to them.

Similarly, I think there’s something silly/​funny about making good heart tokens and paying for them on April First. And yet, if someone tries to steal them, I will think of that as stealing.

Combining this with the above, I would predict TLW to be much less disturbed by a statement of “for the purpose of Good Heart tokens, we will err on the broad side in terms of non-intrusively detecting exploitative behavior and disallowing monetary redemption of tokens accumulated in such a way, but for all other moderation purposes, the level of scrutiny applied will remain as it was”. That would limit any increase in negative consequences to canceling the positive individual consequences “leaking out of” the experiment.

The other and arguably more important half of things here is that the higher-consequence action has been overlaid onto an existing habitual action in an invasive way. If you were playing a board game, moving resource tokens to your area contrary to the rules of the game might be considered antisocial cheating in the real world. However, if the host suddenly announced that the tokens in the game would be cashed out in currency and that stealing them would be considered equivalent to stealing money from their purse, while the game were ongoing, I would expect some people to get up and leave, even if they weren’t intending to cheat, because the tradeoff parameters around other “noise” risks have suddenly been pulled out from underneath them. This is as distinct from e.g. consciously entering a tournament where you know there will be real-money prizes, and it’s congruent with TLW’s initial question about opting out.

For my part, I’m not particularly worried (edit: on a personal level), but I do find it confusing that I didn’t see an explicit rule for which votes would be part of this experiment and which wouldn’t. My best guess is that it applies when both the execution of the vote and the creation of its target fall within the experiment period; is that right?

• Compare: if you were in a theater troupe, and someone preferred to play malicious characters, would you make the same judgment?

So, it’s not a question of “playful” versus “serious” attitudes, but of “bounded by fiction” versus “executed in reality”.

It’s not anything like a 1:1 relationship, but I do indeed infer some information of that sort. I think people on-average play roles in acting that are “a part of them”. It’s easy to play a character when you can empathize with them.

There are people I know who like to wear black and play evil/​trollish roles in video games. When I talk to them about their actual plans in life regarding work and friendship, they come up with similarly trollish and (playfully) evil strategies. It’s another extension of themselves. In contrast I think sometimes people let their shadows play the roles that are the opposite of who they play in life, and that’s also information about who they are, but it is inverted.

Again, this isn’t a rule and there’s massive swathes of exceptions, but I wouldn’t say “I don’t get much information about a person’s social and ethical qualities from what roles they like to play in contexts that are bounded-by-fiction”.

However, if the host suddenly announced that the tokens in the game would be cashed out in currency and that stealing them would be considered equivalent to stealing money from their purse, while the game were ongoing, I would expect some people to get up and leave, even if they weren’t intending to cheat, because the tradeoff parameters around other “noise” risks have suddenly been pulled out from underneath them.

Right. Good analogy.

I definitely updated a bunch due to TLW explaining that this noise is sufficiently serious for them to not want to be on the site. It seems like they’ve been treating their site participation more seriously than I think the median regular site-user does. When I thought about this game setup during its creation I thought a lot more about “most” users rather than the users on the tails.

Like, I didn’t think “some users will find this noisy relationship to things-related-to-deanonymization to be very threatening and consider leaving the site but I’ll do it anyway”, I thought “most users will think it’s fun or they’ll think it’s silly/​irritating but just for a week, and be done with it afterward”. Which was an inaccurate prediction! TLW giving feedback rather than staying silent is personally appreciated.

It’s plausible to me that users like TLW would find it valuable to know more about how much I value anonymity and pseudonymity online.

• For example about around two years ago I dropped everything for a couple days to make DontDoxScottAlexander.com with Jacob Lagerros, to help coordinate a coalition of people to pressure the NYT to have better policies against doxing (in that case and generally).

• When a LW user asked if I would vouch for their good standing when they wanted to write a post about a local organization where they were concerned about inappropriate retaliation, I immediately said yes (before knowing the topic of the post or why they were asking) and I did so, even while I later received a lot of pressure to not do this, and ended up myself with a bunch of criticisms of the post.

• And just last week I used my role as an admin to quickly undo the doxing of a LW user who I (correctly) suspected did not wish to be deanonymized. (I did that 5 mins after the comment was originally posted.)

After doing the last one I texted my friend saying it’s kind of stressful to make those mod calls within a couple minutes close to midnight, and that there’s lots of reasons why people might think it mod overreach (e.g. I edited someone else’s comment which feels kind of dirty to me), but I think it’s kind of crucial to protect pseudonymous identities on the internet.

(Obvious sentences that I’m saying to add redundancy: this doesn’t mean I didn’t make a mistake in this instance, and it doesn’t mean that your and TLW’s critiques aren’t true.)

• Congratulations[1]. You have managed to describe my position substantially more eloquently and accurately than I could do so myself. I find myself scared and slightly in awe.

Combining this with the above, I would predict TLW to be much less disturbed by a statement of “for the purpose of Good Heart tokens, we will err on the broad side in terms of non-intrusively detecting exploitative behavior and disallowing monetary redemption of tokens accumulated in such a way, but for all other moderation purposes, the level of scrutiny applied will remain as it was”.

Correct, even to the point of correctly predicting “much less” but not zero.

The other and arguably more important half of things here is that the higher-consequence action has been overlaid onto an existing habitual action in an invasive way. If you were playing a board game, moving resource tokens to your area contrary to the rules of the game might be considered antisocial cheating in the real world. However, if the host suddenly announced that the tokens in the game would be cashed out in currency and that stealing them would be considered equivalent to stealing money from their purse, while the game were ongoing, I would expect some people to get up and leave, even if they weren’t intending to cheat, because the tradeoff parameters around other “noise” risks have suddenly been pulled out from underneath them.

This is a very good analogy. One other implication: it also likely results in consequences for future games with said host, not just the current one. The game has changed.

=*=*=*=

I ended up walking away from LessWrong for the (remaining) duration of Good Heart Week; I am debating whether I should delete my account and walk away permanently, or whether I should “just” operate under the assumption[2] that all information I post on this site can and will later be used adversarially against me[3][4] (which includes, but is not limited to, not posting controversial opinions in general).

I was initially leaning toward the former; I think I will do the latter.

1. ^

To be clear, because text on the internet can easily be misinterpreted: this is intended to be a strong compliment.

2. ^

To be clear: as in “for the purposes of bounding risk”, not as in “I believe this has a high probability of happening”.

3. ^

Which is far more restrictive than had I been planning for this from the start.

4. ^

This is my default assumption on most sites; I was operating under the (erroneous) assumption that a site whose main distinguishing feature was supposedly the pursuit of rationality wouldn’t go down this path[5].

5. ^

You can easily get strategic-voting-like suboptimal outcomes, for one.

• I’m sorry you’re considering leaving the site or restraining what content you post. I wish it were otherwise. Even though you’re a relatively new writer, I like your contributions, and I think it would likely be good for the site for you to contribute more over the coming years.

As perhaps a last note for now, I’ll point to the past events listed at the end of this comment as hopefully helpful for you to have a full-picture of how at least I think about anonymity on the site.

• Bayesian reasoning means that April Fool’s is by far the worst day of the year to do something that you wish to be taken as not purely a joke, especially something that is playful.

Given that the phrasing of your reply implies[1] that this isn’t just a joke, I have additional concerns:

1. Calling anything real-money-related ‘playful’ is a yellow flag, and being confused as to why anyone might consider this a yellow flag is a red flag[2].

2. You are discouraging anonymous[3] participants compared to non-anonymous participants, due to the difficulty in anonymously transferring money. This disincentivizes rational discussion.

3. You are further discouraging throwaways and anonymous participants compared to non-anonymous participants, due to the threshold for withdrawals. This also disincentivizes rational discussion.

4. You are yet further discouraging anonymous participants compared to non-anonymous participants, due to the signaling that you are willing to dox people. This too disincentivizes rational discussion.

5. This unilaterally moves voting from a signal of ‘do I wish for this to be more visible[4]’ to ‘do I wish the person who made this comment to get more money’. These are not the same thing. In particular, this discourages upvoting valid-and-insightful comments by participants that I believe are doing more harm than good on net, and encourages upvoting invalid-or-uninsightful comments by participants that I believe are doing more good than harm on net; both of these disincentivize rational discussion[5].

6. This seriously disincentivizes ‘risky’ comments by accounts that have a good reputation. This can easily result in strategic-voting-like suboptimal outcomes.

7. Doing this and then calling them Good Heart tokens implies that you explicitly brought up the connection to Goodhart’s Law and then decided to push for it anyway.

8. Likely more, but I am too frustrated by this to continue right now.

1. ^

But doesn’t explicitly state, I note.

2. ^

I seriously hope you understand why. If you don’t, I have to seriously re-examine this forum[6]. I might note that the main defining feature of ‘play’ is that, unlike most things which are externally motivated, play is intrinsically motivated, whereas the classic example of an extrinsic motivator is… money.

3. ^

I am aware this forum isn’t particularly anonymous. And yes, I consider it a strike against it. And yes, there are valid[7] points I have self-censored as a result.

4. ^

Of course, you can argue about this precise definition too. The point is, these two definitions are not the same.

5. ^

Even the base ‘number goes up’ of standard upvotes/​downvotes is bad enough, with discussions about the problems and possibilities as to how to mitigate this on this very site.

6. ^

A forum whose main distinguishing feature is supposedly the pursuit of rationality[8], where one of its few[9] admins doesn’t get something this basic and was able to make this much of a change without consulting others to see what they were missing, is, uh, not great.

7. ^

At least to the best of my knowledge. Obviously I haven’t been able to check by posting said items on this forum.

8. ^

“To that end, LessWrong is a place to 1) develop and train rationality, and 2) apply one’s rationality to real-world problems.”, from https://www.lesswrong.com/posts/bJ2haLkcGeLtTWaD5/welcome-to-lesswrong

9. ^

I don’t actually know offhand how many.

• This seriously disincentivizes ‘risky’ comments by accounts that have a good reputation. This can easily result in strategic-voting-like suboptimal outcomes.

(I am not part of ‘the team’ btw.)

• I think I understand that. I do think it’s pretty unlikely this is some kind of step towards a broader trivialization of looking at voting data, but I do understand the concern.

• I do think it’s pretty unlikely this is some kind of step towards a broader trivialization of looking at voting data, but I do understand the concern.

Counterpoint, in this very comment section there is this comment:

On the subject of “maybe we should tolerate a little bit of Goodharting in the name of encouraging people to post”, the EA Forum allows authors to view readership statistics for their posts. I think this is a cool feature and it would be nice if LessWrong also adopted it.

...and clicking through to the linked post[1] it’s also talking about e.g. future potential extensions to split between logged-in and non-logged-in users.

Admittedly, this is not specifically voting data, and this step is still a fairly aggregated statistic, but it is a step in the broader trivialization of looking at user data.

1. ^

...which I was actually somewhat loath to do.

• For the record, I think it’s the wrong call for the EA Forum to show that data to users. I know from looking at that data on LW that it is a terrible and distracting proxy for what the actual great content is on the site. I think every time we’ve done the annual LW review, the most-viewed post that year has not passed review. One of the top 20 most viewed LW posts of all time is a bad joke about having an orange for a head. It’s a spiky metric and everyone I know whose job depends on views/​clicks finds it incredibly stressful. Karma is a way better metric for what’s valued on-site, and passing annual review is an even better metric.

• If you vote as-usual and don’t think about it, I do not expect you will end up explicitly trading votes with other people.

But no, there is no way to opt-out of Good Heart Tokens this week.

• If you vote as-usual and don’t think about it, I do not expect you will end up explicitly trading votes with other people.

To be clear: I don’t expect that either. Nevertheless. You’re sending a signal that you’re willing to dox people for an April Fool’s joke.

• As an aside: I suspect that “If you vote as-usual and don’t think about it, I do not expect you will end up explicitly trading votes with other people” is less true for me than “usual”.

One of the things I tend to end up doing to discover content on a site like this is to flip through the user pages of people with whom I have had interesting comment chains.

If I ever end up doing so to someone who does the same, this would look very much like trading votes (using an external or implicit collusion mechanism).

• Will the 4x multiplier for posts be retroactively applied to the posts which went up yesterday?

• It might. It depends a bit on how much of a pain it will be for me to go and update the data, which I’ll find out today.

• I’m glad the experiment will be getting a few more days. The extension will increase the quality of my posts; I was tempted to publish some more of my drafts before they were ready, but now I have an incentive to spend a few days working on them. I’m also going to be a little less worried about them getting swamped by all the other good content, as I feel happened a bit to many posts today. I just hope you guys aren’t running out of money!

• Every post that is mostly drafted up by the end of this week but published only later after minor changes should still be eligible.

• Yeah, I suppose that’s one disadvantage. Like it makes me less likely to use the feedback feature as that would delay publishing.

• As a lurker, I failed to understand this system in a way that led to me completely ignoring it (I probably would have engaged more with LW this week had I understood, having noticed now it feels too late to bother), so I feel like I should document what went wrong for me.

I read several front-page posts about the system but did not see this one until today. The posts I read were having fun with it rather than focusing on communication, plus the whole thing was obviously an extended April Fool’s joke so I managed to come away with a host of misconceptions including total ignorance of the core “no really, karma equals actual money for you” feature. I assumed that if it was serious people would be trying a lot harder to communicate the incentives to people (compare announcements of LW bounties, which I routinely manage to hear about even in periods where I’ve fallen out of the habit of checking this website).

On top of “karma equals money” being fundamentally implausible, an April 1st joke named Good Heart Tokens feels like it was designed to not be taken seriously. If the system was meant to incentivize posts from lurkers, more effort could have been put into making the incentives clear.

Edit: Making this comment I double-checked some things and thought I came to a fully correct understanding of the system, but upon hitting submit I became confused again. This post says that self-votes don’t count, but my fresh comment displays as having 1 token. There is a token counter on the user profile page, but as far as I can tell from looking at the pages of a few random users, that counter is tracking neither karma nor any calculation I can imagine representing token count; I have no idea what it’s doing.

• As a lurker, I failed to understand this system in a way that led to me completely ignoring it (I probably would have engaged more with LW this week had I understood, having noticed now it feels too late to bother), so I feel like I should document what went wrong for me.

Haha, same. I don’t blame this site though, I blame the deluge of April Fool’s content every year for training my brain to aggressively filter April Fool’s content. It’s just like banner ads, I don’t even see them now.

So I FINALLY just read this post because I was curious why the April Fool’s thing was still on the site. All good though, I ended up with almost 50 hearts out of nowhere. Honestly it feels great because I had no idea they were coming. That’s a lot of change to find in your couch!

• I assumed that (1) the numbers attached to posts and comments include self-votes, (2) the total on the profile page is the sum of these but excluding the self-votes, or maybe a potentially-slightly-out-of-date version of that depending on how the code is built, and (3) the real-money payout is the sum of all the numbers excluding the self-votes at the end of the week.

Are the numbers you found when looking at random users’ pages inconsistent with that?

• (gjm’s take is correct)

• Looking at more random users, I think tokens earned via posting are being undercounted somehow. Users with only comments display as having exactly the amount of tokens I would expect from total karma on eligible posts minus self-votes, but users with posts made after April 2 (to avoid complications from a changing post-value formula) consistently have less than the “comment votes plus 3x post votes, not counting self-votes” formula would predict. For instance, Zvi has two posts (currently 52 and 68 karma) and zero comments in the last week. With strength-2 self-votes, (52-2+68-2)*3=348 expected tokens, which is a significant mismatch from his displayed 302. It doesn’t seem to be out of date, since his displayed tokens change instantly in response to me voting on the posts. Is something going wrong, or is there some weird special-case way of voting on posts that doesn’t get immediately reflected on the user page?

• Posts get self-strong-upvoted by default, and strong upvote strength is modified by user karma. Zvi is a high karma user so his strong votes are like +10. Plug that number in instead of 2 and that accounts for the discrepancy you noticed.

• Doh. I forget how much faster strong upvotes scaled with user karma, that resolves all my confusion.
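The formula being reverse-engineered in this thread can be sketched in a few lines. This is a hypothetical reconstruction from the comments above, not the actual LessWrong implementation; the function name and the (karma, self-vote) tuples are illustrative.

```python
# Hypothetical reconstruction of the Good Heart Token formula pieced
# together in this thread: (displayed karma minus the author's
# self-vote), with a 3x multiplier on posts.

def good_heart_tokens(posts, comments, post_multiplier=3):
    """posts and comments are lists of (displayed_karma, self_vote_strength)."""
    post_tokens = sum((karma - self_vote) * post_multiplier
                      for karma, self_vote in posts)
    comment_tokens = sum(karma - self_vote
                         for karma, self_vote in comments)
    return post_tokens + comment_tokens

# Zvi's case from the thread: two posts at 52 and 68 displayed karma.
# A default +2 self-vote predicts 348 tokens, but a high-karma user's
# ~+10 strong self-vote predicts 300, close to the displayed 302.
assert good_heart_tokens([(52, 2), (68, 2)], []) == 348
assert good_heart_tokens([(52, 10), (68, 10)], []) == 300
```

The remaining few-token gap is consistent with Zvi’s strong-vote strength being slightly below +10.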

• This comment indicates that early posts might not have been given the full multiplier, but both of Zvi’s posts went up after this.

• I thought maybe deleted posts/​comments or unpublished drafts could explain the difference. I understand both of those categories contribute to karma/​hearts, but they don’t show up on the user’s page. However, these would have to have been significantly downvoted to match the direction of the discrepancy.

• The numbers attached to posts and comments seem to be a straightforward reskin of karma: they include self-votes and do not include the 3x multiplier for posts. The token counter in user profiles seems to update instantly (I tried voting up and down on posts and comments, then refreshing the user’s page to test) but undercounts in ways I don’t understand. For instance, this random user currently displays as 207 tokens (edit: tokens, not karma, doh) based on a post with 68 karma and about a dozen comments with net karma-minus-selfvotes of ~15. I can tell it’s up to date because it went up by 3 when I upvoted his post, but it seems like it ought to be ~216 (post karma times three plus comment karma) and I can’t explain the missing tokens; several of the random users I looked at displayed this sort of obvious undercounting.

• I am personally overwhelmed by the amount of brilliant content being published these last few days. I can barely keep up with all these fantastic articles.

• I have already made a post to LW that I would not have otherwise, and expect to do 1-2 more this week as a result of the experiment. Part of it for me is a psychological forcing function: if I did not have a particular time at which to do this, well, now I do. I will be interested in seeing if there’s an increase or decrease April 8-22 relative to the pre-experiment trend: I’m not very confident in the direction, but I do have an expectation of higher variance relative to an average week.

• I believe no-one has stated what to me seems rather obvious:

There’s a difference between not winning the lottery and almost winning the lottery. If yesterday you had told me that I would not be making $200 via LW comments, this wouldn’t have bothered me, but now that I perceive an opportunity to win $200 via LW comments, it does bother me if I don’t succeed. I’m not saying this makes the experiment a bad idea, but it is psychologically unnerving.

• On the subject of “maybe we should tolerate a little bit of Goodharting in the name of encouraging people to post”, the EA Forum allows authors to view readership statistics for their posts. I think this is a cool feature and it would be nice if LessWrong also adopted it.

Writing on LessWrong, I find myself missing the feature for a couple reasons:

• While the Good Heart Project continues, clearly the number of posts being published is higher than average. But are there also a higher than average number of readers? Knowing if I’m getting more or fewer readers than average during Good Heart Project would definitely influence my behavior (probably moreso than the money—the post I wrote in a flurry yesterday was mostly inspired by the fact that I’d secretly wanted to talk about valuing karma for a while, but it felt too taboo/​off-topic for an ordinary EA Forum post, and the Good Heart Project created the perfect opportunity). Seeing post analytics might help me assess whether there are more or fewer readers per post than usual.

• The feature has helped improve my writing—seeing how many people open a page but only stay for a short while, I was encouraged to write more concise posts and always put summaries at the top, to be more useful for readers.

• It’s also been interesting to see which of my posts have “legs” and keep getting revisited later on—this post about EA charity gift cards versus this one analyzing a Peter Thiel essay both got around the same number of upvotes, but the Thiel post still draws a couple of hits every day while attention to the gift-card essay dropped off sharply when it left the front page. I feel like being able to see this helps steer me towards topics that might have more long-term value.

• Oddly, even though I read LessWrong as often as I read the EA Forum, I have a poorer sense for whether an idea of mine “belongs” on LessWrong—What’s too political for the front page? Is my short story ‘The Toba Supervolcanic Eruption’ too EA to be worth cross-posting here? What topics are too casual (rambling off some thoughts about evolution and psychology) vs too technical (talking about some aerospace engineering stuff)? Seeing post analytics in addition to upvotes might help me get a better sense of this.

• I think this is a cool feature and it would be nice if LessWrong also adopted it.

As a counterpoint, knowing that the EA forums expose this significantly disincentivizes me, at the very least, from ever looking at or recommending the EA forums.

There is no way to track these statistics in a way that isn’t either inaccurate in adversarial scenarios or leaks far too much user information, or both. And there tends to be a certain cat-and-mouse game:

1. Initially there’s something absolutely basic like a hit counter.

2. Someone writes a script that hammers a page from a single IP, to boost the seeming engagement.

3. A set cardinality estimator is added to e.g. filter by only a single hit per IP.

4. Someone writes a script that hammers a page from many IPs, to boost the seeming engagement.

5. The hit counter is modified to e.g. only work if Javascript is enabled.

6. The script is ported to use a JS interpreter, or to directly poke the backend.

7. The hit counter is modified to e.g. also fingerprint what browser is being used.

8. The script is ported to use headless Chrome or somesuch.

9. The hit counter is modified to e.g. only capture views from logged-in visitors.

10. The script is modified to automatically create accounts and use them.

11. Account creation is modified to include a CAPTCHA or similar.

12. The script is modified to include a tool to bypass CAPTCHAs[1]

13. etc.

Note that every one of these back-and-forths (a) also drops or distorts data, or otherwise makes life harder, for legitimate users, and (b) leaks more and more information about visitors.

I would not have too much of a problem with readership statistics if the resulting entropy was explicitly calculated, and if the forum precommitted to not making future changes that continue the ratchet; without these I have serious concerns.

1. ^

Be it ‘feeding audio captchas to a speech-to-text program’, or ‘just use Mechanical Turk’.
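Step 3 in the escalation above can be sketched concretely. This is a minimal illustration, not production code: it deduplicates hits with an in-memory set keyed by IP, which defeats the single-IP script of step 2 but, as step 4 notes, not an attacker with many IPs (a real site might use a probabilistic cardinality estimator such as HyperLogLog to bound memory; the class and method names here are invented for the example).

```python
# Minimal IP-deduplicated hit counter: a single machine hammering the
# page only counts once (step 3 of the cat-and-mouse game above).

class HitCounter:
    def __init__(self):
        self.seen_ips = set()   # a real site would cap memory, e.g. HyperLogLog

    def record_hit(self, ip: str) -> None:
        self.seen_ips.add(ip)   # duplicate IPs are absorbed by the set

    @property
    def unique_hits(self) -> int:
        return len(self.seen_ips)

counter = HitCounter()
for _ in range(1000):                    # one IP hammering the page...
    counter.record_hit("203.0.113.7")
counter.record_hit("198.51.100.2")       # ...plus one genuine visitor
assert counter.unique_hits == 2          # step 4's many-IP attack would still inflate this
```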

• 2. Someone writes a script that hammers a page from a single IP, to boost the seeming engagement.

In the case of EA forum, the readership statistics have no consequences whatsoever. They’re not even publicly viewable. Why would anyone try to artificially inflate them?

Hmm, well, I guess we could imagine a scenario where someone works at an EA nonprofit, and wants to impress their boss, so they write a blog post, and artificially inflate the readership statistics, and then show their boss a printout of how many people have read their blog post. And then the boss goes to EA Forum mods and says “I want these statistics to be harder to fake”. But then I imagine the EA Forum mods would respond “Why should we spend our time doing that? This is your problem, not ours. You should come up with a less stupid way to judge your underlings.”

• I share much of the same failings to be incentivised to comment as some of the other commenters here¹, but at the same time I was indeed motivated to finish up the draft I was working on, which is a bit of a contradiction even to myself. One obvious difference to me is that I was already planning on publishing my draft, and I was never remotely concerned that it was motivated by misaligned incentives; if it ended up getting an ungodly number of votes, then it must have just been that good.

For smaller comments I am much more concerned that attention is a rate-limited resource, and so comments often only provide value if they are better than the noise, and I am so used to seeing upvoted comments, not so much here but definitely pervasively elsewhere, that are nothing but noise. I do like that there is an incentive structure in place, and I also like that there is a meta-incentive for voters to be more discerning with their votes lest they incentivise the wrong strategies, but I don’t like feeling that there is an incentive to just comment more, just above the quality threshold. The part of me that wants to steadfastly reject bad incentives would rather that didn’t happen.

¹which when put like that sounds like a blatant lie

• I’m already feeling a degree of adverse incentives. Getting money involved makes me more reticent to comment rather than more eager.

I don’t want to be perceived as grubbing for money, and I feel like there’s a risk of being treated harshly by a voting audience that are now on guard against (and seeking to massively downvote) people trying to game the system.

• If that’s a suboptimal equilibrium, maybe we should actively try to be in the other equilibrium instead, where people treat the situation more casually and aren’t so on the hunt for cheaters?

• To that end: lo, this is a comment that is at least in part motivated by my desire to grub for money. I casually acknowledge this, in the interest of making the whole situation feel more casual and socially low-stakes. Downvote, upvote, or ignore this comment as you please; I will continue to treat the situation casually regardless.

• Oh wait, this thing ended yesterday. Oh well! :P Hopefully the fact that I wrote the above comment will make the good equilibrium a bit more likely if we do a thing like this again.

• Is this meant to disincentivize downvoting, or is that accidental? Pinning a monetary value to votes makes me feel like downvoting unclear, inaccurate, or off-topic content is literally taking money away from someone.

And, half-jokingly: If a post gets a net negative number of votes, it implies that the author would be expected to pay the site.

• Incurring debt for negative votes is a hilarious image: “Fool! Your muddled, meandering post has damaged our community’s norm of high-quality discussion and polluted the precious epistemic commons of the LessWrong front page—now you must PAY for your transgression!!!”

• It really isn’t funny in the slightest.

It means a clique of users can ostracize someone with real-world consequences.

• It is funny from the perspective of a member of the clique, because if someone tries that kind of thing against a clique member, their friends can retaliate. It is deeply un-funny to an outsider, who can expect no such safety net. Humor is situational, and often extremely revealing about peoples’ underlying assumptions.

• Well, but in fact people don’t incur debt for negative votes, so there is no such clique. I feel like you’re saying “this joke is funny from the perspective of some of the people in the joke and not funny from the perspective of others of them”? And I feel like that might be true, but it doesn’t feel super relevant to whether the joke is funny outside of the joke?

My own reply to TLW would be something like: yes, if that happened it would have that effect, and that would be bad. And also, it’s a pretty funny-to-me idea! Putting those two sentences next to each other suggests there’s some relation between them, like your intent was to say “it’s not funny because if it was true, a clique of users...”. But that’s not how humor works in my brain, at least, and I’d be kinda surprised if it worked that way in yours.

• Well, but in fact people don’t incur debt for negative votes, so there is no such clique.

I was more thinking of “people who have a current positive balance can have said balance wiped” than I was actual debt, to be clear.

• This was an interesting experience, and I appreciate that you’re actively doing experiments with the site. With that being said, I sure am glad that it’s going to be over in a few hours :). I’m looking forward to the retrospective, though!

• Thx! My guess is shortly I’ll write a post saying “It’s over, you now have some time to add your financial details” and then in a few days I’ll write a retrospective.

One user on the leaderboard expressed a preference for me sending out money after the US April 18th tax deadline, so I may wait 10 days to do that (and to write the retro).

• Date: Good Heart Tokens will continue to be accrued until EOD Thursday April 7th (Pacific Time). I do not expect to extend it beyond then.

So are they only accrued during that period of time?

And are they exchangeable after the period of time is over?

• We’ll count up all the GHT during the week from April 1st to EOD April 7th and then hand those out, insofar as people submit paypal/​eth/​charity info. I’ll probably do an announcement after the week is up and give people time to submit their info, but you won’t be able to get more tokens after the week is up.

• Will the 4x multiplier for posts be permanent, when GH points become karma again? I’m not actually against this, but it should be clarified as intentional.

• The scoring on posts will stay as usual, it’s only the translation into GHT that will be 4xed. On the post it will say n, but your GoodHeartTokens will say 4n.

Basically it won’t last after the week.

• Why? The existing multiplier implies that you agree getting karma with posts is harder, so why shouldn’t it extend?

• Um, that’s a pretty reasonable point.

• I think there’s a justification. Let me try.

I claim there are two ways to think about things: “karma = incentive” and “karma = just deserts”. “Karma = incentive” means that insofar as people are trying to maximize their karma, it pushes LW in the direction we want it to go. “Karma = just deserts” means that insofar as people are earning karma, it is “fair” in terms of how much work they put in.

I think ≈1x is about the right multiplier for “karma = incentive”. I think ≈10x (or even more) is about the right multiplier for “karma = just deserts”.

For the former: having lots of good comments per post is super-important for a healthy community and intellectual growth, I would argue, even more so than having lots of good posts.

For the latter: clearly it takes way more work to get a post upvote than a comment upvote, and thus post-writers “deserve” more.

The status quo is 1x, i.e. the “karma = incentive” optimum. And that makes sense.

But if there’s real money on the line, there’s maybe some benefit in moving in the direction of “karma = just deserts”, at least a little bit. The 3x multiplier splits the difference, which seems reasonable.

But ultimately it depends on how much we care about perceived fairness. I think sticking with 1x, even this week, would have been totally reasonable, maybe even better.

Oh hey, here’s a different model /​ perspective: Suppose that LW users care about (1) karma, (2) fame. Of the two, (1) pushes people towards commenting, and (2) towards posting. In normal times, let’s suppose that we’re happy about the LW comment:post ratio. But this week, people care a lot more than normal about (1), relative to (2). So that would push the comment:post ratio way out of whack. Increasing the multiplier could help correct for that, I guess.

• What counts as an “employee of the Center for Applied Rationality”? I do various work for CFAR on a part-time or contract basis but haven’t worked there full-time for a while, does that make me ineligible?

• I’m referring to current full-time employees, not contractors, so you are eligible.

• “Today I’m here to tell you: this is actually happening and it will last a week. You will get a payout if you give us a PayPal/​ETH address or name a charity of your choosing.”

How do we give you the name of a charity? I only see fields to enter a PayPal and email address on the payment info page.

• Not the best UI, but if you just put the full name of the charity in the PayPal field, we’ll donate to them.

• Thanks! Do I still need to enter an email?

• It’s optional.

• I am looking forward to the results of this experiment. Will we notice an increase in smaller posts? Increase in comments? Increase or decrease in quality of posts? Increase or decrease in quality of comments?

• Are we changing from “payment sent every day at midnight” to “payment sent at end of week”?