It’s worth noting that a linear trend line fitted to the 2000-2019 data has R² = 0.07, so a constant prediction of 750k acres would only be marginally less accurate. (I think your graph excluding 2020 also excludes 2019, but the story doesn’t change much either way.)
It looks like up until 2016 everything was fairly constant and since then 3 out of 4 years have been bad.
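For anyone who wants to sanity-check this kind of comparison, here’s a minimal sketch of how it could be run (the acreage values below are synthetic placeholders, not the real 2000-2019 series, and numpy/scipy are assumed to be available):

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the 2000-2019 series (thousands of acres) --
# substitute the real numbers to reproduce the R^2 = 0.07 figure.
rng = np.random.default_rng(0)
years = np.arange(2000, 2020)
acres = 750 + 80 * rng.standard_normal(len(years))

# R^2 of a fitted linear trend line
fit = stats.linregress(years, acres)
r_squared = fit.rvalue ** 2

# Root-mean-square error of the trend line vs a flat 750k prediction
rmse_linear = np.sqrt(np.mean((acres - (fit.slope * years + fit.intercept)) ** 2))
rmse_constant = np.sqrt(np.mean((acres - 750) ** 2))

print(f"linear trend R^2 = {r_squared:.2f}")
print(f"RMSE: linear = {rmse_linear:.1f}, constant 750 = {rmse_constant:.1f}")
```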
I had the same confusion.
One of the key differences between the three games, I think, is whether communication beforehand helps (in single-shot games).
In PD communication doesn’t really help much, as there is little reason to trust what the other person says.
In SH communication should be able to solve your problem as S-S is optimal for both players.
In BotS, communication which results in agreement can at least be trusted, as co-ordinating is optimal for both players. Choosing which option to co-ordinate on is another matter.
(assuming you’ve included the pleasure of spiting the other person etc. in the payoff matrix)
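To make the contrast concrete, here’s a small sketch with illustrative payoff numbers (mine, not from the post; only the orderings matter) and a brute-force check for pure-strategy Nash equilibria:

```python
# Illustrative payoff matrices: (row player's payoff, column player's payoff).
games = {
    "Prisoner's Dilemma": {   # moves: Cooperate, Defect
        ("C", "C"): (3, 3), ("C", "D"): (0, 5),
        ("D", "C"): (5, 0), ("D", "D"): (1, 1),
    },
    "Stag Hunt": {            # moves: Stag, Hare
        ("S", "S"): (4, 4), ("S", "H"): (0, 3),
        ("H", "S"): (3, 0), ("H", "H"): (3, 3),
    },
    "Battle of the Sexes": {  # moves: Opera, Football
        ("O", "O"): (2, 1), ("O", "F"): (0, 0),
        ("F", "O"): (0, 0), ("F", "F"): (1, 2),
    },
}

def pure_nash_equilibria(payoffs):
    """Return the move pairs where neither player gains by deviating unilaterally."""
    moves = sorted({m for pair in payoffs for m in pair})
    equilibria = []
    for r in moves:
        for c in moves:
            row_ok = all(payoffs[(r, c)][0] >= payoffs[(alt, c)][0] for alt in moves)
            col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, alt)][1] for alt in moves)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

for name, payoffs in games.items():
    print(name, "->", pure_nash_equilibria(payoffs))
```

With these numbers PD has only D-D as an equilibrium (so pre-play promises to cooperate aren’t self-enforcing), SH has both S-S and H-H with S-S better for both (so an agreement to play S is self-enforcing), and BotS has two equilibria that the players rank differently.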
Maybe I need to be more heterogeneous in my hiring!
I hadn’t heard of pair-writing but it sounds like it could work well in my context.
My intended point was that if one person has an ugh-field around something then it is often a generally unenjoyable task. Although other people don’t have ugh-fields around the task, it still seems unfair (and would likely lead to bad team dynamics) to reassign it to someone else who merely dislikes the task.
My experience would be that people generally have Ugh fields around tasks which no-one on the team likes (e.g. report writing). I can’t reassign such tasks without being unfair to the people who are dealing well with such jobs.
I would agree that mentioning to a manager that you’re finding something aversive is basically fine as long as you’re more looking for support than reassignment (although this might be different in different fields) and that a manager should encourage that.
As an example, one employee found that people constantly interrupted him and this made getting into the flow of report writing super hard, so we blocked off a day a week to allow him to catch up without interruptions.
I guess to some extent it’s knowing what’s possible in your context and knowing how flexible your manager is able/willing to be.
11.9% vs 8.7% is early plasma administration (0-3 days from diagnosis) vs late (4+ days).
13.7% is using low antibody count plasma, 8.9% is using high antibody count plasma. I guess this is the 35% reduction.
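(If that’s the comparison, the arithmetic would be (13.7 − 8.9) / 13.7 ≈ 0.35, i.e. roughly a 35% relative reduction.)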
All Debates are Bravery Debates?
I do think that it would be very bad if this happened. However I don’t think this is likely. Quoting my other comment:
I think it’s important to note here that we are not really that homogeneous in our opinions and weightings of different sources of value. A lot of the worries about Blind voting seem to assume that we’re all going to vote the same way about the same posts, which I think is highly unrealistic. There also seems to be the assumption that everything fractionally above 0 value will get an upvote, which again seems unrealistic.
This seems even more true for downvotes—I think people realise that downvotes feel extra bad and only use them sparingly. For instance, I only really downvote when I think something has been a definite breaking of a conversational norm or if someone is doubling down on an argument which has been convincingly refuted.
I think a spread of opinions on what constitutes a downvote (and a general feeling that comments get fewer votes) would make the −80 only happen to super egregiously bad comments.
When a post is at 50, I can think that is a bit too high just from my general sense of what I want to see more of on the site. And it’d be throwing away information about my own beliefs to not give the fine-gradation of “I want to see these posts on the site about as often as I would if they got 50 karma, not the amount that I would if they got 200 karma.”
This is true when the equilibrium position of the karma system is set to Total Karma Voting.
I think that Blind voting would move the karma system to a new equilibrium. I’m not convinced we should do so, as I think it would be a fairly unstable equilibrium, but I think it would work if everyone did it and would allow for fine-grained expressions of your belief.
The equilibrium I envisage would be that the current amount of something that LW has is taken into account when people blind vote their opinion.
As an example, I think the reason that joke comments can get fairly high karma is that they’re rare. If more people start writing joke comments as a result then that’s fine for as long as people are upvoting.
At some point the people who value the jokes least stop upvoting them or start downvoting them. This continues until the reward experienced by the jokers roughly matches the effort taken or some other balancing factor.
In the case of low positive value posts, some people have a higher threshold for what they will give an upvote for and the more low positive value posts there are the fewer people will upvote them.
(I think it’s important to note here that we are not really that homogeneous in our opinions and weightings of different sources of value. A lot of the worries about Blind voting seem to assume that we’re all going to vote the same way about the same posts, which I think is highly unrealistic. There also seems to be the assumption that everything fractionally above 0 value will get an upvote, which again seems unrealistic. Frankly, I think that anyone who can write a post good enough to persuade 100 different people with different standards to click the upvote button deserves to get 150 karma!)
The key then is that in order to get an oversized reward for the amount of effort put in, you have to do better than average at providing value.
In Blind Voting, accounting-for-how-much-of-a-certain-thing-there-currently-is-on-LW is doing the same thing as considering-what-message-the-total-karma-sends does with Total Karma Voting. The former seems to have a lag in the message getting out but I think when you’re in a rough equilibrium the lag is relatively short.
So this brings me to what I think the main cost of Total Karma Voting is. If an author looks at a post which has 25 karma from 10 votes, what does it mean? Roughly speaking, it means that it was considered about as valuable as another 25 karma post. The 10 votes tell the author how efficient the karma market was for the post and possibly give limited information on how varied the opinions were.
With Blind voting the author sees that and knows that 10 people had an opinion that this post was wanted more or less, and that their average strength of opinion was 2.5 karma points in favour. This probably consists of something like 3 people who want a lot more like it and 7 people who want a little more like it (or possibly some who wish there was less like it or were just yay/booing).
I agree that karma is a kludge and the true meaning isn’t necessarily clear but with Blind voting it seems importantly less of a kludge and some extra information can be extracted.
I want to note that I see “vote towards the ideal karma” as completely compatible with “vote your belief.”
Agreed. I was looking for a shorthand way of referring to the different voting policies but am yet to find one which is satisfactory—you’ve (rightly) shot down a couple of my ideas! Total Karma voting seems fine for one policy, maybe direct opinion voting for the other? If you shoot that one down too you can come up with your own!
The paper does attempt to adjust for this with a complexity metric, although I suspect this doesn’t work perfectly as it seems to be a linear adjustment based on the number of nodes used by the engine to calculate the optimal move.
I have a concern that the paper is comparing tournament play (offline) to match play with 4 games per match (online). In match play, especially with few games, a player who is behind needs to force the game and the player in the lead can play more conservatively. Tournaments have their own incentives but overall I would expect short match play to cause bigger errors from engine optimal play as losing players try to force a win in naturally drawing situations.
The calculated effect size is >200 ELO points which suggests to me that something is amiss.
Ok, I think I actually agree with your crux.
The points I was trying to make were (kinda scattered across the comments here!):
1. It is advantageous if people have a shared understanding of the system
2. Voting your own belief actually should work pretty well
3. There is a written norm in favour of voting your own belief
I think we disagree on all 3 to some extent, at least in how important they are. I think if we lose the disagreement on number 3 then disagreements on 1&2 are less important.
I’m ok with a norm of voting based somewhat on target karma (making it an overly strong effect would, I think, be detrimental), especially as this is now common knowledge and seems to be most people’s preference.
This whole thing has resolved some of my confusion as to why karma scores end up the way they do.
Now I feel less stupid for not getting it—at least I included all of the different parts of the recipe! Very impressive content density in that comment.
I have a close-to-deontological belief in the need to obey the rules of a community that’s trying to create things together (even when the rules seem wrong) and I think I tend to interpret things in that frame (for or against) even if that isn’t the intention. In the immortal words of Scott Alexander:
No! I am Exception Nazi! NO EXCEPTION FOR YOU!
I think the asshole filter is a good point and, to be honest, it’s possibly enough to get me to change my mind about this subject. There should be some mitigation in the karma weighting system, but even long-term members might be assholes.
Does it prove too much? Should the current karma, then, actually be the main consideration in deciding how to vote? Few people on the site seem willing to bite that bullet. Should I almost always use my strong votes, since if I don’t, an asshole might and thereby have an oversized effect?
Count me confused.
I won’t go through the other points one by one. The main thing, I think, is that what you’re describing is in conflict with the explicit phrasing of the reasons for voting. Compare:
What should my votes mean?
We encourage people to vote such that upvote means “I want to see more of this” and downvote means “I want to see less of this.”
What should karma indicate?
The karma on a post is intended to indicate whether, and by how much, members of the site would like to see more of the posts/comments in question. We encourage people to vote accordingly.
The former is the FAQ but I think the latter is what you’re describing. If this is the case then I think this ends up being an asshole filter in itself and the phrasing in the FAQ should be corrected.
(I realise as Ray qua user this has nothing to do with you but if you can pass this along to Ray qua admin that would be great!)
I like that framing in the first paragraph.
In the second paragraph I can’t work out if the question is intended rhetorically, ironically or genuinely!
IMO you are advocating for basically switching to a new voting system, not “properly” implementing the current one.
Compare to the LW FAQ:
Downvote a different post of the same author because I didn’t like that one? That doesn’t sound like a good idea.
No, I mean why wouldn’t you downvote a hypothetical post that you are agnostic about?
Imagine there are two posts, both have 50 karma.
You read one and feel confident that it is net positive but that 50 is too high.
You read the other and it is not net positive for you—you just have a meh reaction to it.
It seems very odd to me that one would downvote the former but not the latter. The net effect is to encourage people to read/write a post that is more likely to provide a meh reaction than be net positive.
The fundamental problem is that we’re trying to map a multidimensional thing into a single dimension. Whenever you do this you end up throwing out some information and you have to do the best you can.
As described by jimmy, with the “I want to see more/less of this” rule you lose some information on the magnitude of like/dislike. This is somewhat mitigated by having weak and strong votes, plus the dither factor jimmy describes (which I think for me is quite significant), so overall I’m not hugely worried about this.
(You can also get some of this information back if you’re really interested by comparing total number of votes to score although this is less obvious)
I’m not sure how a “how much total karma should this post have” rule even works in practice but a couple of options:
How much karma a post has needs to link to post value / correct amount of reward to the author.
If I judge this according to how much value I personally got out of it then the great-great grandparent comment applies and 50% awesome, 50% meh posts get 0 karma—a worse result than with the “I want to see more/less of this” rule, with all of the information from the 50% of people who found it awesome disappearing.
If instead I am trying to judge how much value I think the average LWer would get out of it then I think this gets really hard to assess. As an example, the recent 10 fun questions results showed that people weren’t very good at guessing whether others believed the Civilisational Inadequacy thesis more or less than they themselves did. Here you lose some information on people’s actual opinions in favour of information on what other people think their opinion might be, adding significant noise to the result.
Whichever option you choose, you probably end up throwing out information on how many people got value from the post. You can try to get around this by having each person estimate how many others would find it useful, but I think this just adds more noise to the result.
You can try to make the rule some combination of rules (as it seems most people do) but then to me it seems like interpreting karma scores becomes really difficult. We also run into the problem of how much weighting to give to each sub-rule and if people give different weightings then you get a discrepancy in how effective each person’s opinion is.
I’d be interested if someone can explain another way a “how much total karma should this post have” rule could work in practice that doesn’t run into such problems.
Hmm, interesting—I’m now slightly confused by:
I recently strong-downvoted a post that I would have weak-upvoted if it had been at a lower karma
Was that post good or bad? It sounded to me like you thought the post had value, just not as much as was currently showing. If you downvoted a post you thought had positive value (you were confident that its current karma value was too high?), why not downvote one that you don’t see any value in?
If being agnostic is a cause for not voting at all, a 50% great, 50% agnostic post would get a higher score than a 50% great, 50% slightly good post, as the people who found it slightly good would downvote and the agnostics wouldn’t.
I think my main concern with the “vote to try to give posts/comments the total karma they should have” rule is that I can’t see a way to operationalise it which doesn’t suffer from worse problems than the simple “I want to see more/less of this” rule.
Further, I’m not sure having a voting condition of “vote to try to bring the karma to the value you think it should be” helps in this situation. If 50% of people didn’t get any value from a post/comment then they would be trying to vote the karma down to 0. So a “50% earth shattering, 50% meh” post would end up with ~0 karma.
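To make that last point concrete, here’s a toy simulation (all of the behavioural assumptions, vote sizes and the 50-point target are mine, purely for illustration) of how the two rules treat a post that is 50% “earth shattering”, 50% “meh”:

```python
import random

def more_less_rule(reactions):
    """Upvote (+1) if the post was great for you, abstain if it was meh."""
    return sum(1 for r in reactions if r == "great")

def target_karma_rule(reactions, step=1):
    """Each voter nudges the running total towards their personal target:
    a high target if the post was great for them, ~0 if it was meh."""
    karma = 0
    for r in reactions:
        target = 50 if r == "great" else 0
        if karma < target:
            karma += step
        elif karma > target:
            karma -= step
    return karma

random.seed(0)
reactions = ["great" if random.random() < 0.5 else "meh" for _ in range(100)]

print("more/less rule:   ", more_less_rule(reactions))
print("target-karma rule:", target_karma_rule(reactions))
```

Under the more/less rule the post collects roughly one upvote per enthusiastic reader, whereas under the target-karma rule the two camps pull the total back towards zero.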