I’m actually finding this hypothesis more interesting than the one in your OP (partly because it looks more testable, funnily enough). Bash out a script to watch LW and vote on things as they appear, leave it to generate data as long as one likes, then hey presto. Tiny bit tempted to do it myself, approval or not.
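Very roughly, the watcher would just be a poll-and-vote loop. A bare sketch of the shape I mean is below — the feed URL, fetch_new_items and cast_vote are placeholders rather than any real LW endpoint, and the up/down assignment is randomized so the "prime" isn't confounded with the content being voted on:

```python
# Bare sketch of the poll-and-vote loop. FEED_URL, fetch_new_items and
# cast_vote are placeholders, not a real LessWrong API.
import random
import time

FEED_URL = "https://example.org/recent-items.json"  # placeholder

def fetch_new_items(seen_ids):
    """Return recent items not yet in seen_ids. Stub: a real version would hit FEED_URL."""
    return []

def cast_vote(item_id, direction):
    """Stub: a real version would submit an up/down vote for item_id."""
    print(f"voting {direction} on {item_id}")

def run(poll_seconds=60, max_polls=None):
    seen, log, polls = set(), [], 0
    while max_polls is None or polls < max_polls:
        for item in fetch_new_items(seen):
            seen.add(item["id"])
            direction = random.choice(["up", "down"])  # random prime, not tied to content
            cast_vote(item["id"], direction)
            log.append((item["id"], direction, time.time()))
        time.sleep(poll_seconds)
        polls += 1
    return log  # (item, assigned prime, timestamp) for later comparison of final scores
```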
The test would be conducted over a randomly chosen week (edit: not sure how many data points you can get out of a week's worth of comments though; it may require a longer or shorter period).
The sample size you need to detect an effect depends on that effect’s size. So far, so obvious, so I did a quick & dirty power analysis to get some numbers, although for posts in the discussion section rather than comments. (Posts on main are too infrequent, and I’d expect a smaller effect for comments, so comments would need a bigger sample.) If anyone cares I can throw up my code.
If my numbers are right and you took a sample of 100 upvoted posts and 100 downvoted discussion posts, the bootstrap confidence interval for the effect size would be 3.7-6.8 points wide. Even with a sample of 400 upvoted posts and 400 downvoted (and that’s 3-4 months’ worth of discussion posts), it’d be 2.2-3.0 points wide. So unless the priming effect’s strong (at least 2-4 points) a week of data wouldn’t be conclusive, at least not for posts. Comments might be more doable, though.
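For concreteness, here's a minimal sketch of the kind of bootstrap width check meant above. It is not the original script, and the score distribution is invented stand-in data, so the widths it prints won't match the figures quoted — swap in real scraped discussion-post scores to get the real numbers:

```python
# Minimal sketch of the bootstrap CI-width check described above.
# draw_scores is an invented stand-in for real discussion-post karma data.
import numpy as np

rng = np.random.default_rng(0)

def draw_scores(n, shift=0.0):
    """Fake final-score distribution for n posts, optionally shifted by a priming effect."""
    return rng.negative_binomial(2, 0.3, size=n) + shift

def bootstrap_ci_width(n_per_group, effect=3.0, n_boot=5000, alpha=0.05):
    """Width of the bootstrap CI for the difference in mean final scores."""
    up = draw_scores(n_per_group, shift=effect)   # posts that got an early upvote
    down = draw_scores(n_per_group, shift=0.0)    # posts that got an early downvote
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # resample each group with replacement and record the difference in means
        diffs[i] = (rng.choice(up, n_per_group).mean()
                    - rng.choice(down, n_per_group).mean())
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return hi - lo

for n in (100, 400):
    print(n, "per group -> CI width", round(bootstrap_ci_width(n), 2))
```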
Yeah, that'll take a while. We'll see about testing. The proposed effect could be strong if each subsequent comment's vote is influenced by the previous one, so that the initial disturbance doesn't 'dissolve' among a larger number of votes. But I kind of doubt it. I don't care a whole ton about votes; I generally take them as a measure of how clearly a point is made, but any priming would definitely make them less useful as a gauge of clarity. Also, there's apparently voting via the recent comments thread; to be honest, I'd nearly forgotten you can read comments expanded there, as it doesn't seem very interesting, since most comments are brief and meaningless outside their context.
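To make that "each vote nudges the next voter" worry concrete, here's a toy simulation; the base upvote probability, nudge size, and voter count are all made-up numbers. It just shows how a single seeded vote can persist or get amplified rather than dissolving when there's feedback:

```python
# Toy model of the cascade worry: each successive voter's chance of upvoting
# is nudged by the sign of the running score. All parameters are invented.
import random

def final_score(seed_vote, n_voters=30, base_p=0.6, nudge=0.1):
    """Simulate one comment: seed_vote is the +1/-1 'prime', then n_voters vote in turn."""
    score = seed_vote
    for _ in range(n_voters):
        sign = 1 if score > 0 else -1 if score < 0 else 0
        p_up = base_p + nudge * sign
        score += 1 if random.random() < p_up else -1
    return score

def mean_score(seed, trials=10000):
    return sum(final_score(seed) for _ in range(trials)) / trials

random.seed(0)
print("mean final score, seeded +1:", mean_score(+1))
print("mean final score, seeded -1:", mean_score(-1))
# With nudge=0 the gap between the two stays at the initial 2 points;
# with nudge>0 the feedback widens it instead of letting it wash out.
```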
Bash out a script to watch LW and vote on things as they appear, leave it to generate data as long as one likes, then hey presto. Tiny bit tempted to do it myself, approval or not.
I’m becoming increasingly tempted to submit automation detection scripts to the lesswrong codebase.