As suggested in the OP, they have to create the tests, not only evaluate their results. Even if average LWers want to find out whether LW memes are actually helpful, they are likely to be biased in choosing the criteria of rationality. For example, a test made by an LWer would be more likely to include a Newcombesque question where one-boxing is classified as the rational answer, and since one-boxers are certainly more prevalent among LWers than in nearly any other group, the results would show that LW memes improve rationality. But the OP is not interested in testing whether LW memes improve LW-style extended rationality (it would be quite weird if they didn’t) but whether they improve practical, real-life-relevant rationality. We are not impartial judges when it comes to determining the boundary between these two.
Or more generally, you can never be too careful about possible biases. Not seeing a reason for a self-serving bias is pretty weak evidence for its non-existence.