Ahh I see what you mean now, thank you for the clarification.
I agree that, in general, people trying to exploit and Goodhart LW karma would be bad, though I hope the experiment will not contribute to this. Here, post karma is only being used as a measure, not as a target: the mentors and mentees gain nothing beyond what anyone would normally gain from their research project resulting in a highly-upvoted LW post. Predicted future post karma is just being used to optimise over research ideas, and the space of ideas itself is very small (in this experiment), so I doubt we'll get any serious Goodharting via selection of ideas that are perhaps not very good research but are likely to produce particularly memetic LW posts (and even then, this is part of the motivation for having several metrics, so that none gets too specifically optimised for).
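To make concrete what I mean by karma being one signal among several rather than the target, here is a minimal sketch of the selection step, assuming normalised averaging across metrics; the metric names, weights, and numbers are entirely hypothetical and not our actual setup:

```python
# Hypothetical sketch: choosing among candidate research ideas using
# several market-predicted metrics, so no single one is the target.

# Market-implied predictions for each candidate idea (made-up numbers).
ideas = {
    "idea_A": {"post_karma": 120, "citations": 3, "mentor_rating": 0.7},
    "idea_B": {"post_karma": 60,  "citations": 8, "mentor_rating": 0.9},
    "idea_C": {"post_karma": 200, "citations": 1, "mentor_rating": 0.5},
}

def normalise(values):
    # Rescale each metric to [0, 1] so no metric dominates by scale alone.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.5 for v in values]

metrics = ["post_karma", "citations", "mentor_rating"]
names = list(ideas)

# Average the normalised metrics: karma is just one signal among several,
# which blunts Goodharting on any single metric.
scores = {name: 0.0 for name in names}
for m in metrics:
    for name, norm in zip(names, normalise([ideas[n][m] for n in names])):
        scores[name] += norm / len(metrics)

best = max(scores, key=scores.get)
print(f"Selected: {best} (scores: {scores})")
```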
There is perhaps an argument that those who have predicted a post will get high karma might want to manipulate its karma upwards to make their prediction come true, but those who predicted it would be lower have the opposite incentive. Regardless, I think that kind of manipulation is strictly prohibited by both LW and Manifold guidelines, and anyone caught doing it in a serious way would likely be severely reprimanded. In the worst case, if any of the metrics are seriously and obviously manipulated in a way that cannot be rectified, the relevant markets will be resolved N/A, though I think this is extremely unlikely to happen.
All that said, I think it is important to think about what more suitable metrics would be if research futarchy were to become more common. I can certainly imagine a world where widespread use of LW post karma as a proxy for research success could have negative impacts on the LW ecosystem, though I hope by then there will have been more development and testing of robust measures beyond our starting point (which, for the record, I think is already somewhat robust).