The fact that we use a “proper scoring rule” doesn’t mean that the entire system, including the participants’ true utility functions, is really “proper”. There is a fair bit of impropriety. For instance, people may also care about their online reputation, and that won’t be captured by the proper scoring rule. The proper scoring rule only helps make sure that one specific aspect of the system is “proper” according to a simplified model. This is definitely suboptimal, but I think it’s still good enough for a lot of things. I’m not sure what type of system would be “perfectly proper”.
Prediction markets have their own disadvantages, as participants don’t behave as perfectly rational agents there either. So I won’t claim that the system is “perfectly aligned”, but I will suggest that it seems “decently aligned” compared to other alternatives, with the ability to improve as we (or others with other systems) add further complexity.
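To make “proper according to a simplified model” a bit more concrete, here is a minimal sketch. It uses the logarithmic score on a binary question purely as an illustration (I’m not claiming this is the exact rule used in the experiment), and the function names are made up for the example. The simplified model is a participant who only cares about expected score. Under that assumption, reporting your true belief is the best you can do.

```python
import math

def log_score(p_reported: float, outcome: bool) -> float:
    """Logarithmic score for a binary forecast (higher is better)."""
    return math.log(p_reported if outcome else 1.0 - p_reported)

def expected_log_score(p_true: float, p_reported: float) -> float:
    """Expected score when the event really happens with probability p_true."""
    return (p_true * log_score(p_reported, True)
            + (1.0 - p_true) * log_score(p_reported, False))

def best_report(p_true: float) -> float:
    """Search reports 0.01, 0.02, ..., 0.99 for the highest expected score."""
    candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda q: expected_log_score(p_true, q))

# Assumption: the participant's utility is exactly the expected score.
# Reputation or other outside incentives are not modeled here.
print(best_report(0.7))
```

Anything outside that model, such as reputation, simply isn’t in `expected_log_score`, which is the sense in which only one aspect of the system is “proper”.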
If you don’t confiscate money for bad predictions, then you’re basically just giving money to people for signing up, which makes having an open platform tricky.
What was done in this case was that participants were paid a fixed fee for participating, plus a larger “bonus” paid in proportion to how they did on said rule. This works in experimental settings where we can filter the participants. It would definitely be more work to make the system totally openly available, especially as the prizes increase in value, largely for the reason you describe. We’re working to figure out solutions that could hold up (somewhat) in these circumstances, but it is tricky, for the reasons you suggest and for others.
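A rough sketch of that kind of payment scheme, under assumptions of my own: the function name, the amounts, and the choice to clip negative scores to zero before splitting the bonus are all hypothetical, not a description of the exact formula used.

```python
def payouts(scores: dict[str, float], fixed_fee: float, bonus_pool: float) -> dict[str, float]:
    """Pay each participant a fixed fee plus a share of a bonus pool.

    Assumption: the bonus is split in proportion to each score, and
    negative scores are clipped to zero so nobody owes money.
    """
    clipped = {name: max(score, 0.0) for name, score in scores.items()}
    total = sum(clipped.values())
    result = {}
    for name, score in clipped.items():
        # If nobody has a positive score, the bonus pool goes unpaid.
        bonus = bonus_pool * score / total if total > 0 else 0.0
        result[name] = fixed_fee + bonus
    return result

# Hypothetical scores and amounts, for illustration only.
example = payouts({"a": 3.0, "b": 1.0, "c": -2.0}, fixed_fee=10.0, bonus_pool=40.0)
print(example)
```

Note that everyone gets at least the fixed fee regardless of score, which is exactly why this only works when participants can be filtered.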
I’d also point out that having a nice scoring system is one challenge out of many challenges. Having nice probability distribution viewers and editors is difficult. Writing good questions and organizing them, and having software that does this well, is also difficult. This is something that @jacobjacob has been spending a decent amount of time thinking about after this experiment, but I’ve personally been focusing on other aspects.
At least in this experiment, the scoring system didn’t seem like a big bottleneck. The participants who won the most money were generally those who seemed to have given thoughtful and useful probability distributions. Things are much easier when you have an audience that is generally acting in good faith and that can be excluded from future rounds if it seems appropriate.
Cool, thanks for those clarifications :) In case it didn’t come through from the previous comments, I wanted to make clear that this seems like exciting work and I’m looking forward to hearing how follow-ups go.
Thanks! I really do appreciate the thoughts & feedback in general, and am quite happy to answer questions. There’s a whole lot we haven’t written up yet, and it’s much easier for me to reply to things than lay everything out.