Problems like this one have led some skeptics to claim that prediction markets are not necessarily superior to other less sophisticated methods, such as opinion polls, that are harder to manipulate in practice. However, little attention has been paid to evaluating the relative performance of different methods, so nobody really knows for sure. To try to settle the matter, my colleagues at Yahoo! Research and I conducted a systematic comparison of several different prediction methods, where the predictions in question were the outcomes of NFL football games. To begin with, for each of the fourteen to sixteen games taking place each weekend over the course of the 2008 season, we conducted a poll in which we asked respondents to state the probability that the home team would win as well as their confidence in their prediction. We also collected similar data from the website Probability Sports, an online contest where participants can win cash prizes by predicting the outcomes of sporting events. Next, we compared the performance of these two polls with the Vegas sports betting market — one of the oldest and most popular betting markets in the world — as well as with another prediction market, TradeSports. And finally, we compared the predictions of both the markets and the polls against two simple statistical models. The first model relied only on the historical probability that home teams win — which they do 58 percent of the time — while the second model also factored in the recent win-loss records of the two teams in question. In this way, we set up a six-way comparison between different prediction methods — two statistical models, two markets, and two polls.
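To make the two baseline models concrete, here is a minimal sketch in Python of what such predictors might look like. The 58 percent home-win rate comes from the paragraph above; the text does not describe exactly how the second model folds in recent win-loss records, so the blending weight below is purely illustrative.

```python
# Illustrative sketch of the two baseline statistical models described above.
# The 58 percent home-field figure is from the text; how the second model
# weights recent win-loss records is an assumption made for this example.

HOME_WIN_RATE = 0.58  # historical probability that the home team wins


def model_home_advantage():
    """Model 1: always predict a home-team win with 58 percent probability."""
    return HOME_WIN_RATE


def model_recent_record(home_record, away_record, weight=0.25):
    """Model 2: nudge the home-advantage baseline toward whichever team has
    the better recent record (records are win fractions, e.g. 0.75 for a
    team that won three of its last four games)."""
    record_gap = home_record - away_record        # between -1 and 1
    prob = HOME_WIN_RATE + weight * record_gap    # illustrative adjustment
    return min(max(prob, 0.01), 0.99)             # clamp to a valid probability


# Example: home team won 3 of its last 4 games, visitor won 1 of 4
print(model_home_advantage())             # 0.58
print(model_recent_record(0.75, 0.25))    # 0.705 with the illustrative weight
```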
Given how different these methods were, what we found was surprising: All of them performed about the same. To be fair, the two prediction markets performed a little better than the other methods, which is consistent with the theoretical argument above. But the very best-performing method — the Las Vegas market — was only about 3 percentage points more accurate than the worst-performing method, which was the model that always predicted the home team would win with 58 percent probability. All the other methods were somewhere in between. In fact, the model that also included recent win-loss records was so close to the Vegas market that if you used both methods to predict the actual point differences between the teams, the average error in their predictions would differ by less than a tenth of a point. Now, if you’re betting on the outcomes of hundreds or thousands of games, these tiny differences may still be the difference between making and losing money. At the same time, however, it’s surprising that the aggregated wisdom of thousands of market participants, who collectively devote countless hours to analyzing upcoming games for any shred of useful information, is only incrementally better than a simple statistical model that relies only on historical averages.
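The comparison behind these numbers rests on two simple scores: the fraction of games in which a method’s favorite actually won, and the average error in its predicted point difference. Below is a hedged sketch, with invented data and function names, of how those scores might be computed; it is not the study’s actual evaluation code.

```python
# Hypothetical scoring code for comparing prediction methods.
# The input format (parallel lists of per-game predictions and outcomes)
# is an assumption made for this example, not the study's actual data.

def accuracy(win_probs, home_won):
    """Fraction of games in which the method's favorite actually won."""
    correct = sum((p > 0.5) == outcome for p, outcome in zip(win_probs, home_won))
    return correct / len(home_won)


def mean_abs_error(predicted_spreads, actual_spreads):
    """Average absolute error of predicted point differences (home minus away)."""
    errors = [abs(p - a) for p, a in zip(predicted_spreads, actual_spreads)]
    return sum(errors) / len(errors)


# Toy example with three games
vegas_probs = [0.65, 0.40, 0.58]
model_probs = [0.58, 0.58, 0.58]   # the "home team wins 58 percent" baseline
home_won = [True, False, True]

print(accuracy(vegas_probs, home_won))   # 1.0 on this toy sample
print(accuracy(model_probs, home_won))   # roughly 0.67 on this toy sample

# Hypothetical predicted and actual point differences for the same games
vegas_spreads = [3.5, -2.0, 6.0]
actual_spreads = [7.0, -3.0, 3.0]
print(mean_abs_error(vegas_spreads, actual_spreads))   # 2.5
```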
When we first told some prediction market researchers about this result, their reaction was that it must reflect some special feature of football. The NFL, they argued, has lots of rules like salary caps and draft picks that help to keep teams as equal as possible. And football, of course, is a game where the result can be decided by tiny random acts, like the wide receiver dragging in the quarterback’s desperate pass with his fingertips as he runs full tilt across the goal line to win the game in its closing seconds. Football games, in other words, have a lot of randomness built into them — arguably, in fact, that’s what makes them exciting. Perhaps it’s not so surprising after all, then, that all the information and analysis that is generated by the small army of football pundits who bombard fans with predictions every week is not superhelpful (although it might be surprising to the pundits). In order to be persuaded, our colleagues insisted, we would have to find the same result in some other domain for which the signal-to-noise ratio might be considerably higher than it is in the specific case of football.
OK, what about baseball? Baseball fans pride themselves on their near-fanatical attention to every measurable detail of the game, from batting averages to pitching rotations. Indeed, an entire field of research called sabermetrics has developed specifically for the purpose of analyzing baseball statistics, even spawning its own journal, the Baseball Research Journal. One might think, therefore, that prediction markets, with their far greater capacity to factor in different sorts of information, would outperform simplistic statistical models by a much wider margin for baseball than they do for football. But that turns out not to be true either. We compared the predictions of the Las Vegas sports betting markets over nearly twenty thousand Major League Baseball games played from 1999 to 2006 with a simple statistical model based again on home-team advantage and the recent win-loss records of the two teams. This time, the difference between the two was even smaller — in fact, the performances of the market and the model were indistinguishable. In spite of all the statistics and analysis, in other words, and in spite of the absence of meaningful salary caps in baseball and the resulting concentration of superstar players on teams like the New York Yankees and Boston Red Sox, the outcomes of baseball games are even closer to random events than those of football games.
Since then, we have either found or learned about the same kind of result for other kinds of events that prediction markets have been used to predict, from the opening weekend box office revenues for feature films to the outcomes of presidential elections. Unlike sports, these events occur without any of the rules or conditions that are designed to make sports competitive. There is also a lot of relevant information that prediction markets could conceivably exploit to boost their performance well beyond that of a simple model or a poll of relatively uninformed individuals. Yet when we compared the Hollywood Stock Exchange (HSX) — one of the most popular prediction markets, which has a reputation for accurate prediction — with a simple statistical model, the HSX did only slightly better. And in a separate study of the outcomes of five US presidential elections from 1988 to 2004, political scientists Robert Erikson and Christopher Wlezien found that a simple statistical correction of ordinary opinion polls outperformed even the vaunted Iowa Electronic Markets.