Behind every successful forward-looking decision stands a good prediction. When we move to a new job or buy a house, we implicitly predict that our choice will turn out better than the alternatives. Improving foresight, especially about political events, is also of obvious interest to governments. This is why IARPA, a research arm of the U.S. intelligence community, funded a large scientific study of crowd-based prediction. At the Good Judgment Project, we had the opportunity to run dozens of experiments over four years, focused on a single question: how do we best distill the wisdom of crowds for political predictions?
Prediction markets have set the gold standard for wisdom-of-crowds forecasting. A typical prediction market question may read: “Will Hillary Clinton win the 2016 presidential election?” You can buy or sell shares on the question, where each share will be worth $1 if Clinton wins, $0 if she loses. By design, the price corresponds to the probability of Clinton winning, e.g. $0.60 means that she has a 60% chance. If you believe her chances are higher or lower, you can buy or sell shares on that question and reap potential profits from your insights.
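The payoff mechanics described above can be sketched in a few lines. This is a minimal illustration of a $1-if-yes contract; the prices and trade sizes are hypothetical, not figures from any actual market.

```python
def trade_profit(price: float, shares: int, outcome: bool) -> float:
    """Profit from buying `shares` contracts at `price` dollars each.

    Each contract pays $1 if the event happens, $0 otherwise, so the
    market price of a contract doubles as an implied probability.
    """
    payout = (1.0 if outcome else 0.0) * shares
    return payout - price * shares

# Buying 100 shares at $0.60 (an implied 60% chance the event happens):
print(trade_profit(0.60, 100, outcome=True))   # profit if the event happens
print(trade_profit(0.60, 100, outcome=False))  # loss if it does not
```

If you believe the true probability is higher than the price, buying is profitable in expectation; if lower, selling is. That expected-profit motive is what pushes the price toward the crowd's collective probability estimate.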
If you are an election buff, you can hardly talk about Clinton and Trump’s chances without invoking prediction markets. This is no accident. Prediction markets have a strong track record of picking winners in many colorful contests, ranging from presidential candidates to inkjet printers. Social scientists are using markets to predict which famous psychology findings will replicate. In theory, the invisible hand will do its magic, efficiently aggregating all relevant information available to participants.
So if you have important questions about the future and several hundred folks willing to provide their insights, placing them in a prediction market seems like a sure bet. Rather than taking this for granted, however, we ran the first large experiment comparing prediction markets to competing methods. To our surprise, we identified a method that produced more accurate predictions than markets. For organizations like the U.S. intelligence community, with a budget of over $50 billion, improved predictions could make the difference between a correct decision and an Iraq-war style fiasco.
The winning method was inspired by concepts developed for weather forecasters, who are in the business of making daily probabilistic predictions. We called it prediction polls. Instead of betting, poll forecasters express their beliefs as a probability. For example, you may say that Clinton has a 70% chance of winning the election. After learning if Clinton won or lost, you'd receive an accuracy score reflecting how close you came to the correct answer. If Clinton wins, for example, you will score better than someone who thought she had a 50/50 shot. After you and hundreds of other forecasters express your beliefs, statistical algorithms get to work, aggregating the predictions to arrive at a single wisdom-of-crowds estimate for each question.
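A sketch of the two ingredients just described: an accuracy score and an aggregation step. The Brier score below (squared error between the forecast probability and the 0/1 outcome, lower is better) is a standard scoring rule for probabilistic forecasts; the unweighted mean is the simplest possible stand-in for the statistical algorithms the text alludes to.

```python
def brier_score(prob: float, outcome: bool) -> float:
    """Squared error between a forecast probability and the 0/1 outcome.

    Lower is better: 0.0 is a perfect call, 1.0 a maximally wrong one.
    """
    return (prob - (1.0 if outcome else 0.0)) ** 2

# If Clinton wins, the 70% forecaster scores better (lower) than the
# forecaster who saw a 50/50 shot:
print(brier_score(0.70, True))  # (0.7 - 1)^2
print(brier_score(0.50, True))  # (0.5 - 1)^2

def crowd_estimate(probs: list[float]) -> float:
    """Aggregate many individual forecasts into one crowd probability."""
    return sum(probs) / len(probs)

print(crowd_estimate([0.70, 0.55, 0.80]))
```

Scoring rules like this reward honest probabilities: hedging toward 50/50 protects you from big penalties, but consistently well-calibrated confidence earns the best long-run scores.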
Weather forecasters and other experts have used poll-based methods for decades, but no one had experimentally tested the efficacy of polls employing large crowds of non-experts. That is what we did. In a two-year tournament, we randomly assigned participants to either prediction markets or polls. This allowed us to trace the differences in accuracy to the methods, rather than the individual forecasters. We collected approximately half a million predictions on almost three hundred questions, covering a wide range of geopolitical topics, from the winner of the German parliamentary elections to the vanquishing of the Ebola epidemic in West Africa.
Prediction polls outperformed markets, reducing errors by 10-20 percent. Three factors behind the strong performance of polls stood out. I believe understanding these will help most of us think more clearly about prediction.
First, individual skill matters. We found that some forecasters are just better than others at predicting the future. Accuracy was a matter of skill, not just luck. Once we had a forecaster's track record on approximately twenty questions, we could use the information to amplify the voices of better-skilled individuals, improving the quality of the aggregate estimates. Markets have their own mechanism for learning about skill: better forecasters ideally accumulate earnings and influence at the expense of their not-so-accurate peers.
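One way to amplify skilled voices is to weight each forecast by the forecaster's track record. The inverse-error weighting below is purely illustrative; the study's actual aggregation algorithms are not specified here, and the names and numbers are made up.

```python
def skill_weighted_estimate(forecasts: dict[str, float],
                            past_brier: dict[str, float]) -> float:
    """Weight each forecaster's probability by the inverse of their
    historical Brier score (a small constant avoids division by zero).

    Forecasters with better (lower) past error pull the aggregate
    toward their own current forecast.
    """
    weights = {name: 1.0 / (err + 0.01) for name, err in past_brier.items()}
    total = sum(weights.values())
    return sum(weights[name] * p for name, p in forecasts.items()) / total

forecasts = {"ana": 0.80, "ben": 0.55}   # current predictions (hypothetical)
past_brier = {"ana": 0.10, "ben": 0.30}  # ana has the better track record
print(skill_weighted_estimate(forecasts, past_brier))
```

Compared with the unweighted mean of 0.675, the weighted estimate lands closer to ana's 0.80, which is exactly the "amplify the better voices" effect the track record enables.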
Second, belief updating is a marker of skill. Accurate poll forecasters updated their beliefs in a specific pattern: frequently and in small increments. They experienced the prediction task as a multitude of small surprises, rather than a string of large shocks. We used update frequency to predict which individuals would do best on a given question. By design, markets do not capture such behavioral measures, at least until this behavior has translated into market earnings.
Finally, we found that poll forecasters who considered themselves experts on a topic were not reliably more accurate than peers who gave modest ratings of their own expertise. It turns out that assessing one's knowledge and expertise relative to peers is an awfully difficult task for humans. Yet in markets, that is exactly what traders try to do: place bets when they possess unique advantages relative to other participants. In follow-up research, I find that prediction market traders often fail to judge when to place small versus large bets, potentially adding noise to the resulting price signals.
So, what does this experience tell us about making better predictions? Probably the best way to improve your skill is to keep track of your accuracy. You should also remain open to changing your mind when you dig out new evidence, which often comes in small doses. And don’t assume that your initial knowledge will automatically enable you to hit the bull’s eye.
The results may disappoint staunch believers in the superior efficiency of markets as information aggregators. But, as Thomas Huxley put it, such is the tragedy of science: the slaying of a beautiful hypothesis by an ugly fact. And in a time when political and economic risks seem to lurk around every corner, better predictions are worth questioning some long-held assumptions.
This was the first experiment comparing prediction polls and markets. But it is unlikely to be the last word. We do not know, for example, if powerful real-money incentives among thousands, rather than hundreds, of forecasters would boost market performance, and help markets defend their gold-standard status. To learn more about the experience of being a forecaster, try out prediction polls, as well as prediction markets with real or play money. These are great ways to try your luck—and cultivate your skill.