FLI Podcast: On Superforecasting with Robert de Neufville
Essential to our assessment of risk and ability to plan for the future is our understanding of the probability of certain events occurring. If we can estimate the likelihood of risks, then we can evaluate their relative importance and apply our risk mitigation resources effectively. Predicting the future is, obviously, far from easy — and yet a community of “superforecasters” are attempting to do just that. Not only are they trying, but these superforecasters are also reliably outperforming subject matter experts at making predictions in their own fields. Robert de Neufville joins us on this episode of the FLI Podcast to explain what superforecasting is, how it’s done, and the ways it can help us with crucial decision making.
Topics discussed in this episode include:
-What superforecasting is and what the community looks like
-How superforecasting is done and its potential use in decision making
-The challenges of making predictions
-Predictions about and lessons from COVID-19
You can find the page for this podcast here: futureoflife.org/2020/04/30/on-su…rt-de-neufville/
You can take a survey about the podcast here: www.surveymonkey.com/r/W8YLYD3
You can submit a nominee for the Future of Life Award here: futureoflife.org/future-of-life-a…ung-hero-search/
Lucas Perry: Welcome to the Future of Life Institute Podcast. I’m Lucas Perry. Today we have a conversation with Robert de Neufville about superforecasting. But, before I get more into the episode I have two items I’d like to discuss. The first is that the Future of Life Institute is looking for the 2020 recipient of the Future of Life Award. For those not familiar, the Future of Life Award is a $50,000 prize that we give out to an individual who, without having received much recognition at the time of their actions, has helped to make today dramatically better than it may have been otherwise. The first two recipients were Vasili Arkhipov and Stanislav Petrov, two heroes of the nuclear age. Both took actions at great personal risk to possibly prevent an all-out nuclear war. The third recipient was Dr. Matthew Meselson, who spearheaded the international ban on bioweapons. Right now, we’re not sure who to give the 2020 Future of Life Award to. That’s where you come in. If you know of an unsung hero who has helped to avoid global catastrophic disaster, or who has done incredible work to ensure a beneficial future of life, please head over to the Future of Life Award page and submit a candidate for consideration. The link for that page is on the page for this podcast or the description of wherever you might be listening. You can also just search for it directly. If your candidate is chosen, you will receive $3,000 as a token of our appreciation. We’re also incentivizing the search via MIT’s successful red balloon strategy, where the first to nominate the winner gets $3,000 as mentioned, but there are also tiered payouts to the person who invited the nomination winner, and so on. You can find details about that on the page.
The second item is that there is a new survey that I wrote about the Future of Life Institute and AI Alignment Podcasts. It’s been a year since our last survey and that one was super helpful for me in understanding what’s going well, what’s not, and how to improve. I have some new questions this time around and would love to hear from everyone about possible changes to the introductions, editing, content, and topics covered. So, if you have any feedback, good or bad, you can head over to the SurveyMonkey poll in the description of wherever you might find this podcast or on the page for this podcast. You can answer as many or as few of the questions as you’d like and it goes a long way for helping me to gain perspective about the podcast, which is often hard to do from my end because I’m so close to it.
And if you find the content and subject matter of this podcast to be important and beneficial, consider sharing it with friends, subscribing on Apple Podcasts, Spotify, or whatever your preferred listening platform, and leaving us a review. It’s really helpful for getting information on technological risk and the future of life to more people.
Regarding today’s episode, I just want to provide a little bit of context. The foundation of risk analysis has to do with probabilities. We use these probabilities and the predicted value lost if certain risks occur to calculate or estimate expected value. This in turn helps us to direct risk mitigation efforts to where they’re truly needed. So, it’s important that we’re able to make accurate predictions about the likelihood of future events and risks so that we can take the appropriate action to mitigate them. This is where superforecasting comes in.
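As a rough illustration of the expected-value calculation described above, the sketch below multiplies each risk's probability by the value lost if it occurs and ranks the results. The risk names, probabilities, and loss figures are invented for illustration and do not come from the episode.

```python
# Hypothetical risks: (name, annual probability, estimated loss in arbitrary units).
# All figures are made up for illustration only.
risks = [
    ("risk_b", 0.10, 50.0),
    ("risk_c", 0.50, 4.0),
    ("risk_a", 0.02, 1000.0),
]


def expected_loss(probability, loss):
    """Expected loss: probability of the event times the value lost if it occurs."""
    return probability * loss


# Rank risks from highest to lowest expected loss to suggest a priority order.
ranked = sorted(risks, key=lambda r: expected_loss(r[1], r[2]), reverse=True)

for name, p, loss in ranked:
    print(f"{name}: p={p:.2f}, loss={loss:.0f}, expected loss={expected_loss(p, loss):.1f}")
```

Note that the low-probability, high-loss risk comes out on top here, which is why the accuracy of small probability estimates matters so much for prioritization.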
Robert de Neufville is a researcher, forecaster, and futurist with degrees in government and political science from Harvard and Berkeley. He works particularly on the risk of catastrophes that might threaten human civilization. He is also a “superforecaster”, since he was among the top 2% of participants in IARPA’s Good Judgment forecasting tournament. He has taught international relations, comparative politics, and political theory at Berkeley and San Francisco State. He has written about politics for The Economist, The New Republic, The Washington Monthly, and Big Think.
And with that, here’s my conversation with Robert de Neufville on superforecasting.
All right. Robert, thanks so much for coming on the podcast.
Robert de Neufville: It’s great to be here.
Lucas Perry: Let’s just start off real simply here. What is superforecasting? Say if you meet someone, a friend or family member of yours asks you what you do for work. How do you explain what superforecasting is?
Robert de Neufville: I just say that I do some forecasting. People understand what forecasting is. They may not understand specifically the way I do it. I don’t love using “superforecasting” as a noun. There’s the book Superforecasting. It’s a good book and it’s kind of great branding for Good Judgment, the company, but it’s just forecasting, right, and hopefully I’m good at it and there are other people that are good at it. We have used different techniques, but it’s a little bit like an NBA player saying that they play super basketball. It’s still basketball.
But what I tell people for background is that the US intelligence community had this forecasting competition basically just to see if anyone could meaningfully forecast the future because it turns out one of the things that we’ve seen in the past is that people who supposedly have expertise in subjects don’t tend to be very good at estimating probabilities that things will happen.
So the question was, can anyone do that? And it turns out that for the most part people can’t, but a small subset of people in the tournament were consistently more accurate than the rest of the people. And just using open source information, we were able to decisively beat subject matter experts, although actually that’s not a high bar. They don’t do very well. And we were also able to beat intelligence community analysts. We didn’t originally know we were going up against them, but we’re talking about forecasters in the intelligence community who had access to classified information we didn’t have access to. We were basically just using Google.
And one of the stats that we got later was that as a group we were more accurate 300 days ahead of a question being resolved than others were just a hundred days ahead. As far as what makes the technique of superforecasting sort of fundamentally distinct, I think one of the things is that we have a system for scoring our accuracy. A lot of times when people think about forecasting, people just make pronouncements. This thing will happen or it won’t happen. And then there’s no real great way of checking whether they were right. And they can also often after the fact explain away their forecast. But we make probabilistic predictions and then we use a mathematical formula that weather forecasters have used to score them. And then we can see whether we’re doing well or not well. We can evaluate and say, “Hey look, we actually outperformed these other people in this way.” And we can also then try to improve our forecasting when we don’t do well, ask ourselves why and try to improve it. So that’s basically how I explain it.
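The scoring formula borrowed from weather forecasting is presumably the Brier score; the episode does not name it at this point, so treat that identification as an assumption. A minimal sketch for yes/no questions, using the common simplified form (mean squared difference between the forecast probability and the 0/1 outcome), with made-up forecasts:

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect score, and always answering 50% scores 0.25.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty lists")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: three questions, 1 = the event happened, 0 = it did not.
outcomes = [1, 0, 1]
confident = [0.9, 0.1, 0.8]  # a forecaster who leaned the right way each time
hedger = [0.5, 0.5, 0.5]     # someone who always says 50%

print(f"confident forecaster: {brier_score(confident, outcomes):.3f}")
print(f"always 50%:           {brier_score(hedger, outcomes):.3f}")
```

Because every forecast is a number that gets scored against what actually happened, a forecaster cannot explain a miss away after the fact, which is the point made above.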
Lucas Perry: All right, so can you give me a better understanding here about who “we” is? You’re saying that the key point and where this started was this military competition basically attempting to make predictions about the future or the outcome of certain events. What are the academic and intellectual foundations of superforecasting? What subject areas would one study or did superforecasters come from? How was this all germinated and seeded prior to this competition?
Robert de Neufville: It actually was the intelligence community, although I think military intelligence participated in this. But I mean I didn’t study to be a forecaster and I think most of us didn’t. I don’t know if there really has been a formal study that would lead you to be a forecaster. People just learn subject matter and then apply that in some way. There must be some training that people had gotten in the past, but I don’t know about it.
There was a famous study by Phil Tetlock. I think in the 90s it came out as a book called Expert Political Judgment, and he found essentially that experts were not good at this. But what he did find, he made a distinction between foxes and hedgehogs you might’ve heard. Hedgehogs are people that have one way of thinking about things, one system, one ideology, and they apply it to every question, just like the hedgehog has one trick and it’s its spines. Hedgehogs didn’t do well. If you were a Marxist or equally a dyed in the wool Milton Friedman capitalist and you applied that way of thinking to every problem, you tended not to do as well at forecasting.
But there’s this other group of people that he found did a little bit better and he called them foxes, and foxes are tricky. They have all sorts of different approaches. They don’t just come in with some dogmatic ideology. They look at things from a lot of different angles. So that was sort of the initial research that inspired him. And there’s other people that were talking about this, but it was ultimately Phil Tetlock and Barb Mellers’s group that outperformed everyone else. They had looked for people that were good at forecasting, put them together in teams, and aggregated their scores with algorithmic magic.
We had a variety of different backgrounds. If you saw any of the press initially, the big story that came out in the press was that we were just regular people. There was a lot of talk about so-and-so was a housewife and that’s true. We weren’t people that had a reputation for being great pundits or anything. That’s totally true. I think that was a little bit overblown though because it made it sound like so and so was a housewife and no one knew that she had this skill. Otherwise she was completely unremarkable. In fact, superforecasters as a group tended to be highly educated with advanced degrees. They tended to have backgrounds and they lived in a bunch of different countries.
The thing that correlates most with forecasting ability seems to be basically intelligence, performing well on measures of intelligence tests, and also I should say that a lot of very smart people aren’t good forecasters. Just being smart isn’t enough, but that’s one of the strongest predictors of forecasting ability and that’s not as good a story for journalists.
Lucas Perry: So it wasn’t crystals.
Robert de Neufville: If you do surveys of the way superforecasters think about the world, they tend not to do what you would call magical thinking. Some of us are religious. I’m not. But for the most part the divine isn’t an explanation in their forecast. They don’t use God to explain it. They don’t use things that you might consider a superstition. Maybe that seems obvious, but it’s a very rational group.
Lucas Perry: How’s superforecasting done and what kinds of models are generated and brought to bear?
Robert de Neufville: As a group, we tend to be very numeric. That’s one thing that correlates pretty well with forecasting ability. And when I say they come from a lot of backgrounds, I mean there are doctors, pharmacists, engineers. I’m a political scientist. There are actually a fair number of political scientists. Some people who are in finance or economics, but they all tend to be people who could make at least a simple spreadsheet model. We’re not all statisticians, but have at least an intuitive familiarity with statistical thinking and an intuitive concept of Bayesian updating.
As far as what the approach is, we make a lot of simple models, often not very complicated models I think because often when you make a complicated model, you end up overfitting the data and drawing falsely precise conclusions, at least when we’re talking about complex, real-world political science-y kind of situations. But I would say the best guide for predicting the future, and this probably sounds obvious, the best guide for what’s going to happen is what’s happened in similar situations in the past. One of the key things you do, if somebody asks you, “Will so and so win an election?” you would look back and say, “Well, what’s happened in similar elections in the past? What’s the base rate of the incumbent, for example, maybe from this party or that party winning an election, given this economy and so on?”
Now it is often very hard to beat simple algorithms that try to do the same thing, but that’s not a thing that you can just do by rote. It requires an element of judgment about what situations in the past count as similar to the situation you’re trying to ask a question about. In some ways that’s a big part of the trick is to figure out what’s relevant to the situation, trying to understand what past events are relevant, and that’s something that’s hard to teach I think because you could make a case for all sorts of things being relevant and there’s an intuitive feel that’s hard to explain to someone else.
Lucas Perry: The things that seem to be brought to bear here would be like these formal mathematical models and then the other thing would be what I think comes from Daniel Kahneman and is borrowed by the rationalist community, this idea of system one and system two thinking.
Robert de Neufville: Right.
Lucas Perry: Where system one is the intuitive, the emotional. We catch balls using system one. System one says the sun will come out tomorrow.
Robert de Neufville: Well hopefully the system two does too.
Lucas Perry: Yeah. System two does too. So I imagine some questions are just limited to sort of pen and paper system one, system two thinking, and some are questions that are more suitable for mathematical modeling.
Robert de Neufville: Yeah, I mean some questions are more suitable for mathematical modeling for sure. I would say though the main system we use is system two. And this is, as you say, we catch balls with some sort of intuitive reflex. It’s sort of maybe not in our prefrontal cortex. If I were trying to calculate the trajectory of a ball and tried to catch it, that wouldn’t work very well. But I think most of what we’re doing when we forecast is trying to calculate something else. Often the models are really simple. It might be as simple as saying, “This thing has happened seven times in the last 50 years, so let’s start from the idea there’s a 14% chance of that thing happening again.” It’s analytical. We don’t necessarily just go with the gut and say this feels like a one in three chance.
Now that said, I think that it helps a lot and this is a problem with applying the results of our work. It helps a lot to have a good intuitive feel of probability like what one in three feels like, just a sense of how often that is. And superforecasters tend to be people who are able to distinguish between smaller gradations of probability.
I think in general people that don’t think about this stuff very much, they have kind of three probabilities: definitely going to happen, might happen, and will never happen. And there’s no finer grain distinction there. Whereas, I think superforecasters often feel like they can distinguish between 1% or 2% probabilities, the difference between 50% and 52%.
The sense of what that means I think is a big thing. If we’re going to tell a policymaker there’s a 52% chance of something happening, a big part of the problem is that policymakers have no idea what that means. They’re like, “Well, will it happen or won’t it? Oh, what do I do with that number?” Right? How is that different from 50%? And I
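The "seven times in the last 50 years" starting point is a base rate: occurrences divided by the number of comparable periods. A minimal sketch of that arithmetic follows; the function name and the second, shorter reference window are hypothetical and only show how the choice of reference class changes the starting estimate.

```python
def base_rate(occurrences, periods):
    """Fraction of past comparable periods in which the event occurred."""
    if periods <= 0:
        raise ValueError("periods must be positive")
    return occurrences / periods


# The example from the conversation: seven occurrences in the last 50 years.
long_window = base_rate(7, 50)
print(f"50-year window: {long_window:.0%}")  # 14%

# Hypothetical alternative reference class: suppose three of those seven
# occurrences fell in the most recent 10 years. Same event, different base rate.
recent_window = base_rate(3, 10)
print(f"10-year window: {recent_window:.0%}")  # 30%
```

The calculation itself is trivial; the judgment lies in deciding which past situations count as similar enough to belong in the denominator.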
Lucas Perry: All right, so a few things I’m interested in here. The first is I’m interested in what you have to say about what it means and how one learns how probabilities work. If you were to explain to policymakers or other persons who are interested who are not familiar with working with probabilities a ton, how one can get a better understanding of them and what that looks like. I feel like that would be interesting and helpful. And then the other thing that I’m sort of interested in getting a better understanding of is most of what is going on here seems like a lot of system two thinking, but I also would suspect and guess that many of the top superforecasters have very excellent, finely tuned system ones.
Robert de Neufville: Yeah.
Lucas Perry: Curious if you have any thoughts about these two things.
Robert de Neufville: I think that’s true. I mean, I don’t know exactly what counts as system one in the cognitive psych sense, but I do think that there is a feel that you get. It’s like practicing a jump shot or something. I’m sure Steph Curry, not that I’m Steph Curry in forecasting, but I’m sure Steph Curry, when he takes a shot, isn’t thinking about it at the time. He’s just practiced a lot. And by the same token, if you’ve done a lot of forecasting and thought about it and have a good feel for it, you may be able to look at something and think, “Oh, here’s a reasonable forecast. Here’s not a reasonable forecast.” I had that sense recently when looking at FiveThirtyEight tracking COVID predictions for a bunch of subject matter experts, and they’re honestly kind of doing terribly. And part of it is that some of the probabilities are just not plausible. And that’s immediately obvious to me. And I think to other forecasters who have spent a lot of time thinking about it.
So I do think that without even having to do a lot of calculations or a lot of analysis, often I have a sense of what’s plausible, what’s in the right range just because of practice. When I’m watching a sporting event and I’m stressed about my team winning, for years before I started doing this, I would habitually calculate the probability of winning. It’s a neurotic thing. It’s like imposing some kind of control. I think I’m doing the same thing with COVID, right? I’m calculating probabilities all the time to make myself feel more in control. But that actually was pretty good practice for getting a sense of it.
I don’t really have the answer to how to teach that to other people except potentially the practice of trying to forecast and seeing what happens and when you’re right and when you’re wrong. Good Judgment does have some training materials that improved forecasting for people validated by research. They involve things about thinking about the base rate of things happening in the past and essentially going through sort of system two approaches, and I think that kind of thing can also really help people get a sense for it. But like anything else, there’s an element of practice. You can get better or worse at it. Well hopefully you get better.
Lucas Perry: So a risk that is 2% likely is two times more likely than a 1% chance risk. How do those feel differently to you than to me or a policymaker who doesn’t work with probabilities a ton?
Robert de Neufville: Well I don’t entirely know. I don’t entirely know what they feel like to someone else. I think I do a lot of one time in 50 that’s what 2% is and one time in a hundred that’s what 1% is. The forecasting platform we use, we only work in integer probabilities. So if it goes below half a percent chance, I’d round down to zero. And honestly I think it’s tricky to get accurate forecasting with low probability events for a bunch of reasons or even to know if you’re doing a good job because you have to do so many of them. I think about fractions often and have a sense of what something happening two times in seven might feel like in a way.
Lucas Perry: So you’ve made this point here that superforecasters are often better at making predictions than subject matter experts. Can you unpack this a little bit more and explain how big the difference is? You recently just mentioned the COVID-19 virologists.
Robert de Neufville: Virologists, infectious disease experts, I don’t know all of them, but people whose expertise I really admire, who know the most about what’s going on and to whom I would turn in trying to make a forecast about some of these questions. And it’s not really fair because these are people often who have talked to FiveThirtyEight for 10 minutes and produced a forecast. They’re very busy doing other things, although some of them are doing modeling and you would think that they would have thought about some of these probabilities in advance. But one thing that really stands out when you look at those is they’ll give a 5% or 10% chance of something happening, which to me is virtually impossible. And I don’t think it’s their better knowledge of virology that makes them think it’s more likely. I think it’s not having thought a lot about what 5% or 10% means. They think it’s not very likely and they assign it 5% or 10%, which sounds like a low number. That’s my guess. I don’t really know what they’re doing.
Lucas Perry: What’s an example of that?
Robert de Neufville: Recently there were questions about how many tests would be positive by a certain date, and they assigned a real chance, like a 5% or 10%, I don’t remember exactly the numbers, but way higher than I thought it would be for there being below a certain number of tests. And the problem with that was it would have meant essentially that all of a sudden the number of tests coming back positive every day would drop off a cliff. Go from, I don’t know how many positive tests there are a day, 27,000 in the US, and all of a sudden that would drop to like 2000 or 3000. And we’re talking about forecasting like a week ahead. So really a short timeline. It just was never plausible to me that all of a sudden tests would stop turning positive. There’s no indication that that’s about to happen. There’s no reason why that would suddenly shift.
I mean maybe I can always say maybe there’s something that a virologist knows that I don’t, but I have been reading what they’re saying. So how would they think that it would go from 25,000 a day to 2000 a day over the next six days? I’m going to assign that basically a 0% chance.
Another thing that’s really striking, and I think this is generally true and it’s true to some extent of superforecasters, so we’ve had a little bit of an argument on our superforecasting platform, people are terrible at thinking about exponential growth. They really are. They really underpredicted the number of cases and deaths even again like a week or two in advance because it was orders of magnitude higher than the number at the beginning of the week. But a computer, if they’d had like an algorithm to fit an exponential curve, would have had no problem doing it. Basically, I think that’s what the good forecasters did is we fit an exponential curve and said, “I don’t even need to know many of the details over the course of a week. My outside knowledge of the progression of the disease and vaccines or whatever isn’t going to make much difference.”
And like I said it’s often hard to beat a simple algorithm, but the virologists and infectious disease experts weren’t applying that simple algorithm, and it’s fair to say, well maybe some public health intervention will change the curve or something like that. But I think they were assigning way too high a probability to the exponential trend stopping. I just think it’s a failure to imagine. You know maybe with the Trump administration it’s motivated reasoning on this score. They kept saying it’s fine. There aren’t very many deaths yet. But it’s easy for someone to project the trajectory a little bit further in the future and say, “Wow, there are going to be.” So I think that’s actually been a major policy issue too is people can’t believe the exponential growth.
Lucas Perry: There’s this tension between not trying to panic everyone in the country or you’re unsure if this is the kind of thing that’s an exponential or you just don’t really intuit how exponentials work. For the longest time, our federal government were like, “Oh, it’s just a person. There’s just like one or two people. They’re just going to get better and that will just go away or something.” What’s your perspective on that? Is that just trying to assuage the populace while they try to figure out what to do or do you think that they actually just don’t understand how exponentials work?
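The kind of simple algorithm mentioned here can be sketched as a least-squares line fitted to the logarithm of daily counts and then extended forward. This is only an illustration of the idea, not the method any forecaster actually used; the case counts are invented, and the function names are hypothetical.

```python
import math


def fit_exponential(counts):
    """Least-squares fit of log(count) = a + b * day_index; returns (a, b).

    Assumes every count is positive and there are at least two data points.
    """
    n = len(counts)
    xs = list(range(n))
    ys = [math.log(c) for c in counts]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return intercept, slope


def project(counts, days_ahead):
    """Extrapolate the fitted exponential curve days_ahead past the last data point."""
    a, b = fit_exponential(counts)
    return math.exp(a + b * (len(counts) - 1 + days_ahead))


# Invented cumulative case counts that roughly double every two days.
cases = [100, 141, 200, 283, 400, 566, 800]
a, b = fit_exponential(cases)
doubling_time = math.log(2) / b

print(f"estimated doubling time: {doubling_time:.1f} days")
print(f"projected count in 7 days: {project(cases, 7):.0f}")
```

With a doubling time of about two days, the one-week projection is more than ten times the latest count, which is the jump that intuition tends to underestimate.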
Robert de Neufville: I’m not confident with my theory of mind with people in power. I think one element is this idea that we need to avoid panic and I think that’s probably, they believe in good faith, that’s a thing that we need to do. I am not necessarily an expert on the role of panic in crises, but I think that that’s overblown personally. We have this image of, hey, in the movies, if there’s a disaster, all of a sudden everyone’s looting and killing each other and stuff, and we think that’s what’s going to happen. But actually often in disasters people really pull together and if anything have a stronger sense of community and help their neighbors rather than immediately go and try to steal their supplies. We did see some people fighting over toilet paper on news rolls and there are always people like that, but even this idea that people were hoarding toilet paper, I don’t even think that’s the explanation for why it was out of the stores.
If you tell everyone in the country they need two to three weeks of toilet paper right now today, yeah, of course they’re going to buy it off the shelf. That’s actually just what they need to buy. I haven’t seen a lot of panic. And I honestly am someone, if I had been an advisor to the administration, I would have said something along the lines of “It’s better to give people accurate information so we can face it squarely than to try to sugarcoat it.”
But I also think that there was a hope that if we pretended things weren’t about to happen or that maybe they would just go away, I think that that was misguided. There seems to be some idea that you could reopen the economy and people would just die but the economy would end up being fine. I don’t think that would be worth it anyway. Even if you don’t shut down, the economy’s going to be disrupted by what’s happening. So I think there are a bunch of different motivations for why governments weren’t honest or weren’t dealing squarely with this. It’s hard to know what’s not honesty and what is just genuine confusion.
Lucas Perry: So what organizations exist that are focused on superforecasting? Where or what are the community hubs and prediction aggregation mechanisms for superforecasters?
Robert de Neufville: So originally in the IARPA Forecasting Tournament, there were a bunch of different competing teams, and one of them was run by a group called Good Judgment. And that team ended up doing so well. They ended up basically taking over the later years of the tournament and it became the Good Judgment project. There was then a spinoff. Phil Tetlock and others who were involved with that spun off into something called Good Judgment Incorporated. That is the group that I work with and a lot of the superforecasters that were identified in that original tournament continue to work with Good Judgment.
We do some public forecasting and I try to find private clients interested in our forecasts. It’s really a side gig for me and part of the reason I do it is that it’s really interesting. It gives me an opportunity to think about things in a way and I feel like I’m much better up on certain issues because I’ve thought about them as forecasting questions. So there’s Good Judgment Inc. and they also have something called the Good Judgment Open. They have an open platform where you can forecast the kinds of questions we do. I should say that we have a forecasting platform. They come up with forecastable questions, but forecastable means that there are relatively clear resolution criteria.
But also you would be interested in knowing the answer. It wouldn’t be just some picky trivial answer. They’ll have a set resolution date so you know that if you’re forecasting something happening, it has to happen by a certain date. So it’s all very well-defined. And coming up with those questions is a little bit of its own skill. It’s pretty hard to do. So Good Judgment will do that. And they put it on a platform where then as a group we discuss the questions and give our probability estimates.
We operate to some extent in teams and they found there’s some evidence that teams of forecasters, at least good forecasters, can do a little bit better than people on their own. I find it very valuable because other forecasters do a lot of research and they critique my own ideas. There are concerns about groupthink, but I think that we’re able to avoid those. I can talk about why if you want. Then there’s also this public platform called Good Judgment Open where they use the same kind of questions and anyone can participate. And they’ve actually identified some new superforecasters who participated on this public platform, people who did exceptionally well, and then they invited them to work with the company as well. There are others. I know a couple of superforecasters who are spinning off their own group. They made an app. I think it’s called Maybe, where you can do your own forecasting and maybe come up with your own questions. And that’s a neat app. There is Metaculus, which certainly tries to apply the same principles. And I know some superforecasters who forecast on Metaculus. I’ve looked at it a little bit, but I just haven’t had time because forecasting takes a fair amount of time. And then there are always prediction markets and things like that. There are a number of other things, I think, that try to apply the same principles. I don’t know enough about the space to know of all of the other platforms and markets that exist.
Lucas Perry: For some more information on the actual act of forecasting that will be put onto these websites, can you take us through something which you have forecasted recently that ended up being true? And tell us how much time it took you to think about it? And what your actual thinking was on it? And how many variables and things you considered?
Robert de Neufville: Yeah, I mean it varies widely. And to some extent it varies widely on the basis of how many times have I forecasted something similar. So sometimes we’ll forecast the change in interest rates, the Fed moves. That’s something that’s obviously of a lot of interest to people in finance. And at this point, I’ve looked at that kind of thing enough times that I have set ideas about what would make that likely or not likely to happen.
But some questions are much harder. We’ve had questions about mortality in certain age groups in different districts in England and I didn’t know anything about that. And all sorts of things come into play. Is the flu season likely to be bad? What’s the chance the flu season will be bad? Is there a general trend among people who are dying of complications from diabetes? Does poverty matter? How much would Brexit affect mortality chances? Although a lot of what I did was just look at past data and project trends. Just by projecting trends you can get a long way towards an accurate forecast in a lot of circumstances.
Lucas Perry: When such a forecast is made and added to these websites and the question for the thing which is being predicted resolves, what are the ways in which the websites aggregate these predictions? Or are we at the stage of them often being put to use? Or is the utility of these websites currently primarily honing the epistemic acuity of the forecasters?
Robert de Neufville: There are a couple of things. Like I hope that my own personal forecasts are potentially pretty accurate. But when we work together on a platform, we will essentially produce an aggregate, which is, roughly speaking, the median prediction. There’s some proprietary elements to it. They extremize it a little bit, I think, because once you aggregate it kind of blurs things towards the middle. They maybe weight certain forecasts and more recent forecasts differently. I don’t know the details of it. But you can improve accuracy not just by taking the median of our forecast or in a prediction market, but doing a little algorithmic tweaking they found they can improve accuracy a little bit. That’s sort of what happens with our output.
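To make the aggregation idea concrete, here is a minimal Python sketch. It is not Good Judgment's actual method, which is described above as proprietary. It simply takes the median of a set of forecasts and then extremizes it with a power transform, which is one commonly used choice. The function names, the exponent `a`, and the sample forecasts are all assumptions made for illustration.

```python
from statistics import median

def extremize(p: float, a: float = 2.0) -> float:
    """Push a probability away from 0.5.

    a > 1 moves p toward 0 or 1; a == 1 leaves it unchanged.
    The value a = 2.0 is an arbitrary illustrative choice.
    """
    return p**a / (p**a + (1 - p) ** a)

def aggregate(forecasts: list[float], a: float = 2.0) -> float:
    """Median of individual forecasts, then extremized."""
    return extremize(median(forecasts), a)

# Hypothetical forecasts from five forecasters on one yes/no question.
forecasts = [0.60, 0.70, 0.75, 0.80, 0.65]

print("median:", median(forecasts))
print("extremized aggregate:", round(aggregate(forecasts), 3))
```

With these made-up numbers the median is 0.70 and the extremized aggregate lands further from 0.5, which is the "un-blurring" effect described above. Weighting by recency or track record would be a further step that this sketch leaves out.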
And then as far as how people use it, I’m afraid not very well. There are people who are interested in Good Judgment’s forecasts and who pay them to produce forecasts. But it’s not clear to me what decision makers do with it or if they know what to do.
I think a big problem in selling forecasting is that people don’t know what to do with a 78% chance of this, or let’s say a 2% chance of a pandemic in a given year, I’m just making that up. But somewhere in that ballpark, what does that mean about how you should prepare? I think that people don’t know how to work with that. So it’s not clear to me that our forecasts are necessarily affecting policy. Although it’s the kind of thing that gets written up in the news and who knows how much that affects people’s opinions, or they talk about it at Davos and maybe those people go back and they change what they’re doing.
Certain areas, I think people in finance know how to work with probabilities a little bit better. But they also have models that are fairly good at projecting certain types of things, so they’re already doing a reasonable job, I think.
I wish it were used better. If I were the advisor to a president, I would say you should create a predictive intelligence unit using superforecasters. Maybe give them access to some classified information, but even using open source information, have them predict probabilities of certain kinds of things and then develop a system for using that in your decision making. But I think we’re a fair ways away from that. I don’t know of any interest in that in the current administration.
Lucas Perry: One obvious leverage point for that would be if you really trusted this group of superforecasters. And the key point for that is just simply how accurate they are. So just generally, how accurate is superforecasting currently? If we took the top 100 superforecasters in the world, how accurate are they over history?
Robert de Neufville: We do keep score, right? But it depends a lot on the difficulty of the question that you’re asking. If you ask me whether the sun will come up tomorrow, yeah, I’m very accurate. If you asked me to predict a random number generator, one to 100, I’m not very accurate. And it’s often hard to know with a given question how hard it is to forecast.
I have what’s called a Brier score. Essentially a mathematical way of correlating your forecast, the probabilities you give with the outcomes. A lower Brier score essentially is a better fit. I can tell you what my Brier score was on the questions I forecasted in the last year. And I can tell you that it’s better than a lot of other people’s Brier scores. And that’s the way you know I’m doing a good job. But it’s hard to say how accurate that is in some absolute sense.
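As a rough illustration of the Brier score just described, here is a minimal Python sketch for yes/no questions. It uses the common form that averages the squared difference between the stated probability and the outcome (1 if the event happened, 0 if not), so scores run from 0 (perfect) to 1 (worst). Some forecasting tournaments use a variant that sums over every answer option and so runs from 0 to 2; which variant a given platform uses is not stated here. The function name and sample numbers are assumptions.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between probabilities and 0/1 outcomes.

    Lower is better: 0.0 is perfect, 0.25 is what always saying
    50% earns, and 1.0 is being confidently wrong every time.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical track record: three questions, the first resolved yes.
my_forecasts = [0.9, 0.2, 0.7]
what_happened = [1, 0, 0]

print(round(brier_score(my_forecasts, what_happened), 3))
```

Because the score depends on which questions were asked, it is most meaningful when compared against other forecasters who answered the same questions, which matches the point made above about relative rather than absolute accuracy.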
It’s like saying how good are NBA players at taking jump shots. It depends where they’re shooting from. That said, I think broadly speaking, we are the most accurate. So far, superforecasters have had a number of challenges. And I mean I’m proud of this. We pretty much crushed all comers. They’ve tried to bring artificial intelligence into it. We’re still, I think as far as I know, the gold standard of forecasting. But we’re not prophets by any means. Accuracy for us is saying there’s a 15% chance of this thing in politics happening. And then when we do that over a bunch of things, yeah, 15% of them end up happening. It is not saying this specific scenario will definitely come to pass. We’re not prophets. Getting well calibrated probabilities over a large number of forecasts is the best that we can do, I think, right now and probably in the near future for these complex political social questions.
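The calibration idea above (15% forecasts coming true about 15% of the time) can be checked with a simple table. The sketch below groups forecasts by the probability that was stated and compares it with how often those events actually happened. The function name and the data are hypothetical, chosen only to show the shape of the check.

```python
from collections import defaultdict

def calibration_table(forecasts: list[float], outcomes: list[int]) -> dict[float, float]:
    """Map each stated probability to the observed frequency of 'yes'."""
    buckets: dict[float, list[int]] = defaultdict(list)
    for p, happened in zip(forecasts, outcomes):
        buckets[p].append(happened)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}

# Hypothetical record: twenty 15% forecasts (3 came true)
# and ten 80% forecasts (8 came true).
forecasts = [0.15] * 20 + [0.80] * 10
outcomes = [1] * 3 + [0] * 17 + [1] * 8 + [0] * 2

for stated, observed in calibration_table(forecasts, outcomes).items():
    print(f"said {stated:.0%}, happened {observed:.0%} of the time")
```

A well calibrated forecaster shows observed frequencies close to the stated probabilities in every row. With few forecasts per row the observed frequency is noisy, so a real check needs many resolved questions.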
Lucas Perry: Would it be skillful to have some sort of standardized group of expert forecasters rank the difficulty of questions, which then you would be able to better evaluate and construct a Brier score for persons?
Robert de Neufville: It’s an interesting question. I think I could probably tell you, I’m sure other forecasters could tell you which questions are relatively easier or harder to predict. Things where there’s a clear trend and there’s no good reason for it changing are relatively easy to predict. Things where small differences could make it tip into a lot of different end states are hard to predict. And I can sort of have a sense initially what those would be.
I don’t know what the advantage would be of ranking questions like that and then trying to do some weighted adjustment. I mean maybe you could. But the best way that I know of to really evaluate forecasting skill is to compare it with other forecasters. I’d say it’s kind of a baseline. What do other good forecasters come up with and what do average forecasters come up with? And can you beat prediction markets? I think that’s the best way of evaluating relative forecasting ability. But I’m not sure. It’s possible that some kind of weighting would be useful in some context. I hadn’t really thought about it.
Lucas Perry: All right, so you work both as a superforecaster, as we’ve been talking about, but you also have a position at the Global Catastrophic Risk Institute. Can you provide a little bit of explanation for how superforecasting and existential and global catastrophic risk analysis are complementary?
Robert de Neufville: What we produce at GCRI, a big part of our product is academic research. And there are a lot of differences. If I say there’s a 10% chance of something happening on a forecasting platform, I have an argument for that. I can try to convince you that my rationale is good. But it’s not the kind of argument that you would make in an academic paper. It wouldn’t convince people it was 100% right. My warrant for saying that on the forecasting platform is I have a track record. I’m good at figuring out what the correct argument is or have been in the past, but producing an academic paper is a whole different thing.
There’s some of the same skills, but we’re trying to produce a somewhat different output. What superforecasters say is an input in writing papers about catastrophic risk or existential risk. We’ll use what superforecasters think as a piece of data. That said, superforecasters are validated at doing well at a certain category of political, social, and economic questions. And over a certain timeline, we know that we outperform others up to like maybe two years.
We don’t really know if we can do meaningful forecasting 10 years out. That hasn’t been validated. You can see why that would be difficult to do. You would have to have a long experiment to even figure that out. And it’s often hard to figure out what the right questions to ask about 2030 would be. I generally think that the same techniques we use would be useful for forecasting 10 years out, but we don’t even know that. And so a lot of the things that I would look at in terms of global catastrophic risk would be things that might happen at some distant point in the future. Not just what’s the risk that there will be a nuclear war in 2020, but also over the next 50 years. It’s a somewhat different thing to do.
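To see why a one-year question and a 50-year question are different things, it can help to convert an annual probability into a probability over a longer horizon. The sketch below does that under two strong simplifying assumptions that are not claims from the conversation: the annual probability stays constant, and each year is independent of the others. The 2% figure is purely illustrative.

```python
def cumulative_risk(annual_probability: float, years: int) -> float:
    """Chance of at least one occurrence over `years`.

    Assumes a constant annual probability and independence between
    years, which real risks rarely satisfy exactly.
    """
    return 1 - (1 - annual_probability) ** years

# Illustrative 2% annual probability over several horizons.
for horizon in (1, 10, 50):
    print(horizon, "years:", round(cumulative_risk(0.02, horizon), 3))
```

Under these assumptions a risk that looks small in any single year becomes more likely than not over 50 years, which is part of why long-horizon risk analysis asks different questions than short-horizon forecasting.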
They’re complementary. They both involve some estimation of risk and they use some of the same techniques. But the longer term aspect … The fact that as I think I said, one of the best ways superforecasters do well is that they use the past as a guide to the future. A good rule of thumb is that the status quo is likely to be the same. There’s a certain inertia. Things are likely to be similar in a lot of ways to the past. I don’t know if that’s necessarily very useful for predicting rare and unprecedented events. There is no precedent for an artificial intelligence catastrophe, so what’s the base rate of that happening? It’s never happened. I can use some of the same techniques, but it’s a little bit of a different kind of thing.
Lucas Perry: Two people are coming to my mind of late. One is Ray Kurzweil, who has made a lot of longterm technological predictions about things that have not happened in the past. And then also curious to know if you’ve read The Precipice: Existential Risk and the Future of Humanity by Toby Ord. Toby makes specific predictions about the likelihood of existential and global catastrophic risks in that book. I’m curious if you have any perspective or opinion or anything to add on either of these two predictors or their predictions?
Robert de Neufville: Yeah, I’ve read some good papers by Toby Ord. I haven’t had a chance to read the book yet, so I can’t really comment on that. I really appreciate Ray Kurzweil. And one of the things he does that I like is that he holds himself accountable. He’s looked back and said, how accurate are my predictions? Did this come true or did that not come true? I think that is a basic hygiene point of forecasting. You have to hold yourself accountable and you can’t just go back and say, “Look, I was right,” and rationalize whatever somewhat off forecasts you’ve made.
That said, when I read Kurzweil, I’m skeptical, maybe that’s my own inability to handle exponential change. When I look at his predictions for certain years, I think he does a different set of predictions for seven year periods. I thought, “Well, he’s actually seven years ahead.” That’s pretty good actually, if you’re predicting what things are going to be like in 2020, but you just think it’s going to be 2013. Maybe they get some credit for that. But I think that he is too aggressive and optimistic about the pace of change. Obviously exponential change can happen quickly.
But I also think another rule of thumb is that things take a long time to go through beta. There’s the planning fallacy. People always think that projects are going to take less time than they actually do. And even when you try to compensate for the planning fallacy and double the amount of time, it still takes twice as much time as you come up with. I tend to think Kurzweil sees things happening sooner than they will. He’s a little bit of a techno optimist, obviously. But I haven’t gone back and looked at all of his self evaluation. He scores himself pretty well.
Lucas Perry: So we’ve spoken a bit about the different websites. And what are they technically called, what is the difference between a prediction market and … I think Metaculus calls itself a massive online prediction solicitation and aggregation engine, which is not a prediction market. What are the differences here and how’s the language around these platforms used?
Robert de Neufville: Yeah, so I don’t necessarily know all the different distinctions and categories someone would make. I think a prediction market particularly is where you have some set of funds, some kind of real or fantasy money. We used one market in the Good Judgment Project. Our money was called Inkles and we could spend that money. And essentially, they traded probabilities like you would trade a share. So if there was a 30% chance of something happening on the market, that’s like a price of 30 cents. And you would buy that for 30 cents and then if people’s opinions about how likely that was changed and a lot of people bought it, then it could be bid up to a 50% chance of happening and that would be worth 50 cents.
So if I correctly realize that something … that the market says is a 30% chance of happening, if I correctly realized that, that’s more likely, I would buy shares of that. And then eventually either other people would realize it, too, or it would happen. I should say that when things happened, then you’d get a dollar, then it’s suddenly it’s 100% chance of happening.
So if you recognize that something had a higher percent chance of happening than the market was valuing it at, you could buy a share of that and then you would make money. That basically functions like a stock market, except literally what you’re trading is directly the probability that a question will resolve yes or no.
The stock market’s supposed to be really efficient, and I think in some ways it is. I think prediction markets are somewhat useful. Big problem with prediction markets is that they’re not liquid enough, which is to say that a stock market, there’s so much money going around and people are really just on it to make money, that it’s hard to manipulate the prices.
There’s not plenty of liquidity on the prediction markets that I’ve been a part of. Like for the one on the Good Judgment Project, for example, sometimes there’d be something that would say there was like a 95% chance of it happening on the prediction market. In fact, there would be like a 99.9% chance of it happening. But I wouldn’t buy that share, even though I knew it was undervalued, because the return on investment wasn’t as high as it was on some other questions. So it would languish at this inaccurate probability, because there just wasn’t enough money to chase all the good investments.
So that’s one problem you can have in a prediction market. Another problem you can have … I see it happen with PredictIt, I think. They used to be the IO Exchange prediction market. People would try to manipulate the market for some advertising reason, basically.
Say you were working on a candidate’s campaign and you wanted to make it look like they were a serious contender. It was a cheap investment: you put a lot of money in the prediction market and you boost their chances, but that’s not really boosting their chances. That’s just market manipulation. You can’t really do that with the whole stock market, but because prediction markets aren’t well capitalized, you can do that there.
And then I really enjoy PredictIt. PredictIt’s one of the prediction markets that exists for political questions. They have some dispensation so that it doesn’t count as gambling in the U.S. And it’s for research purposes: there is some research involved with PredictIt. But they have a lot of fees and they use their fees to pay for the people who run the market. And it’s expensive. But the fees mean that the prices are very sticky and it’s actually pretty hard to make money. Probabilities have to be really out of whack before you can make enough money to cover your fees.
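The share mechanics and the return-on-investment point can be put into numbers. The sketch below treats a share as paying 1 unit if the event happens and 0 otherwise, so its price reads directly as a probability. It compares the 95-cent share believed to be 99.9% likely with a 30-cent share; the 50% belief attached to the 30-cent share is an assumption added for illustration, as are the function name and the decision to ignore fees.

```python
def expected_return(price: float, believed_probability: float) -> float:
    """Expected profit per unit of money spent on a yes share.

    The share pays 1.0 if the event happens and 0.0 otherwise,
    so the expected payout equals the believed probability.
    Fees are ignored in this sketch.
    """
    expected_payout = believed_probability * 1.0
    return (expected_payout - price) / price

# Share priced at 95 cents that the trader believes is 99.9% likely.
near_certain = expected_return(price=0.95, believed_probability=0.999)

# Share priced at 30 cents that the trader believes is 50% likely
# (the 50% belief is a hypothetical value).
mispriced = expected_return(price=0.30, believed_probability=0.50)

print(f"95-cent share: {near_certain:.1%} expected return")
print(f"30-cent share: {mispriced:.1%} expected return")
```

Both shares are underpriced relative to the trader's beliefs, but a trader with limited funds puts money into the second one first. In a thinly capitalized market that leaves the 95-cent price uncorrected, which is the inaccuracy described above.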
So things like that make these markets not as accurate. I also think that although we’ve all heard about the wisdom of the crowds, and broadly speaking, crowds might do better than just a random person. They can also do a lot of herding behavior that good forecasters wouldn’t do. And sometimes the crowds overreact to things. And I don’t always think the probabilities that prediction markets come up with are very good.
Lucas Perry: All right. Moving along here a bit. Continuing the relationship of superforecasting with global catastrophic and existential risk. How narrowly do you think that we can reduce the error range for superforecasts on low probability events like global catastrophic risks and existential risks? If a group of forecasters settled on a point estimate of 2% chance for some kind of global catastrophic or existential risk, but with an error range of like 1%, that dramatically changes how useful the prediction is, because of its major effects on risk. How accurate do you think we can get and how much do you think we can squish the probability range?
Robert de Neufville: That’s a really hard question. When we produce forecasts, I don’t think there’s necessarily clear error bars built in. One thing that Good Judgment will do, is it will show where forecasters all agreed the probability is 2% and then it will show if there’s actually a wide variation. Some think it’s 0%, some think it’s 4% or something like that. And that maybe tells you something. And if we had a lot of very similar forecasts, maybe you could look back and say, we tend to have an error of this much. But for the kinds of questions we look at with catastrophic risk, it might really be hard to have a large enough “n”. Hopefully it’s hard to have a large “n” where you could really compute an error range. If our aggregate spits out a probability of 2%, it’s difficult to know in advance for a somewhat unique question how far off we could be.
I don’t spend a lot of time thinking about frequentist or Bayesian interpretations of probability or counterfactuals or whatever. But at some point, if I say there’s a 2% probability of something and then it happens, I mean it’s hard to know what my probability meant. Maybe we live in a deterministic universe and that was 100% going to happen and I simply failed to see the signs of it. I think that to some extent, what kind of probabilities you assign things depends on the amount of information you get.
Often we might say that was a reasonable probability to assign to something because we couldn’t get much better information. Given the information we had, that was our best estimate of the probability. But it might always be possible to know with more confidence if we got better information. So I guess one thing I would say is if you want to reduce the error on our forecasts, it would help to have better information about the world.
And that’s to some extent where what I do with GCRI comes in. We’re trying to figure out how to produce better estimates. And that requires research. It requires thinking about these problems in a systematic way to try to decompose them into different parts and figure out what we can look at in the past and use to inform our probabilities. You can always get better information and produce more accurate probabilities, I think.
The best thing to do would be to think about these issues more carefully. Obviously, it’s a field. Catastrophic risk is something that people study, but it’s not the most mainstream field. There’s a lot of research that needs to be done. There’s a lot of low hanging fruit, work that could easily be done applying research done in other fields to catastrophic risk issues. But there just aren’t enough researchers and there isn’t enough funding to do all the work that we should do.
So my answer would be, we need to do better research. We need to study these questions more closely. That’s how we get to better probability estimates.
Lucas Perry: So if we have something like a global catastrophic or existential risk, and say a forecaster says that there’s a less than 1% chance that, that thing is likely to occur. And if this less than 1% likely thing happens in the world, how does that update our thinking about what the actual likelihood of that risk was? Given this more meta point that you glossed over about how if the universe is deterministic, then the probability of that thing was actually more like 100%. And the information existed somewhere, we just didn’t have access to that information or something. Can you add a little bit of commentary here about what these risks mean?
Robert de Neufville: I guess I don’t think it’s that important when forecasting whether I have a strong opinion about whether or not we live in a single deterministic universe where outcomes are in some sense all sort of baked in in the future, and if only we could know everything, then we would know with 100% certainty everything that was going to happen. Or whether there is some fundamental randomness, or maybe we live in a multiverse where all these different outcomes are happening, and you could say that in 30% of the universes in this multiverse, this outcome comes true. I don’t think that really matters for the most part. I do think as a practical question, we make forecasts on the basis of the best information we have; that’s all you can do. But there are some times you look back and say, “Well, I missed this. I should’ve seen this thing.” I didn’t think that Donald Trump would win the 2016 election. That’s literally my worst Brier score ever. I’m not alone in that. And I comfort myself by saying there were actually genuinely small differences that made a huge impact.
But there are other forecasters who saw it better than I did. Nate Silver didn’t think that Trump was a lock, but he thought it was more likely and he thought it was more likely for the right reasons. That you would get this correlated polling error in a certain set of states that would hand Trump the electoral college. So in retrospect, I think, in that case I should’ve seen something like what Nate Silver did. Now I don’t think in practice it’s possible to know enough about an election to know in advance who’s going to win.
I think we still have to use the tools that we have, which are things like polling. In complex situations, there’s always stuff that I missed when I make a mistake and I can look back and say I should have done a better job figuring that stuff out. I do think though, with the kinds of questions we forecast, there’s a certain irreducible, I don’t want to say randomness because I’m not taking a position on whether the universe is deterministic, but irreducible uncertainty about what we’re realistically able to know and we have to base our forecasts on the information that’s possible to get. I don’t think metaphysical interpretation is that important to figuring out these questions. Maybe it comes up a little bit more with unprecedented one-off events. Even then I think you’re still trying to use the same information to estimate probabilities.
Lucas Perry: Yeah, that makes sense. There’s only the set of information that you have access to.
Robert de Neufville: Something actually occurs to me. One of the things that superforecasters are proud of is that we beat these intelligence analysts that had access to classified information. I mean we’re doing our research on Google, right? Or maybe occasionally we’ll write a government official and get a FOIA request or something, but we’re using open source intelligence. I think it would probably help if we had access to more information that would inform our forecasts, but sometimes more information actually hurts you.
People have talked about a classified information bias: if you have secret information that other people don’t have, you are likely to think that it is more valuable and useful than it actually is and you overweight the classified information. If you have that secret information, I don’t know if it’s an ego thing, you want to have a different forecast than the people who don’t have access to it. It makes you special. You have to be a little bit careful. More information isn’t always better. Sometimes the easy to find information is actually really dispositive and is enough. And if you search for more information, you can find stuff that is irrelevant to your forecast, but think that it is relevant.
Lucas Perry: So if there’s some sort of risk and the risk occurs, after the fact how does one update what the probability was more like?
Robert de Neufville: It depends a little bit on the context. Say you want to evaluate my prediction. I thought there was a 30% chance that the original Brexit vote would be to leave the EU. That actually was more accurate than some other people, but I didn’t think it was likely. Now in hindsight, should I have said 100%? Somebody might argue that I should have, that if you’d really been paying attention, you would have known 100%.
Lucas Perry: But like how do we know it wasn’t 5% and we live in a rare world?
Robert de Neufville: We don’t. You basically can infer almost nothing from an n of 1. Like if I say there’s a 1% chance of something happening and it happens, you can be suspicious that I don’t know what I’m talking about. Even from that n of 1, but there’s also a chance that there was a 1% chance that it happened and that was the 1 time in a 100. To some extent that could be my defense of my prediction that Hillary was going to win. I should talk about my failures. The night before, I thought there was a 97% chance that Hillary would win the election and that’s terrible. And I think that that was a bad forecast in hindsight. But I will say that typically when I’ve said there’s a 97% chance of something happening, they have happened.
I’ve made more than 30-some predictions that things are going to be 97% likely and that’s the only one that’s been wrong. So maybe I’m actually well calibrated. Maybe that was the 3% thing that happened. You can only really judge over a body of predictions and if somebody is always saying there’s a 1% chance of things happening and they always happen, then that’s not a good forecaster. But that’s a little bit of a problem when you’re looking at really rare, unprecedented events. It’s hard to know how well someone does at that because you don’t have an n of hopefully more than 1. It is difficult to assess those things.
Now we’re in the middle of a pandemic and I think that the fact that this pandemic happened maybe should update our beliefs about how likely pandemics will be in the future. There was the Spanish flu and the Asian flu and this. And so now we have a little bit more information about the base rate at which these things happen. It’s a little bit difficult because 1918 is very different from 2020. The background rate of risk may be very different from what it was in 1918, so you want to try to take those factors into account, but each event does give us some information that we can use for estimating the risk in the future. You can do other things. A lot of what we do as good forecasters is inductive, right? But you can use deductive reasoning. You can, for example, with rare risks, decompose them into the steps that would have to happen for them to happen.
What systems have to fail for a nuclear war to start? Or what are the steps along the way to potentially an artificial intelligence catastrophe? And I might be able to estimate the probability of some of those steps more accurately than I estimate the whole thing. So that gives us some kind of analytic methods to estimate probabilities even without a real base rate of the thing itself happening.
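The claim that one miss among roughly thirty 97% forecasts is consistent with good calibration can be checked with a small binomial calculation. The sketch below assumes exactly 30 forecasts (the conversation only says "30-some") and assumes the forecasts are independent, which real political questions may not be. The function name is hypothetical.

```python
from math import comb

def prob_exactly_k_misses(n: int, p_hit: float, k: int) -> float:
    """Binomial probability of exactly k misses in n independent forecasts."""
    p_miss = 1 - p_hit
    return comb(n, k) * p_miss**k * p_hit ** (n - k)

n_forecasts = 30   # assumed count
p_hit = 0.97       # stated probability on each forecast

p_no_misses = prob_exactly_k_misses(n_forecasts, p_hit, 0)
p_at_least_one_miss = 1 - p_no_misses
expected_misses = n_forecasts * (1 - p_hit)

print("chance of at least one miss:", round(p_at_least_one_miss, 2))
print("expected number of misses:", round(expected_misses, 1))
```

Under these assumptions a perfectly calibrated forecaster would more likely than not have at least one miss in 30 such forecasts, so a single miss says very little on its own. It is the body of predictions that carries the evidence.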
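The decomposition approach can be sketched as multiplying the probabilities of the steps that all have to occur. In the sketch below each number is the probability of a step given that the earlier steps have already happened; the step names and values are invented for illustration and are not estimates of any real risk.

```python
from math import prod

def chain_probability(conditional_step_probs: list[float]) -> float:
    """Probability that every step in a single pathway occurs.

    Each entry must be conditional on all earlier steps having occurred.
    """
    return prod(conditional_step_probs)

# Hypothetical pathway with three steps that must all happen.
steps = {
    "triggering incident occurs": 0.10,
    "safeguard fails, given the incident": 0.20,
    "escalation follows, given the failure": 0.50,
}

print(round(chain_probability(list(steps.values())), 4))
```

A real analysis would usually have several pathways to the same outcome, whose probabilities would need to be combined, and the result is only as good as the step estimates. The appeal is that individual steps may have usable base rates even when the overall event has never happened.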
Lucas Perry: So related to actual policy work and doing things in the world. The thing that becomes skillful here seems to be to use these probabilities to do expected value calculations to try and estimate how much resources should be fed into mitigating certain kinds of risks.
Robert de Neufville: Yeah.
Lucas Perry: The probability of the thing happening requires a kind of forecasting and then also the value that is lost requires another kind of forecasting. What are your perspectives or opinions on superforecasting and expected value calculations and their use in decision making and hopefully someday more substantially in government decision making around risk?
Robert de Neufville: We were talking earlier about the inability of policymakers to understand probabilities. I think one issue is that a lot of times when people make decisions, they want to just say, “What’s going to happen? I’m going to plan for the single thing that’s going to happen.” But as a forecaster, I don’t know what’s going to happen. I might if I’m doing a good job, know there’s a certain percent chance that this will happen, a certain percent chance that that will happen. And in general, I think that policymakers need to make decisions over sort of the space of possible outcomes with the planning for contingencies. And I think that is a more complicated exercise than a lot of policymakers want to do. I mean I think it does happen, but it requires being able to hold in your mind all these contingencies and plan for them simultaneously. And I think that with expected value calculations to some extent, that’s what you have to do.
That gets very complicated very quickly. When we forecast questions, we might forecast some discrete fact about the world, like how many COVID deaths there will be by a certain date. And it’s neat that I’m good at that, but there’s a lot that that doesn’t tell you about the state of the world at that time. There’s a lot of information that would be valuable in making decisions. I don’t want to say infinite because it may be sort of technically wrong, but there is an essentially uncountable number of things you might want to know and you might not even know what the relevant questions to ask about a certain space are. So it’s always going to be somewhat difficult to get an expected value calculation because you can’t possibly forecast all the things that might determine the value of something.
I mean, this is a little bit of a philosophical critique of consequentialist kind of analyses of things too. Like if you ask if something is good or bad, it may have an endless chain of consequences rippling throughout future history and maybe it’s really a disaster now, but maybe it means that future Hitler isn’t born. How do you evaluate that? It might seem like a silly trivial point, but the fact is it may be really difficult to know enough about the consequences of your action to do an expected value calculation. So your expected value calculation may have to be kind of an approximation in a certain sense, given broad things we know are likely to happen. I still think expected value calculations are good. I just think there’s a lot of uncertainty in them and to some extent it’s probably irreducible. I think it’s always better to think about things clearly if you can. It’s not the only approach. You have to get buy-in from people and that makes a difference. But the more you can do accurate analysis about things, I think the better your decisions are likely to be.
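As a rough picture of what planning over the space of outcomes looks like, the sketch below compares the expected loss of two policies across the same set of scenarios. Every number here is made up: the 2% annual probability echoes the illustrative figure used earlier in the conversation, and the loss and preparation costs are arbitrary units. It also collapses the world into two scenarios, which is exactly the kind of approximation discussed above.

```python
def expected_loss(scenarios: list[tuple[float, float]]) -> float:
    """Sum of probability * loss over mutually exclusive scenarios."""
    total_probability = sum(p for p, _ in scenarios)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * loss for p, loss in scenarios)

P_EVENT = 0.02      # illustrative annual probability
PREP_COST = 5.0     # hypothetical cost of preparing, paid either way

# (probability, loss) pairs for each policy.
no_preparation = [(P_EVENT, 1000.0), (1 - P_EVENT, 0.0)]
with_preparation = [(P_EVENT, 200.0 + PREP_COST), (1 - P_EVENT, PREP_COST)]

print("no preparation:", round(expected_loss(no_preparation), 2))
print("with preparation:", round(expected_loss(with_preparation), 2))
```

With these invented numbers preparation has the lower expected loss even though in 98% of years the money appears to have bought nothing. The conclusion depends entirely on the inputs, so uncertainty in the probability or in the losses carries straight through to the comparison.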
Lucas Perry: How much faith or confidence do you have that the benefits of superforecasting and this kind of thought will increasingly be applied to critical government or non-governmental decision-making processes around risk?
Robert de Neufville: Not as much as I’d like. I think now that we know that people can do a better or worse job of predicting the future, we can use that information and it will eventually begin to be integrated into our governance. I think that that will help. But in general, you know my background’s in political science and political science is, I want to say, kind of discouraging. You learn that even under the best circumstances, outcomes of political struggles over decisions are not optimal. And you could imagine some kind of technocratic decision-making system, but even that ends up having its problems or the technocrats end up just lining their own pockets without even realizing they’re doing it or something. So I’m a little bit skeptical about it and right now what we’re seeing with the pandemic, I think we systematically underprepare for certain kinds of things, that there are reasons why it doesn’t help leaders very much to prepare for things that will never happen.
And with something like a public health crisis, the deliverable is for nothing to happen and if you succeed, it looks like all your money was wasted, but in fact you’ve actually prevented anything from happening and that’s great. The problem is that that creates an underincentive for leaders. They don’t get credit for preventing the pandemic that no one even knew could have happened and they don’t necessarily win the next election, or business leaders may not improve their quarterly profits much by preparing for rare risks, for that and other reasons too. I think that we probably have a hard time believing cognitively that certain kinds of things that seem crazy like this could happen. I’m somewhat skeptical about that. Now I think in this case we had institutions who did prepare for this, but for whatever reason a lot of governments failed to do what was necessary.
They failed to respond quickly enough or minimized what was happening. There are worse actors than others, right, but this isn’t a problem that’s just about the US government. This is a problem in Italy, in China, and it’s disheartening because COVID-19 is pretty much exactly one of the major scenarios that infectious disease experts have been warning about. The novel coronavirus that jumps from animals to humans, that spreads through some kind of respiratory pathway, that’s highly infectious, that spreads asymptomatically. This is something that people worried about and knew about and in a sense it was probably only a matter of time before this was going to happen, and there might be a small risk in any given year, and yet we weren’t ready for it, didn’t take the steps, we lost time that could have been used saving lives. That’s really disheartening.
I would like to see us learn a lesson from this and I think to some extent, once this is all over, whenever that is, we will probably create some institutional structures, but then we have to maintain them. We tend to forget a generation later about these kinds of things. We need to create governance systems that have more incentive to prepare for rare risks. It’s not the only thing we should be doing necessarily, but we are underprepared. That’s my view.
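The point above, that a small risk in any given year can still make an event "only a matter of time," can be illustrated with a little arithmetic. The following is a minimal sketch, not anything computed in the conversation: the 2% annual figure is a purely hypothetical number, and the calculation assumes the risk is the same and independent every year.

```python
def chance_of_at_least_one(annual_probability, years):
    """Chance an event happens at least once over `years`,
    assuming a constant, independent probability each year."""
    return 1 - (1 - annual_probability) ** years

# Hypothetical 2% annual risk (an illustrative assumption, not a real estimate).
for years in (1, 10, 25, 50):
    print(years, "years:", round(chance_of_at_least_one(0.02, years), 2))
```

Under those assumptions, a risk that looks negligible in any single year becomes more likely than not over a few decades, which is the incentive problem described above: the cost of preparing is paid every year, while the event shows up only occasionally.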
Lucas Perry: Yeah, and I mean the sample size of historic pandemics is quite good, right?
Robert de Neufville: Yeah. It’s not like we were invaded by aliens. Something like this happens in just about every person’s lifetime. It’s historically not that rare and this is a really bad one, but the Spanish flu and the Asian flu were also pretty bad. We should have known this was coming.
Lucas Perry: What I’m also reminded of here, with some of these biases you’re talking about, is that we have climate change on the other hand, which is destabilizing and kind of a global catastrophic risk, depending on your definition, and for people who are against action on climate change, there seems to be A) a lack of trust in science and B) then not wanting to invest in expensive technologies or something that seems wasteful. I’m just reflecting here on all of the biases that fed into our inability to prepare for COVID.
Robert de Neufville: Well, I don’t think the distrust of science is sort of a thing that’s out there. I mean, maybe to some extent it is, but it’s also a deliberate strategy that people with interests in continuing, for example, the fossil fuel economy, have deliberately tried to cloud the issue to create distrust in science to create phony studies that make it seem that climate change isn’t real. We thought a little bit about this at GCRI about how this might happen with artificial intelligence. You can imagine that somebody with a financial interest might try to discredit the risks and make it seem safer than it is, and maybe they even believe that to some extent, nobody really wants to believe that the thing that’s getting them a lot of money is actually evil. So I think distrust in science really isn’t an accident and it’s a deliberate strategy and it’s difficult to know how to combat it. There are strategies you can take, but it’s a struggle, right? There are people who have an interest in keeping scientific results quiet.
Lucas Perry: Yeah. Do you have any thoughts then about how we could increase the uptake of using forecasting methodologies for all manner of decision making? It seems like generally you’re pessimistic about it right now.
Robert de Neufville: Yeah. I am a little pessimistic about it. I mean one thing is that I think that we’ve tried to get people interested in our forecasts and a lot of people just don’t know what to do with them. Now one thing I think is interesting is that often people, they’re not interested in my saying, “There’s a 78% chance of something happening.” What they want to know is, how did I get there? What are my arguments? That’s not unreasonable. I really like thinking in terms of probabilities, but I think it often helps people to understand what the mechanism is because it tells them something about the world that might help them make a decision. So I think one thing that maybe can be done is not to treat it as a black box probability, but to have some kind of algorithmic transparency about our thinking because that actually helps people and might be more useful in terms of making decisions than just a number.
Lucas Perry: So is there anything else here that you want to add about COVID-19 in particular? General information or intuitions that you have about how things will go? What the next year will look like? There is tension in the federal government about reopening. There’s an eagerness to do that, to restart the economy. The US federal government and the state governments seem totally unequipped to do the kind of testing and contact tracing that is being done in successful areas like South Korea. Sometime in the short to medium term we’ll be open and there might be the second wave and it’s going to take a year or so for a vaccine. What are your intuitions and feelings or forecasts about what the next year will look like?
Robert de Neufville: Again, with the caveat that I’m not a virologist or an expert in vaccine development and things like that, I have thought about this a lot. I think there was a fantasy, still is a fantasy, that we’re going to have what they call a V-shaped recovery that… you know, everything crashed really quickly. Everyone started filing for unemployment as all the businesses shut down. Very different than other types of financial crises, this virus economics. But there was this fantasy that we would sort of put everything on pause, put the economy into some cryogenic freeze, and somehow keep people able to pay their bills for a certain amount of time. And then after a few months, we’d get some kind of therapy or vaccine or it would die down and we’d suppress the disease somehow. And then we would just give it a jolt of adrenaline and we’d be back and everyone would be back in their old jobs and things would go back to normal. I really don’t think that is what’s going to happen. I think it is almost thermodynamically harder to put things back together than it is to break them. There are things about the US economy in particular, like the fact that in order to keep getting paid, you actually need to lose your job and go on unemployment, in many cases. It’s not seamless. It’s hard to even get through on the phone lines or to get the funding.
I think that even after a few months, the US economy is going to look like a town that’s been hit by a hurricane and we’re going to have to rebuild a lot of things. And maybe unemployment will go down faster than it did in previous recessions where it was more about a bubble popping or something, but I just don’t think that we go back to normal.
I also just don’t think we go back to normal in a broader sense. This idea that we’re going to have some kind of cure. Again, I’m not a virologist, but I don’t think we typically have a therapy that cures viruses the way, you know, antibiotics might be super efficacious against bacteria. Typically, viral diseases, I think, are things we have to try to mitigate and some cocktail may improve treatments and we may figure out better things to do with ventilators. Well, you might get the fatality rate down, but it’s still going to be pretty bad.
And then there is this idea maybe we’ll have a vaccine. I’ve heard people who know more than I do say maybe it’s possible to get a vaccine by November. But, the problem is until you can simulate with a supercomputer what happens in the human body, you can’t really speed up biological trials. You have to culture things in people and that takes time.
You might say, well, let’s not do all the trials, this is an emergency. But the fact is, if you don’t demonstrate that a vaccine is safe and efficacious, you could end up giving something to people that has serious adverse effects, or even makes you more susceptible to disease. That was the problem with one of the SARS vaccines they tried to come up with originally: it made people more susceptible. So you don’t want to hand out millions and millions of doses of something that’s going to actually hurt people, and that’s the danger if you skip these clinical trials. So it’s really hard to imagine a vaccine in the near future.
I don’t want to sell short human ingenuity because we’re really adaptable, smart creatures, and we’re throwing all our resources at this. But, there is a chance that there is really no great vaccine for this virus. We haven’t had great luck with finding vaccines for coronaviruses. It seems to do weird things to the human immune system and maybe there is evidence that immunity doesn’t stick around that long. It’s possible that we come up with a vaccine that only provides partial immunity and doesn’t last that long. And I think there is a good chance that essentially we have to keep social distancing well into 2021 and that this could be a disease that remains dangerous and we have to continue to keep fighting for years potentially.
I think that we’re going to open up and it is important to open up as soon as we can because what’s happening with the economy will literally kill people and cause famines. But on the other hand, we’re going to get outbreaks that come back up again. You know, it’s going to be like fanning coals if we open up too quickly and in some places we’re not going to get it right and that doesn’t save anyone’s life. I mean, if it starts up again and the virus disrupts the economy again. So I think this is going to be a thing we are struggling to find a balance to mitigate and that we’re not going to go back to December 2019 for a while, not this year. Literally, it may be years.
And I think that although humans have an amazing capacity to forget things and go back to normal life, we’re going to see permanent changes. I don’t know exactly what they are. But, I think we’re going to see permanent changes in the way we live. And I don’t know if I’m ever shaking anyone’s hand again. We’ll see about that. A whole generation of people are going to be much better at washing their hands.
Lucas Perry: Yeah. I’ve already gotten a lot better at washing my hands watching tutorials.
Robert de Neufville: I was terrible at it. I had no idea how bad I was.
Lucas Perry: Yeah, same. I hope people who have shaken my hand in the past aren’t listening. So the things that will stop this are sufficient herd immunity to some extent or a vaccine that is efficacious. Those seem like the, okay, it’s about time to go back to normal points, right?
Robert de Neufville: Yeah.
Lucas Perry: A vaccine is not a given thing given the class of coronavirus diseases and how they behave?
Robert de Neufville: Yeah. Eventually, and now this is where I really feel like I’m not a virologist, but eventually diseases evolve and we co-evolve with them. Whatever the Spanish Flu was, it didn’t continue to kill as many people years down the line. I think that’s because people did develop immunity.
But also, viruses don’t get any evolutionary advantage from killing their hosts. They want to use us to reproduce. Well, they don’t want anything, but that advantages them. If they kill us and make us use mitigation strategies, that hurts their ability to reproduce. So in the long run, and I don’t know how long that run is, but eventually we co-evolve with it and it becomes endemic instead of epidemic and it’s presumably not as lethal. But, I think that it is something that we could be fighting for a while.
There are chances of additional disasters happening on top of it. We could get another disease popping out of some animal population while our immune systems are weak or something like that. So we should probably be rethinking the way we interact with caves full of bats and live pangolins.
Lucas Perry: All right. We just need to be prepared for the long haul here.
Robert de Neufville: Yeah, I think so.
Lucas Perry: I’m not sure that most people understand that.
Robert de Neufville: I don’t think they do. I mean, I guess I don’t have my finger on the pulse and I’m not interacting with people anymore, but I don’t think people want to understand it. It’s hard. I had plans. I did not intend to be staying in my apartment. Having your health is more important and the health of others, but it’s hard to face that we may be dealing with a very different new reality.
This thing, the opening up in Georgia, it’s just completely insane to me. Their cases have been slowing, but if it’s shrinking, it seems to be only a little bit. To me, when they talk about opening up, it sounds like they’re saying, well, we reduced the extent of this forest fire by 15%, so we can stop fighting it now. Well, it’s just going to keep growing. But, you have to actually stamp it out or get really close to it before you can stop fighting it. I think people want to stop fighting the disease sooner than we should because it sucks. I don’t want to be doing this.
Lucas Perry: Yeah, it’s a new sad fact and there is a lot of suffering going on right now.
Robert de Neufville: Yeah. I feel really lucky to be in a place where there aren’t a lot of cases, but I worry about family members in other places and I can’t imagine what it’s like in places where it’s bad.
I mean, in Hawaii, people in the hospitality industry and tourism industry have all lost their jobs all at once and they still have to pay our super expensive rent. Maybe that’ll be waived and they won’t be evicted. But, that doesn’t mean they can necessarily get medications and feed their family. And all of these are super challenging for a lot of people.
Never mind that other people are in the position of, they’re lucky to have jobs, but they’re maybe risking getting an infection going to work, so they have to make this horrible choice. And maybe they have someone with comorbidities or who is elderly living at home. This is awful. So I understand why people really want to get past this part of it soon.
Was it Dr. Fauci that said, “The virus has its own timeline”?
One of the things I think that this may be teaching us, it’s certainly reminding me that humans are not in charge of nature, not the way we think we are. We really dominate the planet in a lot of ways, but it’s still bigger than us. It’s like the ocean or something. You know? You may think you’re a good swimmer, but if you get a big wave, you’re not in control anymore and this is a big wave.
Lucas Perry: Yeah. So back to the point of general superforecasting. Suppose you’re a really good superforecaster and you’re finding well-defined things to make predictions about, which is, as you said, sort of hard to do and you have carefully and honestly compared your predictions to reality and you feel like you’re doing really well.
How do you convince other people that you’re a great predictor when almost everyone else is making lots of vague predictions and cherry picking their successes, or there are interest groups that are biasing and obscuring things to try to have a seat at the table? Or for example, if you want to compare yourself to someone else who has been keeping careful track as well, how do you do that technically?
Robert de Neufville: I wish I knew the answer to that question. I think it is probably a long process of building confidence and communicating reasonable forecasts and having people see that they were pretty accurate. People trust something like Nate Silver’s FiveThirtyEight, or Nick Cohen, or someone like that because they have been communicating for a while and people can now see it. They have this track record and they also are explaining how it happens, how they get to those answers. And at least a lot of people started to trust what Nate Silver says. So I think something like that really is the long-term strategy.
But, I think it’s hard because a lot of times there is always someone who is saying every different thing at any given time. And if somebody says there is definitely a pandemic going to happen, and they do it in November 2019, then a lot of people may think, “Wow, that person’s a prophet and we should listen to them.”
To my mind, if you were saying that in November of 2019, that wasn’t a great prediction. I mean, you turned out to be right, but you didn’t have good reasons for it. At that point, it was still really uncertain unless you had access to way more information than as far as I know anyone had access to.
But, you know sometimes those magic tricks where somebody throws a dart at something and happens to hit the bullseye might be more convincing than an accurate probabilistic forecast. I think that in order to sell the accurate probabilistic forecasts, you really need to build a track record of communication and build confidence slowly.
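On the narrower technical question of comparing two carefully kept track records, one common tool is a scoring rule such as the Brier score. The sketch below is only an illustration: the forecasts and outcomes are made-up data, and it uses the simple single-outcome form of the score (the mean squared gap between the stated probability and what happened, from 0 to 1), which may differ in scale from the variant used on any particular forecasting platform.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between probability forecasts and 0/1 outcomes.
    Lower is better; 0.0 would be a perfect record."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally long, non-empty lists")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical data: four resolved yes/no questions (1 = happened, 0 = did not).
outcomes = [1, 0, 0, 1]
calibrated = [0.8, 0.2, 0.1, 0.7]  # hedged, probabilistic forecaster
confident = [1.0, 0.0, 1.0, 1.0]   # always certain, and wrong once

print("calibrated:", brier_score(calibrated, outcomes))
print("confident: ", brier_score(confident, outcomes))
```

In this toy data the hedged forecaster ends up with the lower (better) score, because a single confident miss is penalized heavily. A comparison like this only means something when both people forecast the same well-defined questions, and over many more than four of them.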
Lucas Perry: All right, that makes sense.
So on prediction markets and prediction aggregators, they’re pretty well set up to treat questions like will X happen by Y date where X is some super well-defined thing. But lots of things we’d like to know are not really of this form. So what are other useful forms of question about the future that you come across in your work and what do you think are the prospects for training and aggregating skilled human predictors to tackle them?
Robert de Neufville: What are the other forms of questions? There is always a trade-off in designing questions between sort of the rigor of the question, how easy it is to say whether it turned out to be true or not, and how relevant it is to things you might actually want to know. Now, that’s often difficult to balance.
I think that in general we need to be thinking more about questions, so I wouldn’t say here is the different type of question that we should be answering. But rather, let’s really try to spend a lot of time thinking about the questions. What questions could be useful to answer? I think just that exercise is important.
I think things like science fiction are important where they brainstorm a possible scenario and they often fill it out with a lot of detail. But, I often think in forecasting, coming up with very specific scenarios is kind of the enemy. If you come up with a lot of things that could plausibly happen and you build it into one scenario and you think this is the thing that’s going to happen, well the more specific you’ve made that scenario, the less likely it is to actually be the exact right one.
We need to do more thinking about spaces of possible things that could happen, ranges of things, different alternatives rather than just coming up with scenarios and anchoring on them as the thing that happens. So I guess I’d say more questions and realize that at least as far as we’re able to know, I don’t know if the universe is deterministic, but at least as far as we are able to know, a lot of different things are possible and we need to think about those possibilities and potentially plan for them.
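The point about specific scenarios can be made concrete with a small calculation. This is a minimal sketch under loud assumptions: the 80% figures are hypothetical, and the details are treated as independent of one another, which real scenario details usually are not.

```python
from math import prod

def scenario_probability(detail_probabilities):
    """Probability that every detail of a scenario holds,
    assuming the details are independent of each other."""
    return prod(detail_probabilities)

# Hypothetical scenario: each added detail is individually quite plausible (80%).
details = [0.8, 0.8, 0.8, 0.8, 0.8]
for n in range(1, len(details) + 1):
    print(n, "details:", round(scenario_probability(details[:n]), 3))
```

Each detail sounds likely on its own, yet the full five-detail story comes out at roughly one chance in three under these assumptions. That is one way to see why thinking in ranges of outcomes tends to serve a forecaster better than anchoring on a single vivid scenario.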
Lucas Perry: All right. And so, let’s say you had 100 professors with deep subject matter expertise in say, 10 different subjects and you had 10 superforecasters, how would you make use of all of them and on what sorts of topics would you consult, what group or combination of groups?
Robert de Neufville: That’s a good question. I think we bash on subject matter experts because they’re bad at producing probabilistic forecasts. But the fact is that I completely depend on subject matter experts. When I try to forecast what’s going to happen on the pandemic, I am reading all the virologists and infectious disease experts because I don’t know anything about this. I mean, I know I get some stuff wrong. Although, I’m in a position where I can actually ask people, hey what is this, and get their explanations for it.
But, I would like to see them working together. To some extent, having some of the subject matter experts recognize that we may know some things about estimating probabilities that they don’t. But also, the more I can communicate with people that know specific facts about things, the better the forecasts I can produce are. I don’t know what the best system for that is. I’d like to see more communication. But, I also think you could get some kind of a thing where you put them in a room or on a team together to produce forecasts.
When I’m forecasting, typically, I come up with my own forecast and then I see what other people have said. But, I do that so as not to anchor on somebody else’s opinion and to avoid groupthink. You’re more likely to get groupthink if you have a leader and a team that everyone defers to and then they all anchor on whatever the leader’s opinion is. So, I try to form my own independent opinion.
But, I think some kind of a Delphi technique where people will come up with their own ideas and then share them and then revise their ideas could be useful and you could involve subject matter experts in that. I would love to be able to just sit and talk with epidemiologists about this stuff. I don’t know if they would love it as much to talk to me. But I think that would help us collectively produce better forecasts.
Lucas Perry: I am excited and hopeful for the top few percent of superforecasters being integrated into more decision making about key issues. All right, so you have your own podcast.
Robert de Neufville: Yeah.
Lucas Perry: If people are interested in following you or looking into more of your work at the Global Catastrophic Risk Institute, for example, or following your podcast or following you on social media, where can they do that?
Robert de Neufville: Go to the Global Catastrophic Risk Institute’s website, it’s gcrinstitute.org, so you can see and read about our work. It’s super interesting and I believe super important. We’re doing a lot of work now on artificial intelligence risk. There has been a lot of interest in that. But, we also talk about nuclear war risk and there is going to be I think a new interest in pandemic risk. So these are things that we think about. I also do have a podcast. I co-host it with two other superforecasters, which sometimes becomes sort of like a forecasting politics variety hour. But we have a good time and we do some interviews with other superforecasters and we’ve also talked to people about existential risk and artificial intelligence. That’s called NonProphets. We have a blog, nonprophetspod.wordpress.org. But Nonprophets, it’s N-O-N-P-R-O-P-H-E-T-S like prophet like someone who sees the future, because we are not prophets. However, there is also another podcast, which I’ve never listened to and feel like I should, which also has the same name. There is an atheist podcast out of Texas and atheist comedians. I apologize for taking their name, but we’re not them, so if there is any confusion. One of the things about forecasting is it’s super interesting and it’s a lot of fun, at least for people like me to think about things in this way, and there are ways like Good Judgment Open you can do it too. So we talk about that. It’s fun. And I recommend everyone get into forecasting.
Lucas Perry: All right. Thanks so much for coming on and I hope that more people take up forecasting. And it’s a pretty interesting lifelong thing that you can participate in and see how well you do over time and keep resolving over actual real world stuff. I hope that more people take this up and that it gets further and more deeply integrated into communities of decision makers on important issues.
Robert de Neufville: Yeah. Well, thanks for having me on. It’s a super interesting conversation. I really appreciate talking about this stuff.