Maybe a similar rule in forecasting is “home ground advantage plus personal or professional stakes, or don’t bother.”
On 16 questions currently scored, I’ve done better than the team average on 15. Two of the questions where I outperformed the team by a large margin were the Syrian refugee question, basically a matter of extrapolating a trend and predicting status quo with respect to the conflict, and the Kismayo question, basically a matter of knowing my loss function. I had zero home ground advantage on either question.
Some of my wins resulted purely from general knowledge rather than from having any idea of the specifics of the situation: for instance, in mid-August I answered 40% to “Will Kuwait commence parliamentary elections before 1 October 2012?”, reflecting only status quo bias in that a date for the election had not yet been announced. However, early in September I downgraded this to 10%, because I know that as a rule of thumb it takes at least one month to convene an election. The week before, I went to 5% (and even that was quite a generous margin), while several of my teammates made predictions, after I published mine, of 15%, 19%, 33% and even 51% (!).
This felt like entering a poker tournament where people routinely raise pre-flop with a “beer hand” (seven and two: when you play this, either you’ve had too many beers, or it’s time you had one). Elections aren’t a mysterious thing; we participate in one every so often. You need to print ballots, set up voting booths, audit voter registration records, give people time to campaign on national media, all very mundane stuff. Even dictatorships make at least a half-hearted attempt at this, and it’s not like anyone in Kuwait had any particular interest in meeting an October deadline; this was strictly an internal-to-GJP deadline.
So while this question had to do, ostensibly, with something happening in Kuwait, all you needed to make a call at least as good as mine was background knowledge about extremely mundane, practical stuff. If I had any hint that you wouldn’t factor that in when making a close-to-home prediction, I wouldn’t trust you with organizing so much as the PTA president election. Maybe a birthday party.
I wouldn’t go so far as to claim that “skill at forecasting macro trends transfers to microeconomic moves”.
But I’d take a stand on “demonstrated incompetence at the most elementary moves of forecasting, in a macro domain, is a strong indicator of likely incompetence at forecasting in any micro domain, other than the few narrow ones you might happen to be good at”.
Some of my wins resulted purely from general knowledge rather than from having any idea of the specifics of the situation: for instance, in mid-August I answered 40% to “Will Kuwait commence parliamentary elections before 1 October 2012?”, reflecting only status quo bias in that a date for the election had not yet been announced. However, early in September I downgraded this to 10%, because I know that as a rule of thumb it takes at least one month to convene an election. The week before, I went to 5% (and even that was quite a generous margin), while several of my teammates made predictions, after I published mine, of 15%, 19%, 33% and even 51% (!).
Yeah. Answering “1%” that “there will be a major earthquake in California during $time_period” a month before the end of $time_period kind of felt like cheating to me.
Isn’t Tool 0 of forecasting ‘Mind your own business’?
In a nutshell, no.
Consider some practicalities. An advantage of forecasting world events is that it permits participation by a much broader population. I could run a forecasting contest on when the city of Paris will complete a construction project on the banks of the Seine, which is “my backyard” compared to Syria. Nobody would bother.
The point is to find out something about how you think, and comparing yourself to other people will yield information that you can’t get by sitting on your own, minding your own business. (On the other hand, there’s nothing preventing you from doing both.)
Finally, I’m not aware that people routinely make explicit, quantified forecasts even about their own business. Rather, it seems plain that most of the time, we think “probable” the things we would like to happen, and as a result fail to plan for contingencies we don’t like to think about.
To go from not forecasting at all to making forecasts in any domain is progress. It would certainly be useful to many to make forecasts about their daily lives (which I now do, a little bit). But let’s imagine this were taught in schools as a life skill: I suspect you would have people practicing precisely on events that they have no control over and that allow interpersonal comparison.
Isn’t Tool 0 of forecasting ‘Mind your own business’?
Thanks for inspiring the following bit of staircase wit, which might make it into some further version of the post: Tool 0 of forecasting is “forecast”. If you don’t do it, you can’t become better at it.
Gwern prefers PredictionBook—where you can, if you want, record private predictions—to GJP. For my part I prefer GJP, precisely because they ask me questions that might not occur to me otherwise, and the competitive aspect suits me. You could also do just fine by recording your own forecasts in a spreadsheet or a notepad, on whatever topics you like.
The key to accuracy is having fewer moving parts, all of which are visible and known by you.
Is accuracy what you’re after? Which component of accuracy? I can get perfect calibration by throwing a thousand coin flips and predicting 50% all the time. What I seek is debiasing, making the most of whatever information is available without overweighting any part of it (including my own hunches, feelings and fears); and I’m most vulnerable to bias when there are many moving parts, many of which are hidden from me or unknown to me.
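To make the coin-flip point concrete, here is a rough Python sketch. The helper name `brier`, the fixed seed, and the choice of squared error as the score are assumptions made for illustration, not anything taken from GJP. It shows that a constant 50% forecast on fair coin flips is close to perfectly calibrated, yet scores exactly 0.25 whatever the coins do, so it carries no information about any individual flip.

```python
import random


def brier(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(outcomes)


rng = random.Random(0)  # fixed seed: an arbitrary choice, only for repeatability
flips = [rng.randint(0, 1) for _ in range(1000)]  # 1 = heads, 0 = tails

always_half = [0.5] * len(flips)
heads_rate = sum(flips) / len(flips)

# Calibration check: of the events given 50%, about half should happen.
print("heads rate when the forecast was 50%:", heads_rate)

# Every term is (0.5 - 0)^2 or (0.5 - 1)^2, so the score is 0.25 regardless.
print("squared-error score:", brier(always_half, flips))
```

Good calibration here costs nothing and tells you nothing; the interesting part of forecasting is moving away from 50% and still being right at the stated rate.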
Isn’t Tool 0 of forecasting ‘Mind your own business’? One rule of thumb in poker is “Jacks or better”, meaning that you shouldn’t even consider playing with anything less. Maybe a similar rule in forecasting is “home ground advantage plus personal or professional stakes, or don’t bother.”
No, tool 0 is more like ‘mind your base rates’ or ‘don’t predict what you would like, predict what you really think would happen’. I dunno where you’re getting Tool 0 as ‘Mind your own business’ from; certainly neither Morendil nor I wrote it.
How well does skill at forecasting macro trends transfer to microeconomic moves?
I dunno, did you look into any research?
I’m guessing successful policymakers know a lot about their colleagues’ personalities, histories, and connections, and that good policy comes from navigating those instead of forecasting macrotrends accurately.
Per the huge amount of material on Outside View vs Inside View and performance of SPRs already discussed on LW, I would guess quite the opposite.
How much would someone boost their batting average by limiting opinions to things of concern, such as for example health-related science, psychology as a science, or which people to draw closer or keep away? Entering a service-based deal with someone who turns out to be incompetent is costly, painful. Or failing to size up new “friends” fast enough for you to retreat before they waste your time. Many people perform well with evolved and practiced instinct by making sure that their forecasts never leave the home ground of their concern.
Do you know that, or are you just guessing, as you said you were before?
Or was your entire comment just an excuse to ask an awful lot of rhetorical questions?
I think he’s saying it’s a waste of effort to predict who or what will happen in the world if you can’t exert any control over it. That sort of makes sense because it seems useless to worry about those sorts of things, at first. But it’s important to understand the consequences of the actions of other people so that you can react to them, and he didn’t take that into account. So, for example, a French citizen might be interested in knowing who the next US president will be because they’re curious about the implications that has for their business contacts in America.
I think he’s saying it’s a waste of effort to predict who or what will happen in the world if you can’t exert any control over it.
Buying insurance is a decision that relates to things that may or may not happen, that you have little or no control over: illness, accidents, burglaries, etc. Being able to make informed predictions as to the likelihood of these things is a valuable life skill.
Well, I can give you an argument, though you’ll have to evaluate the strength of it yourself.
Forecasting, in a Bayesian sense, is a matter of repeated application of Bayes’ theorem. In short, I make an observation (B) and then ask—what are the chances of prediction (A), given observation (B)? (‘Prediction’ may be the wrong word, given that I may be predicting something unseen that has already happened). Bayes’ theorem states that this is equal to the following:
The chances of observation B, given prediction A, multiplied by the prior probability of prediction A, divided by the prior probability of observation B
Now, the result of the equation is only as good as the figures you feed into it. In your example of the freelancer, the new freelancer (just starting out) has poor estimates of the probabilities involved, though he can improve these estimates by asking a more experienced freelancer for help. The experienced freelancer, on the other hand, has got a better grasp of the input probabilities, and thus gets a more accurate output probability. The equation works for both large-scale, macro events and small-scale, personal events—the difference is, once again, a matter of the input numbers. For a macro event, you’ll have more people looking at, commenting on, discussing the situation; reading the words of others will improve your estimates of the probabilities involved, and putting better numbers in will get you better numbers out. Also, with macro events, you’re more likely to have more time to sit down with pencil and paper and work it out.
However, predicting macro events will help you to better practice the equation, and thus learn how to apply it more quickly and easily to micro events. Sufficient practice will also help you to more quickly and accurately estimate the result for a given set of inputs. So while it is true that the skill of guessing the input probabilities for macro events may have little to do with the skill of guessing the input probabilities for micro events (though there is some correlation there: the skill of accurately putting figures to the probability may transfer to some degree), the skill of practicing the application of the equation is transferable between the two realms.
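As a minimal sketch of the equation stated in words above, the following Python computes P(A|B) = P(B|A) * P(A) / P(B) for two sets of inputs. The function name `posterior` and every number are hypothetical, chosen only to illustrate that the same equation gives different answers depending on how good the input estimates are; none of them come from the freelancer example itself.

```python
def posterior(p_b_given_a, prior_a, prior_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0 < prior_b <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    return p_b_given_a * prior_a / prior_b


# Hypothetical inputs. A = "this client will pay late",
# B = "the client haggled over the deposit".
experienced = posterior(p_b_given_a=0.6, prior_a=0.2, prior_b=0.3)
newcomer = posterior(p_b_given_a=0.9, prior_a=0.5, prior_b=0.6)

print("experienced freelancer's estimate:", round(experienced, 2))
print("new freelancer's estimate:", round(newcomer, 2))
```

With these made-up inputs the two estimates come out at roughly 0.40 and 0.75. The arithmetic is identical in both cases; only the quality of the inputs differs, which is the part that practice in one domain may or may not improve in another.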
To continue his line of argument, evolution has gifted us with social instincts superior to our best attempts at rationality. Allowing bias to have its way with us will make us better off socially than we could be otherwise, provided that certain other conditions are met. Forcing flawed attempts at rationality into our behavior may well just corrupt the success of our instincts.
I think I would sort of believe that, with some caveats. For individuals who are good looking and good conversationalists and who value social success over anything else, it probably makes sense to avoid rationality training, as there’s only a chance it can hurt you. So I agree with him in cases like that. But for other individuals, such as those who are unattractive or who are bad conversationalists or who value things other than social success, rationality might be the best strategy, because there’s only a chance it can help you. Learning about biases can hurt you; similarly, making your ability to predict things more rigorous can do the same.
I’m uncertain as to how much I believe that, but I believe the general idea is at least not obviously false, and that the idea is ultimately more true than false. I believe most people would not do well if they suddenly started working on improving their rationality and predictive accuracy.
So I’m interested in forecasting. It’s an important skill. I’m going on about it because I want to be good, smart, well-calibrated, about what matters to me.
Well, to start with: what evidence do you have at the moment about how well calibrated you are?
The methods that Morendil is discussing here are pretty general forecasting techniques, not limited to a particular domain. Some skills are worth developing, even if you’re practicing them in domains you don’t care about.
Personal example: I was a bio major in college, and I found it very difficult to care about organic chemistry, because we were mostly learning about chemicals that had no biological relevance. Consequently, I didn’t learn it very well, which came back to bite me pretty hard when I took biochemistry.
How does GJP score predictions that change over time?
They compute your Brier score for each day that the question is open, according to what your forecast is on that day, and average over all days.
Suppose you start at 80%, six days pass, you switch to 40% three days before the deadline, and the event doesn’t happen. Your score is then (6*(0.8)^2+3*(0.4)^2)/9 = .48, which is a so-so score, but an improvement over the .64 that you’d get if you didn’t change your mind.
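The arithmetic above can be sketched in a few lines of Python. This assumes the same simplified scoring as the worked example (squared error on a single probability, one forecast per open day) and is not GJP’s actual implementation; the function name `daily_average_brier` is made up for illustration.

```python
def daily_average_brier(daily_forecasts, outcome):
    """Average of (forecast - outcome)^2 over one forecast per open day.

    outcome is 1 if the event happened and 0 if it did not.
    """
    if not daily_forecasts:
        raise ValueError("need at least one daily forecast")
    squared_errors = [(p - outcome) ** 2 for p in daily_forecasts]
    return sum(squared_errors) / len(squared_errors)


# Six days at 80%, then three days at 40%; the event does not happen.
changed_mind = [0.8] * 6 + [0.4] * 3
# Nine days at 80% with no update.
stayed_put = [0.8] * 9

print(round(daily_average_brier(changed_mind, 0), 2))
print(round(daily_average_brier(stayed_put, 0), 2))
```

The two printed values reproduce the .48 and .64 from the example. Because every open day counts equally, an update made early in the question’s life improves the average more than the same update made near the deadline.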
Are they different in kind? I’m uncertain.
The distinction seems arbitrary at first glance both because what’s personal for one person is impersonal for another and because causality is causality no matter where it occurs. However, if you meant that they’re different in kind in a more epistemic sense, that they’re different in kind from any particular perspective because of the way that they go through your reasoning process, then that seems plausible.
The question is then what types of data work best and why. You’re likely to have less data in total in Near Mode, but you’ll be working with things that are important to you personally, which it seems like evolution would favor (individual selection).
On the other hand, evolution seems to make biases more frequent and more intense when they’re about personal matters. But evolution wouldn’t do this if it hadn’t worked often in the past, so perhaps those biases are good? I think that this is fairly plausible, but I also think that these biases would only be “good” in a reproductive sense and not in the sense of epistemic accuracy. They would move you towards maximizing your social status, not the quality of your predictions. It’s unlikely those would overlap.
How likely is it that people are good at evaluating the credibility of the ideas of specific people? I would say that most people are probably bad at this when seeing others face to face because of things like the halo effect and because credibility is rather easy to fake. I would also say that people are rather good at this otherwise. Are these evaluations still accurate when they interact with social motivations, like rivalry? I would say that they probably end up even worse under those circumstances.
So, I believe that personal events and impersonal events should be considered differently because I believe trying to evaluate the accuracy of the views of specific experts would improve the accuracy of your predictions if and only if you avoided personal familiarity or intimacy with those experts, and that otherwise it would damage your accuracy.
I failed to consider the implications of social motivation for professional accuracy, and a bunch of other stuff.
I’m sorry, either I’m misunderstanding you or you misunderstood my comment. I don’t understand what you mean by the phrase “choosing types of data”. I think that although we’re better at dealing with some types of data, that doesn’t mean we should focus exclusively on that type of data. I think that becoming a skilled general forecaster is a very useful thing and something that should be pursued.
What sort of questions did you have in mind?