This is great, I can see it being really helpful for me to consciously think about which of these I’m optimizing for (or am willing to sacrifice) when writing. I got confused by the introduction of the term ‘density’ in the section on trade-offs, as this isn’t represented in the RAIN framework. Is density just a sub-consideration of accessibility or are you considering it in its own right?
rosiecam
There seems to be some evidence that a norm of helmet-wearing discourages people from cycling. People who don’t wear helmets take on some personal risk, but by challenging the norm they may encourage more people to take up cycling. Having more cyclists is likely to make cycling safer (fewer cars on the roads, drivers more used to cyclists, safety in numbers at intersections, etc.), which in turn reduces the need to wear a helmet.
I’d prefer to live in a world where more people cycle and helmets aren’t (as) needed (like in European cities such as Berlin and Amsterdam), so I tend to feel grateful towards people who don’t wear helmets.
Help forecast study replication in this social science prediction market
As has been noted, the impressiveness of the predictions has nothing to do with which way round they are stated; predicting P at 50% is exactly as impressive as predicting ¬P at 50% because they are literally the same. I think one only sounds more impressive when compared to the ‘baseline’ because our brains seem to be more attuned to predictions that sound surprisingly high, and we don’t seem to notice ones that seem surprisingly low. For example, we hear ‘there is a 40% chance that Joe Biden will be the Democratic nominee’, somehow translate that to ‘at least 40%’, and fail to consider what it implies for the other 60%.
Consider the examples given of unimpressive-sounding predictions:
There is a 50% chance that the price of a barrel of oil at the end of 2020 will not be between $50.95 and $51.02
There is a 50% chance that Tesla’s stock price at the end of the year 2020 is below $512 or above $514
You can immediately make these sound impressive without flipping them by inserting the word ‘only’ or ‘just’:
There is only a 50% chance that the price of a barrel of oil at the end of 2020 will not be between $50.95 and $51.02
There is just a 50% chance that Tesla’s stock price at the end of the year 2020 will be below $512 or above $514
Suddenly, we are forced to confront how surprisingly low this percentage is, given what you might expect from common wisdom, and it goes back to seeming impressive.
I also think it’s a mistake to confuse ‘common wisdom’ and ‘baseline’ with ‘all possible futures’ when thinking about impressiveness. If I say that there’s a 50% chance that the price of a barrel of oil at the end of 2020 will be between -$1 million and $1 million, this sounds unimpressive because I’ve chosen a very wide interval relative to common sense. There are a lot more numbers below -$1 million and above $1 million than there are within that range, so arguably this is quite a precise prediction in the space of all possible futures. But that’s not important; what matters is the common-sense range, i.e. the baseline.
(Of course, “there’s a 50% chance that the price of a barrel of oil at the end of 2020 will be between -$1 million and $1 million” is actually a very bold prediction, because it’s saying that there is a 50% chance that the price of oil will be either below -$1 million or above $1 million, which is surprisingly high… but we only notice it when phrased to seem surprisingly high rather than surprisingly low!)
I had the same feeling and have written up my thoughts in this comment: https://www.lesswrong.com/posts/DAc4iuy4D3EiNBt9B/how-to-evaluate-50-predictions?commentId=RykR3sX57jbbLJaRB
Thanks for the response!
I don’t think there is any difference in those lists! Here’s why:
The impressiveness of 50% predictions can only be evaluated with respect to common wisdom. If everyone thinks P is only 10% likely, and you give it 50%, and P turns out to be true, this is impressive because you gave it a surprisingly high percentage! But also if everyone says P is 90% likely, and P turns out to be false, this is also impressive because you gave it a surprisingly low percentage!
I think what you’re suggesting is that people should always phrase their prediction in a way that, if P comes true, makes their prediction impressive because the percentage was surprisingly high, i.e.:
Most people think there is only a 20% chance that the price of a barrel of oil at the end of 2020 will be between $50.95 and $51.02. I think it’s 50% (surprisingly high), so you should be impressed if it turns out to be true.
But you could also say:
Most people think there is an 80% chance that the price of a barrel of oil at the end of 2020 will not be between $50.95 and $51.02. I think it’s only 50% (surprisingly low), so you should be impressed if it turns out to be false.
These are equally impressive (though I admit the second is phrased in a less intuitive way). When it comes to 50% predictions, it doesn’t matter whether you evaluate them with respect to ‘it turned out to be true’ or ‘it turned out to be false’; you’re trying to correctly represent the percentages on both sides (i.e. the correct ratio), and the impressiveness comes from the extent to which your percentages on both sides differ from the baseline.
I think what I’m saying is that it doesn’t matter how the author phrases it. When evaluating 50% predictions, we should notice both when a percentage seems surprisingly high and the statement turns out to be true, and when it seems surprisingly low and the statement turns out to be false, as both are impressive.
When it comes to a list of 50% predictions, it’s impossible to evaluate the impressiveness only by looking at how many came true, since it’s arbitrary which way they are phrased, and you could equally evaluate the impressiveness by how many turned out to be false. So you have to compare each one to the baseline ratio.
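To make the point about phrasing concrete, here is a minimal Python sketch. It assumes that impressiveness is measured as a log-score gain over the baseline, which is an assumption made for illustration and not something the post defines. The function name `log_score_gain` is hypothetical, and the 20% / 80% baseline figures are the ones from the oil example above.

```python
import math


def log_score_gain(forecast_p, baseline_p, outcome):
    """Log-score advantage of a forecast over a baseline for one binary statement.

    forecast_p and baseline_p are the probabilities each side assigns to the
    statement *as phrased*; outcome says whether that statement came true.
    A positive result means the forecast beat the baseline.
    """
    mine = forecast_p if outcome else 1.0 - forecast_p
    theirs = baseline_p if outcome else 1.0 - baseline_p
    return math.log(mine / theirs)


# Phrasing A: "the price is inside the range"     (baseline 20%, forecast 50%)
# Phrasing B: "the price is NOT inside the range" (baseline 80%, forecast 50%)
for price_in_range in (True, False):
    gain_a = log_score_gain(0.5, 0.2, outcome=price_in_range)
    gain_b = log_score_gain(0.5, 0.8, outcome=not price_in_range)
    # The two phrasings describe the same belief, so the gains should agree
    # (up to floating-point rounding) whichever way the price ends up.
    print(price_in_range, round(gain_a, 3), round(gain_b, 3))
```

Under this assumed measure, the gain depends only on how far the forecast sits from the baseline and on what actually happened, not on whether the statement was phrased as P or as ¬P.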
Probability is weird and unintuitive and I’m not sure if I’ve explained myself very well...
I agree there is a difference between those lists if you are evaluating everything with respect to each prediction being ‘true’. My point is that sometimes a 50% prediction is impressive when it turns out to be false, because everyone else would have put a higher percentage than 50% on it being true. The first list contains only statements that are impressive if evaluated as true, the second mixes ones that would be impressive if evaluated as true with those that are impressive if evaluated as false. If Tesla’s stock ends up at $513, it feels weird to say ‘well done’ to someone who predicts “Tesla’s stock price at the end of the year 2020 is below 512$ or above 514$ (50%)”, but that’s what I’m suggesting we should do, if everyone else would have only put say a 10% chance on that outcome. If you’re saying that we should always phrase 50% predictions such that they would be impressive if evaluated as true because it’s more intuitive for our brains to interpret, I don’t disagree.
I read the post in good faith and I appreciate that it made me think about predictions and probabilities more deeply. I’m not sure how else to explain my position so will leave it here.
I hereby offer you 2000$ if you provide me with a list of this kind
Can you specify what you mean by ‘of this kind’, i.e. what are the criteria for predictions included on the list? Do you mean a series of predictions which give a narrow range?
OK, this confirms you haven’t understood what I’m claiming. If I gave a list of predictions that were my true 50% confidence intervals, they would look very similar to common wisdom because I’m not a superforecaster (unless I had private information about a topic, e.g. a prediction on my net worth at the end of the year or something). If I gave my true 50% confidence interval, I would be indifferent to which way I phrased it (in the same way that if I were to predict 10 coin tosses, it wouldn’t matter whether I predicted ten heads, ten tails, or some mix of the two).
From what I can tell from your examples, the list of predictions you proposed sending to me would not have represented your true 50% confidence intervals each time—you could have sent me 5 things you are very confident will come true and 5 things you are very confident won’t come true. It’s possible to fake any given level of calibration in this way.
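A small Python sketch of how such a faked list would look under two common checks. The outcome lists are made up for illustration, and the helper names `hit_rate` and `mean_brier` are hypothetical; the only assumption is that every item on the list is stated at 50%.

```python
def hit_rate(outcomes):
    """Fraction of statements on the list that came true."""
    return sum(outcomes) / len(outcomes)


def mean_brier(stated_p, outcomes):
    """Mean Brier score when every statement is given the same probability."""
    return sum((stated_p - float(o)) ** 2 for o in outcomes) / len(outcomes)


# A "faked" list: 5 near-certain statements that came true and 5 near-certain
# statements that came false, all labelled as 50% predictions.
faked = [True] * 5 + [False] * 5

# A hypothetical honest list of genuine 50% predictions that landed 6-4.
honest = [True] * 6 + [False] * 4

print(hit_rate(faked), mean_brier(0.5, faked))
print(hit_rate(honest), mean_brier(0.5, honest))
```

The faked list comes out looking perfectly calibrated, and the Brier score cannot separate the two lists either, because a stated 50% scores the same whichever way the statement resolves. Only a comparison against what others would have predicted can tell them apart.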
Thanks, I appreciate that :) And I apologize if my comment about probability being weird came across as patronizing; it was meant to be a reflection on the difficulty I was having putting my model into words, not a comment on your understanding.
FWIW, there seems to be a trend in my corner of the AI safety community (CHAI) to move away from the term ‘values’ and towards the term ‘preferences’, I think for similar reasons.