I’m really happy that you are writing a book on this topic. I mean, the Sequences and the other discussion on Less Wrong have given us a lot of tools with which to form our own opinions, but we still need to figure out how to balance those opinions against the views of experts with more domain-specific knowledge. There’s a sense in which all the other knowledge isn’t of any use unless we know when to actually use it.
“Now, on the modest view, this was the unfairest test imaginable. Out of all the times that I’ve ever suggested that a government’s policy is suboptimal, the rare time a government tries my preferred alternative will select the most mainstream, highest-conventional-prestige policies I happen to advocate, and those are the very policy proposals that modesty is least likely to disapprove of.”
This is a pretty big deal, so I wanted to emphasise it. Let’s suppose you come up with 50 policies you think the government should implement. 10 get implemented and 8 work out well. Pretty good, right? But what if 30 of your policies would have been utterly stupid, and this is obvious to any of the experts? This selection effect could completely destroy your attempts at calibration.
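To put rough numbers on that, here is a minimal sketch of the arithmetic. The function name and the three assumptions in its docstring are mine, purely for illustration: only the non-obviously-bad proposals ever get implemented, the unimplemented plausible ones would have succeeded at the same rate as the implemented ones, and the obviously bad ones would never have worked.

```python
from fractions import Fraction


def success_rates(total, obviously_bad, implemented, implemented_successes):
    """Return (observed rate among implemented, implied rate over all proposals).

    Illustrative assumptions, not claims from the thread:
      - only non-obviously-bad proposals ever get implemented;
      - unimplemented plausible proposals would succeed at the observed rate;
      - obviously bad proposals would never succeed.
    """
    plausible = total - obviously_bad
    # Track record as it looks from the outside: successes / implemented.
    observed = Fraction(implemented_successes, implemented)
    # Expected number of good ideas among the plausible ones.
    expected_good = observed * plausible
    # Success rate if every proposal, good or bad, had been tried.
    overall = expected_good / total
    return observed, overall


observed, overall = success_rates(
    total=50, obviously_bad=30, implemented=10, implemented_successes=8
)
print(f"observed among implemented: {float(observed):.0%}")
print(f"implied across all 50:      {float(overall):.0%}")
```

Under those assumptions the visible track record is 80%, while the rate across everything proposed is 32%. Different assumptions give different gaps, but the direction of the bias is the point.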
That is indeed pretty good! If the experts only call things utterly stupid when they are in fact utterly stupid, you’ve got it made. All you have to do is run your policies by the experts, have them explain why those 30 are utterly stupid, and then advocate for the remaining 20. If they’d call 40 of them stupid and only 30 are, now we have the problem that we are discarding 10 good ideas, but that still leaves 10 good ideas, 8 of which worked out. Sweet!
It’s certainly possible to think you’re better than you are this way, but it is far from inevitable. Still, it’s pretty immodest to claim “I generate ideas that, conditional on being implemented, have an 80% chance of working.” Provided that far less than 80% of similar implemented policies work, at least.
Advocating strongly for a policy that would work in the worlds in which it could possibly get implemented is a good idea even if most of your policies would be disastrous. I can’t think of a source of good ideas that doesn’t mostly generate bad ideas until it encounters criticism, but the process working at all seems like a hugely immodest claim.
I don’t understand what you mean by “conditional on being implemented.” Do you mean that each policy is implemented regardless of whether that is possible, and then out of these worlds we count the ones that have gotten better relative to their controls? Or do you mean that we find the possible worlds in which the policy is implemented, compare each to a similar possible world in which it is not, and determine whether there is a positive correlation between “has Policy X implemented” and “is a world with Y utilons”? The former doesn’t seem right, but in context the latter doesn’t seem to fit.
The advanced answer to this is to create conditional prediction markets. For example: a market for whether or not the Bank of Japan implements a policy, a market for the future GDP or inflation rate of Japan (or whatever your preferred metric is), and a conditional market for (GDP given policy) and (GDP given no policy).
Then people can make conditional bets as desired, and you can report your track record, and so on. Without a prediction market you can’t, in general, solve the problem of “how good is this prediction track record really” except by looking at it in detail and making judgment calls.
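As a rough sketch of how such a conditional bet might settle, assuming the common design in which a bet is simply refunded when its condition never happens: the function names, the binary "metric beats a threshold" contract, and the prices below are all hypothetical, and real markets differ in fees and mechanics.

```python
def settle_conditional_bet(stake, price, condition_occurred, outcome_occurred):
    """Payout of a binary bet that is only live if the condition occurs.

    stake: amount paid for contracts priced at `price` (0 < price < 1).
    condition_occurred: e.g. "the central bank implemented the policy".
    outcome_occurred: e.g. "the chosen metric beat the agreed threshold".
    Assumption: if the condition fails, the bet is voided and refunded.
    """
    if not 0 < price < 1:
        raise ValueError("price must be strictly between 0 and 1")
    if not condition_occurred:
        return stake  # voided: the bettor gets the stake back
    # Each contract cost `price` and pays 1 if the outcome happened.
    return stake / price if outcome_occurred else 0.0


def implied_effect(price_given_policy, price_given_no_policy):
    """Market-implied change in outcome probability attributed to the policy."""
    return price_given_policy - price_given_no_policy


# Hypothetical prices, for illustration only.
print(implied_effect(0.75, 0.5))
print(settle_conditional_bet(10.0, 0.5, condition_occurred=False, outcome_occurred=True))
print(settle_conditional_bet(10.0, 0.5, condition_occurred=True, outcome_occurred=True))
```

The refund rule is what makes the track record interpretable: a forecaster is only scored on the branch of the world that actually happened, while the gap between the two conditional prices is on record either way.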