How to predict if bombing ISIS in Syria is a good idea:
Draw up a comprehensive spreadsheet of every ‘Western’ intervention (and almost-but-not-quite-intervention) in a foreign country.
Rate each case by how similar it is to the present case (e.g. location, how long ago it was, civil war vs no civil war, religious war vs non-religious war, how many countries support the intervention, cultural differences between the countries involved, level of involvement, etc.).
Rate how much each intervention (or decision not to intervene) helped or hurt the situation, in retrospect, on a scale from −10 to +10.
Take a weighted average.
If on average intervention makes things worse, do nothing. If it makes things better, decide if the level of improvement created in such cases is worth the cost in dollars and dead people.
- Robert Wiblin
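Taken literally, the quoted procedure reduces to a similarity-weighted average. A minimal sketch, in which every case name, similarity weight, and outcome rating is an invented placeholder rather than an actual assessment:

```python
# A literal sketch of the quoted procedure. All inputs are invented
# placeholders, not real assessments.

cases = [
    # (case, similarity to the present case in [0, 1],
    #  retrospective outcome rating on the -10..+10 scale)
    ("Kosovo 1999",                    0.4, +3),
    ("Iraq 2003",                      0.7, -8),
    ("Libya 2011",                     0.9, -5),
    ("Rwanda 1994 (non-intervention)", 0.5, -9),
]

def weighted_average(cases):
    """Similarity-weighted mean of the outcome ratings."""
    total_weight = sum(sim for _, sim, _ in cases)
    return sum(sim * outcome for _, sim, outcome in cases) / total_weight

score = weighted_average(cases)
print(f"weighted average outcome: {score:+.2f}")  # -5.36 on this made-up data
```

A negative result would, per the quote's final step, counsel doing nothing; the sign of course hinges entirely on the invented inputs, which is rather the point the commenters press below.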
How do you plan to do this without counterfactual knowledge?
take your pick
It requires a good handle on experimental design, but biostatisticians do this day in day out. Hopefully risk analysts do this too in defense institutions.
The original quote said to rate each intervention by how much it helped or hurt the situation, i.e. its individual-level causal effect. None of those study designs will help you with that: They may be appropriate if you want to estimate the average effect across multiple similar situations, but that is not what you need here.
This is a serious question. How do you plan to rate the effectiveness of things like the decision to intervene in Libya, or the decision not to intervene in Syria, under profound uncertainty about what would have happened if the alternative decision had been made?
Yes, I concede that cross-level inferences between the aggregate level (the average of multiple similar situations) and individual-level causes have less predictive power than inferences within a single level. However, I reckon it’s the best available means of making such an inference.
Analysts have tools to model and simulate scenarios. Analysis of competing hypotheses is a staple of intelligence methodology. It’s also used by earth scientists, but I haven’t seen it used elsewhere. Based on this approach, analysts can:
make a prediction about outcomes in Libya both with and without intervention
when they choose to intervene or not to intervene, record the actual outcomes
over the long term, by comparing predicted and actual outcomes, decide whether to re-adjust their post-hoc predictions for the counterfactual branch
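A minimal sketch of that predict/observe/recalibrate loop, assuming forecasts are scored with a Brier score; the scenarios and probabilities are invented for illustration:

```python
# Minimal sketch of the predict/observe/recalibrate loop sketched above.
# All scenarios and probabilities are invented for illustration.

records = []  # (forecast probability of improvement, what actually happened)

def predict(p_improve_if_intervene, p_improve_if_not, intervene):
    """Step 1: forecast both branches; keep the forecast for the branch taken."""
    return p_improve_if_intervene if intervene else p_improve_if_not

def observe(forecast, improved):
    """Step 2: record the actual outcome on the branch that was taken."""
    records.append((forecast, 1.0 if improved else 0.0))

def brier_score():
    """Step 3: mean squared error of past forecasts; lower is better."""
    return sum((p - y) ** 2 for p, y in records) / len(records)

# Three past decisions, each forecast and then observed:
observe(predict(0.7, 0.4, intervene=True), improved=False)
observe(predict(0.6, 0.5, intervene=False), improved=True)
observe(predict(0.8, 0.2, intervene=True), improved=False)

print(f"Brier score: {brier_score():.3f}")
# A score near 0.25 is no better than always saying 50%; persistent scores
# like that are the signal to re-adjust how the forecasts (including the
# counterfactual branch) are generated.
```

Note that only the chosen branch ever gets scored; the counterfactual branch can only be re-adjusted indirectly, which is the weak point the next comment presses on.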
under profound uncertainty about what would have happened if the alternative decision had been made?

I’m not trying to downplay the level of uncertainty, just noting that the methodological considerations remain constant.
Just for completion, Anders_H is one of those guys.
How self-referentially absurd. More precisely, epidemiologists do this day in day out using biostatistical models, then applying causal inference (including the counterfactual-knowledge part). I said biostatisticians because epidemiology isn’t in the common vernacular. Ironically, counterfactual knowledge is, to those familiar with the distinction, distinctly removed from the biostatistical domain.
Just for the sake of intellectual curiosity, I wonder what kind of paradox was just invoked prior to this clarification.
It wouldn’t be the Epimenides paradox, since that refers to an individual making a self-referentially absurd claim:

The Epimenides paradox is the same principle as psychologists and sceptics using arguments from psychology claiming humans to be unreliable. The paradox comes from the fact that the psychologists and sceptics are human themselves, meaning that they state themselves to be unreliable.

Anyone?
Yes, Anders_H is a Doctor of Science in Epidemiology. He’s someone worth listening to when he tells you what can and can’t be done with experimental design.
Oooh, an appeal to authority. If that is the case, he is no doubt highly accomplished. However, that need not translate to blind deference.
This is a text conversation, so rhetorical questions aren’t immediately apparent. Moreover, we’re in a community that explicitly celebrates reason over other modes of rhetoric. So I interpreted his question about counterfactual conditions as sincere rather than disingenuous.
Yes, but if you disagree you can’t simply point to
biostatisticians do this day in day out
and a bunch of Wikipedia articles; you have to actually argue the merits of why you think those techniques can be used in this case.

That is a tendentious way of comparing the two: a cold, abstract “level of improvement” against the more concrete “dollars” and very concrete “dead people”. It suggests the writer is predisposed to find that intervention is a bad idea.
But what is improvement, but resources then available to apply to better things, and live people living better lives?
And why the reference class “Western”?
Presumably, Wiblin is talking about Western bombing of ISIS in Syria. If one finds that Turkish interventions have been effective and American interventions haven’t, say, then that’s an argument that Americans shouldn’t intervene now (but Turks should).
Choose your reference class, get the result you want. Is Turkey “Western” or not? It wants to join the EU (but hasn’t been admitted yet). Russia is bombing Syria. Why exclude it from the class of foreign interventions? For that matter, I don’t know what military actions, if any, Turkey has taken in Syria, but that would also be a foreign intervention.
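That reference-class sensitivity is easy to make concrete: on the same (entirely invented) data, the class definition, not the data, drives the conclusion:

```python
# Toy illustration of reference-class sensitivity. The actors and outcome
# ratings are invented; only the class definition changes between runs.

interventions = [
    ("USA",    -6),   # (actor, outcome rating on the quote's -10..+10 scale)
    ("France", -2),
    ("Turkey", +5),
    ("Russia", +4),
]

def average_for(actors):
    """Mean outcome over the interventions whose actor is in the class."""
    scores = [score for actor, score in interventions if actor in actors]
    return sum(scores) / len(scores)

narrow = average_for({"USA", "France"})                     # Turkey, Russia excluded
broad = average_for({"USA", "France", "Turkey", "Russia"})  # all foreign actors

print(f"narrow 'Western' class: {narrow:+.2f}")  # -4.00: intervention looks harmful
print(f"all foreign actors:     {broad:+.2f}")   # +0.25: intervention looks mildly helpful
```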
Not to mention the smallness of N in the proposed study and the elasticity of the assessments.
I googled some of the phrases in the OP but only got hits to the OP. Is this even a quote?
Rating each decision on a scale of −10 to +10 and then taking a weighted average is a recipe for biasing the result against intervention, since you’ve created a hard upper limit for how much you count an intervention as helping, so you’ll count a successful intervention as +10 and be unable to count a successful intervention that does even more good as more than +10. (This has a similar problem at the low end of the scale, but that doesn’t affect the final result since you can’t go below zero intervention.)
This also produces bad results in cases where the intervention failed because it was insufficient. You’d end up concluding that intervention is bad when it may just be that insufficient intervention is bad. This method has clause 2 to cover similarity of case, but not similarity of intervention, and at any rate “similarity” is a fuzzy concept. If bombing half the country is a disaster and bombing a whole country succeeds, is bombing half a country “similar” to bombing a whole country? (Actually, you usually end up compressing all the dispute over intervention into a dispute over how similar two cases are.)
And it’s generally a bad idea to put on a numerical scale things that you can’t actually measure numerically. It gives a false appearance of accuracy and precision, like a company executive who wants to see figures for his company improve but doesn’t actually care where the figures come from.
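The ceiling effect described above can be shown with toy numbers (all invented): a single runaway success capped at the top of the scale drags the average toward “intervention doesn’t help”:

```python
# Toy numbers for the ceiling effect (all invented). Suppose the "true"
# benefit of five interventions on an unbounded scale:
true_outcomes = [-8, -5, 2, 9, 30]   # one intervention did enormous good

def clip(x, lo=-10, hi=10):
    """Force a rating onto the bounded scale."""
    return max(lo, min(hi, x))

true_mean = sum(true_outcomes) / len(true_outcomes)
clipped_mean = sum(clip(x) for x in true_outcomes) / len(true_outcomes)

print(f"mean of true outcomes:   {true_mean:+.1f}")    # +5.6
print(f"mean of clipped ratings: {clipped_mean:+.1f}")  # +1.6
# Capping the +30 success at +10 drags the average toward "intervention
# doesn't help", which is exactly the bias described above.
```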
Also, “level of improvement created” is subject to noise. It is possible for an improvement to be wiped out for reasons unrelated to the effectiveness of the intervention, like if the country gets hit by a meteor the next day (or, more realistically, gets invaded or attacked the next day).
Basically one huge problem here is that there isn’t enough data compared to the number of variables involved.
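A toy demonstration of the data-versus-variables problem: with as many free parameters as cases, a model “explains” pure noise exactly, so a perfect fit carries no information. The numbers are invented:

```python
# Toy demo of "not enough data for the number of variables": a model with
# as many free parameters as cases fits pure noise exactly.
import random

random.seed(1)

xs = [1.0, 2.0, 3.0]                    # three "cases"
ys = [random.gauss(0, 1) for _ in xs]   # three outcomes: pure noise

def lagrange_fit(x):
    """The unique quadratic (3 coefficients) through the three (x, y) points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

residual = max(abs(lagrange_fit(x) - y) for x, y in zip(xs, ys))
print(f"max residual on the data: {residual:.1e}")  # ~0: noise "explained" perfectly
print(f"'prediction' at x = 4: {lagrange_fit(4.0):+.2f}")  # meaningless extrapolation
```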
Not to mention that this is a problem in what Taleb would call Extremistan, i.e., the distributions of possible outcomes from intervening, or not intervening, are fat-tailed and include a lot of rare possibilities that haven’t yet shown up in the data at all.
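One way to see the fat-tail point: small samples from a heavy-tailed distribution usually contain none of the rare outcomes that dominate the expectation. The Pareto distribution below is an arbitrary stand-in, not a claim about the actual outcome distribution:

```python
# Sketch of the Extremistan point: with fat tails, a small historical sample
# usually misses the rare outcomes that dominate the expectation.
# The Pareto distribution (alpha = 1.5) is an arbitrary stand-in.
import random

random.seed(0)

ALPHA = 1.5                      # tail index; true mean = alpha / (alpha - 1)
TRUE_MEAN = ALPHA / (ALPHA - 1)  # = 3.0

def pareto_sample(n):
    """Draw n values from a Pareto(alpha) distribution by inversion."""
    return [(1 - random.random()) ** (-1 / ALPHA) for _ in range(n)]

small = pareto_sample(30)  # a "historical record" the size of the spreadsheet
print(f"true mean: {TRUE_MEAN:.1f}, mean of 30 sampled cases: "
      f"{sum(small) / len(small):.2f}")
# Most 30-case samples fall well short of 3.0, because the draws that carry
# the expectation are rare enough to be absent from a sample this small.
```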