# When is it appropriate to use statistical models and probabilities for decision making?

I enjoy reading rationalist and effective altruist blogs. Members of the community usually back their arguments with data and/or other evidence and tend to be scientifically literate. And although no one is a perfect, detached belief updater, I have found that rationalists are probably the community closest to that ideal. However, I believe this community often commits a fallacy while attempting to think rationally: applying cost-benefit analysis under deep uncertainty. What is deep uncertainty? I found a great definition in the textbook Decision Making under Deep Uncertainty by Marchau et al., so I’m simply going to quote it:

Complete certainty is the situation in which we know everything precisely. This is almost never attainable, but acts as a limiting characteristic at one end of the spectrum.

Level 1 uncertainty represents situations in which one admits that one is not absolutely certain, but one does not see the need for, or is not able, to measure the degree of uncertainty in any explicit way (Hillier and Lieberman 2001, p. 43). These are generally situations involving short-term decisions, in which the system of interest is well defined and it is reasonable to assume that historical data can be used as predictors of the future. Level 1 uncertainty, if acknowledged at all, is generally treated through a simple sensitivity analysis of model parameters, where the impacts of small perturbations of model input parameters on the outcomes of a model are assessed. Several services in our life are predictable, based on the past such as mail delivery and garbage collection. These are examples of this level of uncertainty.

In the case of Level 2 uncertainties, it is assumed that the system model or its inputs can be described probabilistically, or that there are a few alternative futures that can be predicted well enough (and to which probabilities can be assigned). The system model includes parameters describing the stochastic—or probabilistic—properties of the underlying system. In this case, the model can be used to estimate the probability distributions of the outcomes of interest for these futures. A preferred policy can be chosen based on the outcomes and the associated probabilities of the futures (i.e., based on “expected outcomes” and levels of acceptable risk). The tools of probability and statistics can be used to solve problems involving Level 2 uncertainties. Deciding on which line to join in a supermarket would be a Level 2 problem.

Level 3 uncertainties involve situations in which there are a limited set of plausible futures, system models, outcomes, or weights, and probabilities cannot be assigned to them—so the tools of neither Level 1 nor Level 2 are appropriate. In these cases, traditional scenario analysis is usually used. The core of this approach is that the future can be predicted well enough to identify policies that will produce favorable outcomes in a few specific, plausible future worlds (Schwartz 1996). The future worlds are called scenarios. Analysts use best-estimate models (based on the most up-to-date scientific knowledge) to examine the consequences that would follow from the implementation of each of several possible policies in each scenario. The “best” policy is the one that produces the most favorable outcomes across the scenarios. (Such a policy is called robust.) A scenario does not predict what will happen in the future; rather it is a plausible description of what can happen. The scenario approach assumes that, although the likelihood of the future worlds is unknown, the range of plausible futures can be specified well enough to identify a (static) policy that will produce acceptable outcomes in most of them. Leaving an umbrella in the trunk of your car in case of rain is an approach to addressing Level 3 uncertainty.

Level 4 uncertainty represents the deepest level of recognized uncertainty. A distinction can be made between situations in which we are still able (or assume) to bound the future around many plausible futures (4a) and situations in which we only know that we do not know (4b). This vacuum can be due to a lack of knowledge or data about the mechanism or functional relationships being studied (4a), but this can also stem from the potential for unpredictable, surprising, events (4b). Taleb (2007) calls these events “black swans.” He defines a black swan event as one that lies outside the realm of regular expectations (i.e., “nothing in the past can convincingly point to its possibility”), carries an extreme impact, and is explainable only after the fact (i.e., through retrospective, not prospective, predictability). In these situations, analysts either struggle to (Level 4a) or cannot (Level 4b) specify the appropriate models to describe interactions among the system’s variables, select the probability distributions to represent uncertainty about key parameters in the models, and/or value the desirability of alternative outcomes.

Total ignorance is the other extreme from determinism on the scale of uncertainty; it acts as a limiting characteristic at the other end of the spectrum.

As you can see, the textbook distinguishes 4 levels of uncertainty. Statistical models and probabilities are considered useful up until level 2 uncertainty.

Starting at level 3 the system is considered too uncertain to assign probabilities to scenarios, but the number of scenarios is limited. An example of level 3 uncertainty is, perhaps, the possibility of conflict between two neighboring states. Even though it can be impossible to assign a probability to the event of war because of the immense number of factors that come into play, we know the two possibilities are “war” and “no war”. It is thus possible to take an action that leads to the best results across both scenarios.
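To make the contrast concrete, here is a minimal Python sketch of the war / no-war example. The two actions and all payoff numbers are invented for illustration; they do not come from the textbook. The Level 2 approach needs a probability of war to compute an expected value, while the Level 3 approach only needs the list of scenarios. I use maximin (pick the action with the least bad worst case) as the robustness criterion, which is just one of several possible choices.

```python
# Hypothetical payoffs (arbitrary utility units) for two invented actions
# under the two scenarios from the example above. All numbers are assumptions.
payoffs = {
    "stockpile_supplies": {"war": -20, "no_war": -5},
    "do_nothing": {"war": -100, "no_war": 0},
}


def expected_value_choice(payoffs, probabilities):
    """Level 2 style: requires a probability for every scenario."""
    def expected(action):
        return sum(probabilities[s] * u for s, u in payoffs[action].items())
    return max(payoffs, key=expected)


def maximin_choice(payoffs):
    """Level 3 style: pick the action whose worst case is least bad.

    No probabilities are needed, only the list of scenarios.
    """
    return max(payoffs, key=lambda action: min(payoffs[action].values()))


# The expected-value answer depends on the assumed probability of war,
# which is exactly the number we said we cannot estimate.
print(expected_value_choice(payoffs, {"war": 0.01, "no_war": 0.99}))
print(expected_value_choice(payoffs, {"war": 0.20, "no_war": 0.80}))
print(maximin_choice(payoffs))
```

With these made-up payoffs, the expected-value recommendation flips between the two assumed probabilities, whereas the maximin choice does not depend on any probability at all.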

In level 4a, the number of scenarios is large. In level 4b, we have no idea what the scenarios are or how many there are. Level 4 is usually what is called deep uncertainty.

I believe many organisations and thinkers use level 2 methods in level 3 or level 4 contexts. Here is an example. In this blog post, rationalist-adjacent economist Bryan Caplan argues that a cost-benefit analysis of climate change action shows that it might actually be less costly to do nothing. But climate change is a case of deep uncertainty, and cost-benefit analysis does not apply. There are many unknown unknowns, and therefore estimates of the costs of climate change damage are neither reliable nor valid. I enjoy reading Caplan, but in this case I think his argument leads to a false sense of certainty.

Another example of this fallacy, in my opinion, is 80,000 Hours’ ranking of urgent world issues. For example, they consider AI risk a more pressing issue than climate change. Although I admit that is possible, I don’t think their justification for that belief is valid. All of the systemic risks they attempt to rank involve unknown unknowns and are thus not quantifiable. One also has to keep in mind that we are very likely exposed to other systemic risks that we do not yet know about.

My goal with this blog post is not to give a lecture on decision making under uncertainty. Firstly, because I don’t consider myself knowledgeable enough in this domain yet and haven’t finished reading the book. Secondly, because I think the book is excellent (and freely available!) and I won’t do a better job than the authors in teaching you their subject. My goal is to raise awareness about this problem in the rationalist community in order to improve discourse about systemic risks and decision making under uncertainty.

As a data scientist, I believe it is important to be aware of the limits of the discipline. To date, I think our quantitative methods are unable to fully inform decisions under deep uncertainty. But I am hopeful that progress will be made. Perhaps we can develop reinforcement learning agents capable of evolving in environments of deep uncertainty?

• “All models are wrong, but some are useful.”

The real question is “what else do you have to make decisions with?” Your intuition is a model. Brainstorming then picking the best-sounding is a model. Strict application of rules is a model. Some are more legible than others, and some are more powerful at handling out-of-domain questions. Some, importantly, SEEM more powerful BECAUSE they’re illegible, and you can’t tell how bad they are. One very strong mechanism for improving them is to quantify. Even if you know it’s not precise, it adds a lot of analytical power to testing the plausibility of the results before committing a lot of resources and seeing what actually happens.

It’s provably wrong to believe that any formal model is complete and consistent. But it’s even more wrong to throw out modeling methodologies for making many, perhaps most, decisions.

A few examples of topics where you really don’t include any cost/​benefit estimates in your decision (as opposed to strawman examples of INCORRECT cost/​benefit use) would go a long way.

• Your intuition is a model.

Sure, you can use a broad definition of “model” to include any decision making process. But I used the word model to refer to probabilistic and quantitative models.

A few examples of topics where you really don’t include any cost/​benefit estimates in your decision (as opposed to strawman examples of INCORRECT cost/​benefit use) would go a long way.

Sure. An example from my life is “I refrain from investing in the stock market because we do not understand how it works and it is too uncertain”. I don’t rely on cost-benefit analysis in this case. It is more of a qualitative analysis. I do not use cost-benefit analysis because I am unable to quantify the expected utility I would derive from investing in the stock market. I do not have the necessary information to compute it.

• While I agree that the Efficient Market Hypothesis basically means you shouldn’t pick stocks, indexes like the S&P 500 are pretty good to invest in due to you getting the risk-free rate. That’s usually around 7% long term. Focus on long-term growth, and don’t time the market. You can invest, as long as you are willing to hold an index for decades.

• I know about index funds. Even those are not nearly as safe as people think. It is a fallacy to assume that because the S&P 500 grows 7% a year on average, you will get a 7% per year return on your investment. Your true expected return is lower than that. People have a hard time predicting how they will behave in particular situations. They swear they won’t sell after a crash, and yet they do. You might say you are not like that, but probabilistically speaking you probably are. You might get sick, need cash quickly, and have to sell while the market is down. You might need to buy a house because of an unexpected child. The fact that the group gets a 7% return does not mean that an individual will get a 7% return in the long run. This is called the ergodicity fallacy. There are also tracking error and fees, depending on your broker.
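The gap between the group average and an individual’s experience can be illustrated with a toy multiplicative bet. The sketch below uses made-up numbers (wealth is multiplied by 1.5 or by 0.6 each period, with equal odds), not real market data. It only shows that the average across many investors can grow while a single investor who keeps playing tends to lose money over time.

```python
import math

# Toy assumption: each period, wealth is multiplied by 1.5 or 0.6
# with equal probability. These numbers are illustrative, not market data.
UP, DOWN = 1.5, 0.6


def ensemble_growth_per_period(up=UP, down=DOWN):
    """Average growth factor across many investors in a single period."""
    return 0.5 * up + 0.5 * down


def time_average_growth_per_period(up=UP, down=DOWN):
    """Long-run per-period growth factor for one investor who keeps playing.

    Over many periods roughly half the outcomes are `up` and half `down`,
    so the per-period factor is the geometric mean of the two.
    """
    return math.sqrt(up * down)


print(ensemble_growth_per_period())      # arithmetic mean of the two factors
print(time_average_growth_per_period())  # geometric mean of the two factors
```

In this toy setting the arithmetic mean is above 1 while the geometric mean is below 1, so “the group grows on average” and “a typical individual shrinks over time” are both true at once.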

• I think you haven’t really responded to Dagon’s key point here:

“what else do you have to make decisions with?”

You express concern about Caplan underestimating the importance of climate change. What if I think the risk of the Large Hadron Collider collapsing the false vacuum is a much bigger deal, and that any resources currently going to reduce or mitigate climate change should instead go to preventing false vacuum collapse? Both concerns have lots of unknown unknowns. On what grounds would you convince me (or a decisionmaker controlling large amounts of money) to focus on climate change instead? Presumably you think the likelihood of catastrophic climate change is higher. On what basis?

Probabilistic models may get weaker as we move toward deeper uncertainty, but they’re what we’ve got, and we’ve got to choose how to direct resources somehow. Even under level 3 uncertainty, we don’t always have the luxury of seeing a course of action that would be better in all scenarios (e.g. I think we clearly don’t in my example: if we’re in the climate-change-is-higher-risk scenario, we should put most resources toward that; if we’re in the vacuum-collapse-is-higher-risk scenario, we should put our resources there instead).

• One does not have to know what might happen in order to accurately predict how it would affect a one-dimensional scale of utility, though. It seems intuitively to me, someone who admittedly doesn’t know much about the subject, like unknown unknowns could be accurately modeled most of the time with a long-tailed normal distribution, and one could use evidence from past times that something totally unexpected happened throughout human history (and there were plenty) to get a sense of what that distribution ought to look like.

• it seems intuitively to me, someone who admittedly doesn’t know much about the subject, like unknown unknowns could be accurately modeled most of the time with a long-tailed normal distribution

How fat-tailed do you make it? You said you use past extreme events to choose a distribution. But what if the past largest event is not the largest possible event? What if the past does not predict the future in this case?

You can say “the largest truck that ever crossed my bridge was 20 tons, therefore my bridge has to be able to sustain 20 tons”, but that is a logical fallacy. The fact that the largest truck you ever saw was 20 tons does not mean a 30 ton truck could not come by one day. This amounts to saying “I have observed the queen of England for 600 days and she hasn’t died in any of them, therefore the queen of England will never die”.
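The bridge argument can be checked with a small simulation. The sketch below assumes truck weights are independent draws from a heavy-tailed Pareto distribution. That is an assumption made purely for illustration, and the shape parameter and sample sizes are arbitrary. It counts how often the heaviest truck of the next 600 days exceeds the record set in the first 600 days.

```python
import random


def record_broken_fraction(n_days=600, n_trials=2000, shape=1.5, seed=0):
    """Fraction of trials in which the next `n_days` contain a value
    larger than the maximum of the first `n_days`.

    Weights are i.i.d. Pareto draws; `shape` and the sizes are assumptions.
    """
    rng = random.Random(seed)
    broken = 0
    for _ in range(n_trials):
        past_max = max(rng.paretovariate(shape) for _ in range(n_days))
        future_max = max(rng.paretovariate(shape) for _ in range(n_days))
        if future_max > past_max:
            broken += 1
    return broken / n_trials


print(record_broken_fraction())
```

Even in this friendly setting, where the distribution never changes, symmetry implies the past record is broken about half the time over an equally long future window. If the distribution itself can shift, which is the deep uncertainty case, the past maximum tells you even less.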