The Nature of Counterfactuals

I’m finally beginning to feel that I have a clear idea of the true nature of counterfactuals. In this post I’ll argue that counterfactuals are intrinsically part of how we make sense of the world. However, it would be inaccurate to present them as purely a human invention, as we were shaped by evolution in such a way as to ground these conceptions in reality.

Unless you’re David Lewis, you’re probably going to be rather dubious of the claim that all possibilities exist (i.e. that counterfactuals are ontologically real). Instead, you’ll probably be willing to concede that they’re something we construct; that they’re in the map rather than in the territory.

Things in the map are tools: they are constructed because they are useful. In other words, they are constructed for a purpose, or a number of purposes. So what is the purpose (or purposes) of counterfactuals?

I first raised this question in Counterfactuals are an Answer, Not a Question and I struggled with it for around a year. Eventually, I realised that a big part of the challenge is just how abstract the question is. So I replaced it with something more concrete: “Why don’t agents construct crazy counterfactuals?” One example would be expecting the world to explode if I made this post. Another would be filling in the future with randomly generated events. Why shouldn’t I do either of these?

I’ll make a modest claim: it’s not about aesthetics. We don’t construct counterfactuals because we want them to be pretty or funny or entertaining. We want them to be useful. The reason we don’t just construct counterfactuals in a silly or arbitrary manner is that we believe, in some vague sense, that doing so would lead to sub-optimal outcomes, or at least to sub-optimal outcomes in expectation.

I suspect most people will agree that the answer must be something along these lines, but I’ve hardly defined it very precisely. So let’s attempt to clarify. To keep this discussion as general as possible, note that we could have stated similar sentiments in terms of achieving good outcomes, avoiding bad outcomes, or achieving better outcomes. But regardless of how we word it, we’re comparing worlds and deciding that one is better than another. It’s not a matter of considering just one world and comparing it to a standard, because we can’t produce such a standard without constructing a world non-identical to the first.

Essentially, we conceive of certain worlds as being possible, then we consider the expected value, the median outcome, or some other metric over these worlds, and finally we suggest that according to this metric agents constructing a sane theory of counterfactuals tend to do better than agents with crazy theories.
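To make this concrete, here’s a minimal toy sketch of the comparison I have in mind. Everything in it is hypothetical and invented for illustration: the worlds, the actions and the payoffs stand in for whatever the real conceived possibilities would be. A “sane” agent picks the action that scores best over the conceived worlds; a “crazy” agent acts on an arbitrary counterfactual (expecting one action to end the world), and the metric then tells us the sane agent does better.

```python
import random
from statistics import mean, median

random.seed(0)

def sample_world():
    """Sample a toy world: a mapping from actions to payoffs."""
    return {"work": random.gauss(10, 2), "rest": random.gauss(3, 2)}

# The set of worlds the agents conceive of as possible.
worlds = [sample_world() for _ in range(1000)]

def sane_choice(world_set):
    """Pick the action with the highest expected payoff over conceived worlds."""
    return max(["work", "rest"], key=lambda a: mean(w[a] for w in world_set))

def crazy_choice(world_set):
    """A crazy theory: imagines that 'work' makes the world explode."""
    return "rest"

# Evaluate each agent's chosen action by expected value and median outcome.
for choose in (sane_choice, crazy_choice):
    action = choose(worlds)
    payoffs = [w[action] for w in worlds]
    print(choose.__name__, action, round(mean(payoffs), 2), round(median(payoffs), 2))
```

Nothing hangs on the particular metric here; expected value, median, or something else would each tell the same story in this toy case.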

This naturally leads to another question: which worlds should we conceive of as being possible? Again, we can make this concrete by asking what would happen if we were to choose a crazy set of possible worlds (say, a world just like this one plus a world with unicorns and fountains of gold, and no other worlds). Again, the reason we wouldn’t do this is that we’d expect an agent building its decision theory on these possible worlds to perform poorly.

What do I mean by poorly? Well, once again it seems we’re conceiving of certain worlds as possible, imagining how agents that construct their decision theories based on different notions of possibility perform in these worlds, and utilising some kind of metric to evaluate performance.

So we’re back where we were before; that is, we’re going around in circles. Suppose an agent believes we should consider the set W of worlds as possible, and it constructs a decision theory on this basis. This agent will evaluate agents who adopt W in developing their decision theory as making optimal decisions, and it will evaluate agents who adopt a different set of worlds leading to a different decision theory as making sub-optimal decisions, except in the rare cases where this makes no difference. In other words, such an agent will reaffirm what it already believes about which worlds are possible.
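The circularity can be seen directly in a toy sketch (again, the worlds and payoffs here are made up purely for illustration). The evaluator itself is parameterised by a set of worlds, so an agent whose decision theory is built on W is always judged optimal by an evaluator that also uses W:

```python
from statistics import mean

# Each toy world maps actions to payoffs; a "theory" is just a set of worlds.
W = [{"a": 5, "b": 10}, {"a": 6, "b": 9}]   # one conception of what's possible
W_alt = [{"a": 5, "b": 0}]                  # a rival conception

def best_action(world_set):
    """The action a decision theory built on this world set recommends."""
    return max(["a", "b"], key=lambda act: mean(w[act] for w in world_set))

def evaluate(evaluator_worlds, agent_worlds):
    """Score an agent's recommended action using the evaluator's own worlds."""
    action = best_action(agent_worlds)
    return mean(w[action] for w in evaluator_worlds)

# An evaluator built on W judges the W-agent optimal and the rival sub-optimal:
print(evaluate(W, W))      # 9.5: the W-agent, judged by W's own lights
print(evaluate(W, W_alt))  # 5.5: the rival agent, judged by W's lights
```

There is no world-set-free standpoint inside this sketch from which to score the evaluators themselves; that is exactly the circle.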

You might think that the circularity is a problem, but circular epistemology turns out to be viable (see Eliezer’s Where Recursive Justification Hits Bottom). And while circular reasoning is less than ideal, if the alternative is eventually hitting a point where we can provide no justification at all, then circular justification might not seem so bad after all.

Kant theorised that certain aspects of phenomena are the result of intrinsic ways in which we interpret the world, and that it is impossible for us to step outside this perspective. He called this Transcendental Idealism and suggested that it provided a form of synthetic a priori knowledge supplying the basic assumptions we need to begin reasoning about the world (such as causation).

My approach is slightly different in that I’m using circular epistemology rather than synthetic a priori knowledge to provide a starting place for reason. By keeping our starting claims amenable to updating based on evidence, I avoid a particular problem with the Kantian approach that is best highlighted by Einstein’s Theory of Relativity. Namely, Kant claimed that space and time existed a priori, but experimental results were able to convince us otherwise, which should not be possible for an a priori result.

However, I agree with him that certain basic concepts are frames that we impose on the world due to our cognitive structure (in my case I’m focusing on the notion of possibility). I’m not picturing this as a straitjacket that is completely impossible to escape; indeed, these assumptions may be subsumed by something similar, as they were in the case of relativity. The point is more that to even begin reasoning we have to begin within a cognitive frame.

Imagine trying to do physics without being able to say things like “Imagine we have a frictionless ball with a mass of 1kg...”, mathematics without being able to entertain the truth of a proposition that may be false or to divide a problem into cases, or philosophy without being allowed to run thought experiments. Counterfactuals are such a basic concept that it makes sense to believe that they, or something very much like them, are a primitive.

Another aspect that adds to the merits of this theory: it is simple enough to be plausible (this seems to me like the kind of question that should have a simple answer), yet complicated enough to explain why it has been so surprisingly difficult to make progress on.

After writing this post I found myself in a strange position. I felt certain I had dramatically improved my conceptual understanding of counterfactuals, yet at the same time I struggled to see where to go from here in order to produce a concrete theory of counterfactuals, or even to articulate how this understanding helps in that regard at all.

A big part of the challenge for me is that I have almost no idea how we should handle circular epistemology in the general case. There are far too many different strategies one could attempt in order to produce something consistent.

That said, there are a few ideas I want to offer on how to approach this. The big challenge with counterfactuals is not imagining other states the universe could be in, or applying our “laws” of physics to discover the state of the universe at other points in time. Instead, the challenge comes when we want to construct a counterfactual representing someone making a different decision. After all, in a deterministic universe someone could only have made a different choice if the universe were different, but then it’s not clear why we would care that someone in a different universe would have achieved a particular score when we only care about this universe.

I believe the answer to this question will be, roughly, that in certain circumstances we only care about particular things. For example, suppose Omega is programmed in such a way that it would be impossible for Amy to choose box A without gaining 5 utility or to choose box B without gaining 10 utility. Assume that in the actual universe Amy chooses box A and gains 5 utility. We’re tempted to say “If she had chosen box B she would have gained 10 utility”, even though she would have had to occupy a different mental state at the time of the decision and the past would have been different, because the model has been set up so that those factors are unimportant. Since those factors are the only difference between the state where she chooses A and the state where she chooses B, we’re tempted to treat these possibilities as the same situation.
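A minimal sketch of the Amy example might look as follows (the state representation is my own, invented for illustration, not anything the setup fixes). The payoff function is deliberately defined over the choice alone, so any two full world-states that agree on the choice collapse into the same counterfactual situation:

```python
# The payoff function is defined over the choice alone; mental state and
# history are part of the full world-state but deliberately ignored.
PAYOFF = {"A": 5, "B": 10}

def payoff(choice, mental_state, past):
    # mental_state and past are unused: the model treats them as unimportant,
    # so world-states differing only in them count as the same situation.
    return PAYOFF[choice]

actual = payoff("A", mental_state="inclined-to-A", past="history-1")
counterfactual = payoff("B", mental_state="inclined-to-B", past="history-2")
print(actual, counterfactual)  # 5 10: "had she chosen B, she'd have gained 10"
```

The counterfactual claim comes out true in this model precisely because the model quotients out everything except the choice.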

So naturally this leads to a question: why should we build a model where those particular factors are unimportant? Does this lead to pure subjectivity? Well, the answer seems to be that in practice such a heuristic tends to work well: agents that ignore such factors tend to perform nearly as well as agents that account for them, and often better once we include time pressure in our model, as sketched below.
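Here is one crude, hypothetical way to model the time-pressure point (the factor names, deadline, and payoffs are all invented for illustration): if evaluating each factor costs a time step and only the choice itself affects the payoff, an agent that insists on modelling every factor can run out of time before it ever reaches the one that matters.

```python
# Hypothetical model: evaluating each factor costs one time step, but only
# the choice itself affects the payoff.
IRRELEVANT = [f"mental_state_{i}" for i in range(5)] + ["past_history"]
DEADLINE = 3  # time steps available before the agent must act

def act(factors_to_consider):
    """An agent does well only if it reaches the decision-relevant factor in time."""
    considered = factors_to_consider[:DEADLINE]
    return 10 if "choice" in considered else 0

full_modeller = IRRELEVANT + ["choice"]  # accounts for every factor first
heuristic_agent = ["choice"]             # ignores the "unimportant" factors

print(act(full_modeller))    # 0: ran out of time before reaching the choice
print(act(heuristic_agent))  # 10: the cheap heuristic does at least as well
```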

This is the point where the nature of counterfactuals becomes important: whether they are ontologically real or merely a way in which we structure our understanding of the universe. If we’re looking for something ontologically real, the fact that we happen to have such a heuristic feels unimportant to understanding counterfactuals; on the other hand, if they’re a way of structuring our understanding, then we’re probably aiming to produce something consistent out of our intuitions and our experience of the universe. From this perspective, this heuristic is something we’d want to take into account.

I suspect that with a bit more work this kind of account could be enough to get a circular epistemology off the ground.