The One Mistake Rule

Epistemic Status: The Bed of Procrustes

If a model gives a definitely wrong answer anywhere, it is useless everywhere.

This principle is doubtless ancient, and has doubtless gone by many names with many different formulations.

All models are wrong. That does not make them useless. What makes them useless is when they are giving answers that you know are definitely wrong. You need to fix that, if only by having the model more often spit out “I don’t know.”

One example of saying “I don’t know,” taken from the comments: if you want to use Newtonian physics, you need to be aware that it gives wrong answers in relativistic situations, and therefore slightly wrong answers everywhere else, and you need to introduce the relevant error bars.
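To make the Newtonian example concrete, here is a minimal sketch of a model that attaches an error bar and says “I don’t know” once the error gets too large. The function name, the 1% tolerance, and the choice of kinetic energy as the quantity are all assumptions made for illustration, not anything from the original discussion.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def newtonian_kinetic_energy(mass_kg, speed_m_s, max_rel_error=0.01):
    """Return (energy_joules, relative_error), or None for "I don't know".

    The relative error is how far the Newtonian answer falls short of the
    relativistic one, (gamma - 1) * m * c^2. The 1% default is arbitrary.
    """
    speed = abs(speed_m_s)
    if speed >= C:
        return None  # Newtonian mechanics has nothing sensible to say here
    beta_sq = (speed / C) ** 2
    s = math.sqrt(1.0 - beta_sq)  # this is 1 / gamma
    # newtonian / relativistic = s * (1 + s) / 2, so the shortfall is:
    rel_error = 1.0 - s * (1.0 + s) / 2.0
    if rel_error > max_rel_error:
        return None  # outside the range where the model can be trusted
    return 0.5 * mass_kg * speed ** 2, rel_error


# A car at highway speed: fine, with a negligible error bar.
print(newtonian_kinetic_energy(1000.0, 30.0))
# Half the speed of light: the model declines to answer.
print(newtonian_kinetic_energy(1000.0, 0.5 * C))
```

The point of the design is that the refusal lives inside the model, so nobody downstream has to remember where it stops working.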

Of course, a wrong prediction of what is probably going to happen is not definitely wrong, in this sense. An obviously wrong probability is definitely wrong no matter the outcome.

This particular version of the principle originated when a partner and I were, as part of an ongoing campaign of wagering, attempting to model the outcomes of sporting events.

He is the expert on sports. I am the expert on creating models and banging on databases and spreadsheets. My specialty was assuming the most liquid sports betting market odds were mostly accurate, and extrapolating what that implied elsewhere.
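As a rough illustration of treating market odds as mostly accurate, the sketch below turns decimal odds into implied probabilities. The post does not describe the actual method used; proportional normalization to strip out the bookmaker's margin is simply one common convention, and the function name is hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for mutually exclusive outcomes to probabilities.

    Assumption: the bookmaker's margin is spread proportionally across
    outcomes, so dividing by the total removes it. Other conventions exist.
    """
    if any(odds <= 1.0 for odds in decimal_odds):
        raise ValueError("decimal odds must be greater than 1.0")
    raw = [1.0 / odds for odds in decimal_odds]
    overround = sum(raw)  # exceeds 1.0 when the book takes a margin
    return [r / overround for r in raw]


# A two-way market priced evenly with a margin on each side.
print(implied_probabilities([1.91, 1.91]))
# A favorite and an underdog.
print(implied_probabilities([1.5, 2.8]))
```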

First we would talk and he would explain how things worked. Then I would look at the data lots of different ways and create a spreadsheet that modeled things. Then, he would vary the inputs to that spreadsheet until he got it to give him a wrong answer, or at least one that seemed wrong to him.

Then he’d point out the wrong answer and explain why it was definitely wrong. I could either argue that the answer was right and change his mind, or I could accept that it was wrong and go back and fix the model. Then the cycle repeated until he couldn’t find a wrong answer.
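The cycle can be partly automated. The sketch below sweeps a grid of inputs through a model and collects every answer that fails a basic sanity check. Both models and all three checks are invented for illustration (the first is a toy Elo-style logistic curve); a real expert will find wrong answers that no fixed checklist anticipates.

```python
from itertools import product


def win_probability(rating_a, rating_b, scale=400.0):
    """Toy Elo-style logistic model: chance that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def broken_model(rating_a, rating_b):
    """Linear shortcut that looks fine for close matchups only."""
    return 0.5 + (rating_a - rating_b) / 1000.0


def find_wrong_answers(model, ratings, tol=1e-9):
    """Try every pairing and return (a, b, complaint) for each failure."""
    problems = []
    for a, b in product(ratings, repeat=2):
        p = model(a, b)
        if not 0.0 <= p <= 1.0:
            problems.append((a, b, "not a probability"))
        if abs(p + model(b, a) - 1.0) > tol:
            problems.append((a, b, "the two sides do not sum to 1"))
        if a > b and p < 0.5:
            problems.append((a, b, "stronger side is the underdog"))
    return problems


ratings = [1000, 1500, 2000]
print(find_wrong_answers(win_probability, ratings))  # nothing found
print(find_wrong_answers(broken_model, ratings))     # fails at the extremes
```

The broken model passes every check on close matchups and only fails when the inputs are pushed to extremes, which is exactly why varying the inputs matters.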

Until this cycle stopped, we did not use the new model for anything at all, anywhere, no matter what. If a new wrong answer was found, we stopped using the model in question until we resolved the problem.

Two big reasons:

If we did use the model, even if it was only wrong in this one place, then the one place it was wrong would be the one place we would disagree with the market. Fools and their money would soon be parted.

Also, if the model was obviously wrong here, there’s no reason to trust anything else the model says, either. Fix your model.

This included extremely unlikely, tail-risk-style events. If you can’t predict the probability of such events in a reasonable way, then even if those outliers won’t bankrupt you directly, you’re going to get the overall distributions wrong.

This also includes the change in predictions between different states of the world. If your model predictably doesn’t agree with itself over time, or changes its answer based on things that can’t plausibly matter much, then it’s wrong. Period. Fix it.
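One way to test for predictable disagreement over time: the probability a model gives now should equal the average of the probabilities it would give after the next event, weighted by how likely each outcome of that event is. The sketch below applies that check to a toy series in which each game is won independently with probability p. Both models are hypothetical, and the naive one is deliberately wrong.

```python
def consistent_model(a, b, p=0.5):
    """P(we win the series) when we need a more wins and they need b."""
    if a == 0:
        return 1.0
    if b == 0:
        return 0.0
    return p * consistent_model(a - 1, b, p) + (1 - p) * consistent_model(a, b - 1, p)


def naive_model(a, b, p=0.5):
    """Plausible-looking shortcut: share of remaining wins the other side needs."""
    if a == 0:
        return 1.0
    if b == 0:
        return 0.0
    return b / (a + b)


def time_inconsistency(model, a, b, p=0.5):
    """Gap between today's answer and the expected answer after one more game.

    Requires a >= 1 and b >= 1, so that one more game can be played.
    """
    now = model(a, b, p)
    expected_later = p * model(a - 1, b, p) + (1 - p) * model(a, b - 1, p)
    return abs(now - expected_later)


print(time_inconsistency(consistent_model, 1, 2))  # no gap
print(time_inconsistency(naive_model, 1, 2))       # a clear gap
```

A nonzero gap means the model already expects to change its mind, so someone who knows that can bet against it today.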

You should be deeply embarrassed if your model outputs an obviously wrong or obviously time-inconsistent answer even in a hypothetical situation. You should be even more embarrassed if it gives such an answer to the actual situation.

The cycle isn’t bad. It’s good. It’s an excellent way to improve your model: build one, show it to someone, let them point out a mistake, figure out how it happened, fix it, and repeat. In the meantime, you can still use the model’s answers to supplement your intuitions, as a sanity check, a very rough approximation, or a jumping-off point. But until the cycle is over, don’t pretend you have anything more than that.