Statistical models & the irrelevance of rare exceptions

“I don’t care about this instance—I don’t care about any instances! Life is too short to care about anything but the general case!”

Yes, the general case is drawn from instances. My point is that we shouldn’t get caught up in the details unless they really matter, and when there is a clear statistical generalization, the details matter very little.

A common model might be: “Well, the pattern is that it’s some mix of A & B, with more A as factors X & Y are higher, and more B if they are lower”.

Replying “Z fits better than Y in the mix of variables to map onto the A & B mix” is a correction within the general-case frame. By contrast, pointing out “Here are cases F & G that are high A with low X & Y, contradicting your model” is irrelevant when such cases are very rare. I find it’s usually obvious from the description of F & G that they’re extremely rare.

Rare exceptions are irrelevant because almost all models of the real world (not physics) are statistical claims about what’s usually true, not absolute claims about what’s always true. So a rare data point that doesn’t fit is actually *not* a contradiction! Pointing such cases out is just reiterating the tautological fact that statistical models are not absolute, which seems like a total waste of time to me. (Especially if the speaker agrees with the model!)
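To make the distinction concrete, here is a minimal sketch contrasting an absolute claim with a statistical one. The function names, the 95% threshold, and the synthetic list of observations are all hypothetical choices made for illustration, not anything the argument depends on.

```python
def holds_absolutely(fits):
    """Absolute claim: the pattern must hold in every single case."""
    return all(fits)


def holds_statistically(fits, threshold=0.95):
    """Statistical claim: the pattern holds in at least `threshold` of cases.

    The 0.95 default is an arbitrary illustrative cutoff.
    """
    return sum(fits) / len(fits) >= threshold


# Synthetic data: 98 cases that fit the pattern and 2 rare exceptions.
observations = [True] * 98 + [False] * 2

print(holds_absolutely(observations))     # the two exceptions break this
print(holds_statistically(observations))  # the two exceptions do not break this
```

The two exceptions refute only the absolute reading of the claim, which nobody was asserting in the first place.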

I’ve noticed this happens more often with careful, intellectually humble thinkers, who often include caveats of the form “But here’s an exception to this strong model I’ve just presented.” I think they’re often trying to preempt others pointing out this case. But to me, this wrongly accepts the frame that rare exceptions are relevant criticisms or corrections of a statistical model.

So, rather than defending by acknowledging the rare case, I think it’s far better to break the false frame that rare exceptions are counterexamples that need to be addressed. Move into the correct frame that this is a statistical model, say by asking whether they actually disagree that the suggested pattern fits the vast majority of cases and is thus statistically true.

I’ve also lately noticed that while many people are abstract systematizers, far fewer are relentlessly meta-seeking, constantly trying to expand to more general theories over broader domains. E.g., I regularly find myself saying: “Hey, your model for domain S is actually a model for domain Q, where S is a subdomain of Q. Notice that everything we said about S applies equally to Q; we didn’t actually use any special features of S in our reasoning!”

I also recently noticed a weird but cool thing: my brain automatically generates statistical models without details ever coming into my mind. It’s like it has a “model this data set” function that retrieves and analyzes the data set without me having to consciously consider cases. Of course, I use cases (common ones!) to check the model afterwards. One thing I love about the LW community is that these cognitive modes I describe here are much more common than in most other circles.

Finally, I do find rare exceptions relevant when there is a pattern to them. Suppose that, rather than just pointing out rare exceptions F & G, the responder generalizes them into a subclass H. Now we can make the general model more accurate by adding that it doesn’t apply to H. This “move” still falls within the true statistical frame.

As an example for this topic, note that in extreme distributions like power laws, a “rare exception” that happens at 1% frequency could have 100x the intensity of the other 99%, and so would need to be weighted about equally in a model. The generalization of this exception is that the more extreme the distribution, the rarer a rare case has to be to be irrelevant. This generalized exception now improves our model of statistical models.
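The arithmetic behind that 1% versus 99% comparison can be sketched in a few lines. This is only an illustration under simplifying assumptions: every common case is given an intensity of 1, every rare case a single fixed intensity, and the function name `contribution_shares` is hypothetical.

```python
def contribution_shares(rare_freq, rare_intensity, common_intensity=1.0):
    """Return (rare_share, common_share) of the total frequency-weighted intensity.

    Assumes just two groups of cases, each with one fixed intensity.
    """
    rare_total = rare_freq * rare_intensity
    common_total = (1.0 - rare_freq) * common_intensity
    total = rare_total + common_total
    return rare_total / total, common_total / total


# The case from the text: 1% of cases at 100x intensity.
rare_share, common_share = contribution_shares(0.01, 100.0)
print(f"rare: {rare_share:.3f}, common: {common_share:.3f}")

# For comparison: 1% of cases at ordinary intensity.
mild_share, _ = contribution_shares(0.01, 1.0)
print(f"rare share without extreme intensity: {mild_share:.3f}")
```

In the first case the rare 1% contributes 0.01 × 100 = 1.0 against 0.99 × 1 = 0.99 for everything else, so each group carries roughly half of the total. In the second, the rare cases carry only about 1%, which is when ignoring them is safe.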