Yes, It’s Subjective, But Why All The Crabs?


Nature really loves to evolve crabs.


Some early biologist, equipped with knowledge of evolution but not much else, might see all these crabs and expect a common ancestral lineage. That’s the obvious explanation of the similarity, after all: if the crabs descended from a common ancestor, then of course we’d expect them to be pretty similar.

… but then our hypothetical biologist might start to notice surprisingly deep differences between all these crabs. The smoking gun, of course, would come with genetic sequencing: if the crabs’ physiological similarity is achieved by totally different genetic means, or if functionally-irrelevant mutations differ across crab-species by more than mutational noise would induce over the hypothesized evolutionary timescale, then we’d have to conclude that the crabs had different lineages. (In fact, historically, people apparently figured out that crabs have different lineages long before sequencing came along.)

Now, once we accept that the crabs have very different lineages, the differences are basically explained. If the crabs all descended from very different lineages, then of course we’d expect them to be very different.

… but then our hypothetical biologist returns to the original empirical fact: all these crabs sure are very similar in form. If the crabs all descended from totally different lineages, then the convergent form is a huge empirical surprise! The differences between the crabs have ceased to be an interesting puzzle (they’re explained), but now the similarities are the interesting puzzle. What caused the convergence?

To summarize: if we imagine that the crabs are all closely related, then any deep differences are a surprising empirical fact, and are the main remaining thing our model needs to explain. But once we accept that the crabs are not closely related, then any convergence/similarity is a surprising empirical fact, and is the main remaining thing our model needs to explain.


A common starting point for thinking about “What are agents?” is Dennett’s intentional stance:

Here is how it works: first you decide to treat the object whose behavior is to be predicted as a rational agent; then you figure out what beliefs that agent ought to have, given its place in the world and its purpose. Then you figure out what desires it ought to have, on the same considerations, and finally you predict that this rational agent will act to further its goals in the light of its beliefs. A little practical reasoning from the chosen set of beliefs and desires will in most instances yield a decision about what the agent ought to do; that is what you predict the agent will do.

— Daniel Dennett, The Intentional Stance, p. 17

One of the main interesting features of the intentional stance is that it hypothesizes subjective agency: I model a system as agentic, and you and I might model different systems as agentic.

Compared to a starting point which treats agency as objective, the intentional stance neatly explains many empirical facts—e.g. different people model different things as agents at different times. Sometimes I model other people as planning to achieve goals in the world, sometimes I model them as following set scripts, and you and I might differ in which way we’re modeling any given person at any given time. If agency is subjective, then the differences are basically explained.

… but then we’re faced with a surprising empirical fact: there’s a remarkable degree of convergence among which things people do-or-don’t model as agentic at which times. Humans yes, rocks no. Even among cases where people disagree, there are certain kinds of arguments/evidence which people generally agree update in a certain direction; e.g. “Bottlecaps Aren’t Optimizers” seems to make a largely-right kind of argument, even if one found its main claim surprising. If agency is fundamentally subjective, then the convergence is a huge empirical surprise! The differences between which-things-people-treat-as-agents have ceased to be an interesting puzzle (they’re explained), but now the similarities are the interesting puzzle. What caused the convergence?

Metaphorically: if agency is subjective, then why all the crabs? What’s causing so much convergence?

And, to be clear, I don’t mean this as an argument against the intentional stance. The intentional stance seems basically correct, at a fundamental level: “agent” is a pretty high-level abstraction, and high-level abstractions “live in the mind, not in the territory” as Jaynes would say. So the intentional stance is as solid a foundation as one could hope for. But the crabs question indicates that the intentional stance, though correct, is incomplete as an answer to the “What are agents?” question: once we accept subjectivity, the main remaining thing our model needs to explain is the degree of convergence. When and why do people agree about agency? What’s the underlying generator of that agreement?


We’re all Bayesian now, right?

The big difference between (the most common flavor of) Bayesianism and its main historical competitor, frequentism, is that Bayesianism interprets probabilities as subjective. Different people have different models, and different information, and therefore different (usually implicit) probabilities. There’s not fundamentally an objective “probability that the die comes up 2”. Frequentism, on the other hand, would say that probabilities only make sense as frequencies in experiments which can be independently repeated many times—e.g. one can independently roll a die many times, and track the frequency at which it comes up 2, yielding a probability.
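As a rough illustration of the contrast, here is a minimal Python sketch. Everything in it is an assumption made for illustration rather than anything from the argument above: the function names, the fixed random seed, and especially the choice of Beta priors to stand in for “different people have different models”. The first function computes a frequentist-style estimate from repeated simulated rolls; the second shows two hypothetical observers who see the same rolls but still end up with different probabilities, because they started from different priors.

```python
import random
from fractions import Fraction


def frequency_of_two(num_rolls, seed=0):
    """Frequentist-style estimate: roll a simulated fair die many times
    and report the fraction of rolls that came up 2."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 2)
    return hits / num_rolls


def posterior_mean(prior_hits, prior_misses, hits, misses):
    """Subjective-Bayesian estimate: mean of the Beta posterior obtained by
    updating a Beta(prior_hits, prior_misses) prior on the observed rolls.
    The Beta prior is an illustrative assumption, not the only valid choice."""
    return Fraction(prior_hits + hits,
                    prior_hits + prior_misses + hits + misses)


# Two hypothetical observers see the same data: 2 twos in 12 rolls.
alice = posterior_mean(1, 5, hits=2, misses=10)  # prior centered on 1/6
bob = posterior_mean(5, 1, hits=2, misses=10)    # prior centered on 5/6

print(frequency_of_two(60_000))  # close to 1/6 for a fair simulated die
print(alice, bob)                # same data, different probabilities
```

With only twelve rolls the two observers still disagree substantially; nothing in the update rule itself forces them to have started from similar priors.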

Compared to frequentism, Bayesianism does resolve a lot of puzzles. In analogy to the intentional stance, it also plays a lot better with reductionism: we don’t need a kinda-circular notion of “independent experiment”, and we don’t need to worry about what it means that an experiment “could be” repeated many times even if it actually isn’t.

But we’ve resolved those puzzles in large part by introducing subjectivity, so we have to ask: why all the crabs?

Empirically, there is an awful lot of convergence in (usually implicit) probabilities. Six-sided dice sure do come up 2 approximately ⅙ of the time, and people seem to broadly agree about that. Also, there are all sorts of “weird” Bayesian models—e.g. anti-inductive models, which say that the more often something has happened before, the less likely it is to happen again, and vice-versa. (“The sun definitely won’t come up tomorrow, because it’s come up so many times before.” “But you say that every day, and you’re always wrong!” “Exactly. I’ve never been right before, so I’ll near-certainly be right this time.”) Under the rules of subjective Bayesianism alone, those are totally valid models! … Yet somehow, people seem to converge on models which aren’t anti-inductive.
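To make the anti-inductive example concrete, here is a minimal Python sketch. The specific formulas are assumptions chosen for illustration: the inductive predictor uses Laplace’s rule of succession, the anti-inductive predictor is simply its mirror image, and log loss is used as one conventional way to score predictions. Both predictors always output a number strictly between 0 and 1, so both are coherent probability assignments; the sketch only shows how differently they fare on a long run of sunrises, and is not meant as an answer to why people converge on the inductive kind.

```python
import math


def inductive(successes, trials):
    """Laplace's rule of succession: the more often the event has
    happened, the more probable it is judged to happen next time."""
    return (successes + 1) / (trials + 2)


def anti_inductive(successes, trials):
    """Mirror image of the rule above (a hypothetical construction): the
    more often the event has happened, the less probable it is judged."""
    return (trials - successes + 1) / (trials + 2)


def total_log_loss(predictor, outcomes):
    """Sum of -log(probability assigned to what actually happened)."""
    loss = 0.0
    successes = 0
    for trials, outcome in enumerate(outcomes):
        p = predictor(successes, trials)
        loss += -math.log(p if outcome else 1 - p)
        successes += outcome  # True counts as 1, False as 0
    return loss


sunrises = [True] * 1000  # the sun comes up every single day

print(total_log_loss(inductive, sunrises))       # small: log(1001)
print(total_log_loss(anti_inductive, sunrises))  # very large
```

After 1000 sunrises the anti-inductive predictor assigns probability 1/1002 to the next one, which is exactly the “I’ve never been right before, so I’ll near-certainly be right this time” position in the dialogue above.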

Much like the intentional stance, (subjective) Bayesianism seems roughly correct (modulo some hedging about approximation and domain of applicability). But it’s incomplete—it doesn’t explain the empirically-high degree of convergence. When and why do people agree about probabilities (or, at a meta level, the models in which those probabilities live)? What’s the underlying generator of that agreement?


It’s a longstanding empirical observation that different cultures sometimes use different ontologies. Eskimos have like a gajillion words for snow, Polynesian navigators had some weird-to-us system for thinking about the stars, etc. So, claim: ontologies are subjective.

(In this case there’s a whole historical/political angle, to a much greater degree than the previous examples. The narrative goes that bad imperial colonizers would see the different ontologies of other cultures, decide that the ontologies of those “primitive” cultures were clearly “irrational”, and then “help” them by imposing the imperial colonizer’s ontology. And therefore today, any claim that ontologies might not be entirely subjective, or especially any claim that one ontology might be better than another, is in certain circles met with cries of alarm and occasionally attempts at modern academia’s equivalent of exorcism.)

Similar to the intentional stance, there’s a sense in which ontologies are obviously subjective. They live in the mind, not the territory. The claim of subjectivity sure does seem correct, in at least that sense.

But why all the crabs?

Even more so than agents or probability, convergence is the central empirical fact about ontologies. We’re in a very high-dimensional world, yet humans are able to successfully communicate at all. That requires an extremely high degree of everyday ontological convergence: every time you say something to me, and I’m not just totally confused or misunderstanding you completely, we’ve managed to converge on ontology. (And babies are somehow able to basically figure out what words mean with only like 1-3 examples, so the convergence isn’t by brute force of exchanging examples with other humans; that would require massively more examples. Most of the convergence has to happen via unsupervised learning.)

So modeling ontologies as subjective is correct in a fundamental sense, but it’s incomplete—it doesn’t explain the empirically-high degree of convergence. When and why do people converge on the same ontology? What’s the underlying generator of that convergence?


I find this pattern comes up all the time when discussing philosophy-adjacent questions. The pattern goes like this:

  • Question: What is X? (e.g. agency, probability, etc.)

  • Response: Well, that’s subjective.

  • Counter-response: Yeah, that seems basically-true in at least a reductive sense; X lives in the mind, not the territory. But then why all the crabs? Why so much convergence in people’s notions of X?

… and then usually the answer is “well, there’s a certain structure out in the world which people recognize as X, because recognizing it as X is convergently instrumental for a wide variety of goals”, so we want to characterize that structure. But that’s beyond the scope of this post; this post is about the question, not the answer.


To wrap up, two gotchas to look out for.

First, convergence is largely an empirical question, and it might turn out that there’s surprisingly little convergence. I think “consciousness” is a likely example: there’s a sort of vague thematic convergence (it has to do with the mind, and maybe subjective experience, and morality?), but people in fact mean a whole slew of wildly different things when they talk about “consciousness”. (Critch has some informal empirical results illustrating what I mean.) “Human values” is, unfortunately, another plausible example.

Second, when empirically checking for convergence, bear in mind that when people try to give definitions for what they mean, they are almost always wrong. I don’t mean that people give definitions which disagree with the definitions other people use, or that people give definitions which disagree with some platonic standard; I mean people give definitions which are not very good descriptions of their own use of a word or concept. Humans do not have externally-legible introspective access to what we mean by a word or concept. So, don’t ask people to define their terms as a test of convergence. Instead, ask for examples.