Internal Information Cascades

Epistemic Status: This was written in 2010 and has existed in LW’s editing purgatory ever since. It doesn’t seem extremely wrong. (“Remember ‘light side epistemology’? Pepperidge Farm remembers.”) Perhaps the biggest flaw is that I didn’t hit the publication button sooner, and this meta-flaw is implicit in the perspective being described? Maybe the last 10 years would have gone better, but it is hard to reverse the action of publishing. Some links don’t work. I fixed a few typos. Perhaps link archiving will one day receive more systematic attention, and links here which once worked, and now do not, can be restored via some kind of archival reconciliation?

A classic information cascade is a situation where early decisions by individuals in a group bias the decisions of other people in the same group in the same direction. When people talk about information cascades they’re generally talking about “dumb herd behavior”.

If one person observes something and adjusts their behavior in a robust, justified way, then that behavior, once observed, is indirect evidence of their observations and reasoning. But if 10 more people update based on a shallow observation of the first person, their behavior is not actually 10 times more evidence. It would be far more informative if 11 people had all happened to think and act alike independently.
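To put toy numbers on that arithmetic (the 2:1 likelihood ratio below is my own illustrative assumption, not a figure from the cascade literature): evidence from genuinely independent observers multiplies, while imitators contribute almost nothing new.

```python
# Hedged sketch: how much evidence do 11 actors provide for hypothesis H?
# Assume each *independent* observation favors H with a 2:1 likelihood ratio,
# and that imitating someone else's behavior carries ~no additional evidence.

def posterior_odds(prior_odds, independent_observations, likelihood_ratio=2.0):
    """Multiply prior odds by the likelihood ratio of each independent observation."""
    return prior_odds * likelihood_ratio ** independent_observations

prior = 1.0  # 1:1 odds on H before watching anyone act

print(posterior_odds(prior, 11))  # 11 independent actors -> 2048:1 odds on H
print(posterior_odds(prior, 1))   # 1 observer + 10 imitators -> still ~2:1 odds on H
```

Under these toy numbers, the crowd of imitators looks like a mountain of evidence from the outside while contributing roughly one observation’s worth of information.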

The classic example of a multi-agent information cascade is probably an economic bubble, where early investment decisions are “irrationally” mimicked by later speculators, creating a temporary increase in demand for the investment. Purchasing behavior expands through a community like a wave, and the “genuine” opportunity is to buy ahead of the wave and “flip” the investment shortly afterwards to a “greater fool” who found out about the bubble later than you did. Aside from the obvious moral concerns, the selfish worry is that bubbles are basically self-organized Ponzi schemes where the last round of investors is the largest, and they lose a substantial amount of money to the early adopters who successfully executed the “flip” maneuver before the crash. The “average investor” has no easy way to tell that they are a late entrant who will lose money.

In this community, RichardKennaway called attention to a paper about classical cascades in academic citations, and Johnicholas wrote about similar potential in voting and karma on lesswrong.

The rest of this article will discuss a process similar to classical information cascades, except that it happens within the mind of a single person in a way that requires no mass delusion of any kind. The similarity between “internal information cascades” and the more traditional “external” information cascades arises from the fact that an initial idea non-obviously predetermines subsequent conclusions. Perhaps a way to think about it is that your own past behaviors and circumstances are not independent evidence, because they all had you in common.

Below are two rather different examples of the phenomenon. One involves moral behavior based on theories of human nature. The other involves economically rational skill acquisition. The post concludes with abstract observations and possible behavioral implications.

Expert Specialization As An Internal Information Cascade

In this essay I wanted to make sure to provide two examples of internal cascades, and I wanted one of the examples to be positive, because the critical feature of internal information cascades is not that they are “bad”. The critical feature is that “initial” prior beliefs can feed forward into evidence gathering processes with potentially dramatic effects on subsequent beliefs. The final beliefs that grow from internal information cascades are deeply justified for the agent that believes them, in the sense that they explain the agent’s sensory expectations… the thing these beliefs lack is the property of predicting the sensory data that other agents should expect (unless, perhaps, those other agents start experiencing the world via the behavioral patterns and environmental contexts consistent with the belief’s implied actions, performed long enough and skillfully enough).

The easiest way I can think to explain this is with the example of a friend of mine who adjusted her life in her 40s. She’d been a programmer for years and had a sort of aura of wisdom about technology and consulting and startups. Some of my jobs in the past have involved computers, and it was uncanny the way she could predict what was going on in my jobs from snippets of stories about work. The thing is, she’s a nurse now, because she got sick of programming and decided to change careers. The career change was a doozy in terms of personal finance, but the real cost wasn’t the nursing school tuition… the really expensive part was the opportunity cost of not utilizing her programming skills while learning to be a nurse.

Something to notice here is how the pop culture “ten thousand hours till expertise” rule implies that you can work on almost anything and get progressively better and better at it. There is also a substantial academic literature on “expertise”, but the key point in terms of internal information cascades is that people get better at whatever they do, not at what they don’t do. My friend is a counterexample in that she escaped her internal information cascade (she successfully changed contexts), but the costs she bore in doing so highlight the natural barriers to novel skill acquisition: the more useful your skills in any domain become, the larger the opportunity cost of subsequent skill acquisition in other areas.

Economists are very big on the benefits of specialization, and point to the real benefits of trade as, in part, simply enabling expertise to develop so that people can do something enough times to become really good at it. From Adam Smith’s theories about people to individual bees gaining productivity from flower specialization, the economic value of specialization seems relatively clear¹.

So while gaining expertise is generally a good thing, keep in mind that after you become an expert your world model will gain ridiculously high resolution (compared to non-experts) in your area of expertise, while other areas of your world model will have much less resolution. The high resolution will tend to become ever more pronounced over time, so the cascade here is primarily in the “quantity” of beliefs within a domain, though the quality of the beliefs might also go up. Despite the benefits, the process may bring along certain “free riding beliefs” if you generate the occasional motivated belief at the edges. Perhaps beliefs justifying the value of your expertise (which is probably real) via processes that don’t track the truth for everyone? When people from different professions interact in an area outside either of their expertise you’ll see some of this, and the biases revealed in this way can make for amusing jokes about “The mathematician, the physicist, and the engineer...”

Uncooperative Misanthropy As An Internal Information Cascade

Game theory has been inspiring psychological studies for many decades now. Kelley & Stahelski were early researchers in this area who proposed the “Triangle Hypothesis”, which states “that competitors hold homogeneous views of others by assuming that most others are competitive, whereas cooperators or pro-social people hold more heterogeneous views by assuming that others are either cooperative or competitive” (source).

To vividly understand the triangle hypothesis, imagine that you’re a participant in a study on actual *human performance* in an iterated prisoner’s dilemma (especially in the 1960s and 1970s, before the prisoner’s dilemma paradigm had diffused into popular culture). The standard “tit for tat” strategy involves cooperating on your first move and thereafter simply mirroring the other person’s behavior back at them. It is a very simple strategy that frequently works well. Suppose, however, you defected initially, just to see what would happen? When the other person defected on the next turn (as a natural response), there is a significant chance that repeated retaliation would be the outcome. It is a tragic fact that retaliatory cycles are sometimes actually observed with real human participants, even in iterated prisoner’s dilemmas where the structure of the game should push people into cooperation.
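Here is a minimal sketch (my own illustration in Python, not code from any of the cited studies) of how a single exploratory defection against a tit-for-tat partner locks two otherwise cooperative strategies into an endless cycle of retaliation:

```python
# Toy iterated prisoner's dilemma: both players use tit-for-tat, but player A
# "experiments" with an opening defection. Payoffs are omitted because the
# point is the pattern of moves, not the score.

def play(rounds=8, opening_move_a="D"):
    history_a, history_b = [], []
    for t in range(rounds):
        # Tit-for-tat: cooperate at first, then mirror the opponent's last move.
        move_a = opening_move_a if t == 0 else history_b[-1]
        move_b = "C" if t == 0 else history_a[-1]
        history_a.append(move_a)
        history_b.append(move_b)
    return history_a, history_b

a, b = play()
print("A:", " ".join(a))  # D C D C D C D C
print("B:", " ".join(b))  # C D C D C D C D
```

Flip the opening move to “C” and the same pair cooperates on every round; the entire trajectory is determined by that first choice.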

Supposing you found yourself in a retaliatory cycle at the beginning of the study, then a grumpy mood from your first session could lead to another retaliatory cycle with a second partner. At this point you might start to wonder whether everyone in this experiment was just automatically hostile. Perhaps the world is just generally full of assholes and idiots? The more you believe something like this, the more likely you are to preemptively defect in your subsequent sessions with other experimental participants. Your preemptive cheating will lead to more conflict, thereby confirming your hypothesis. At the end of the experiment you’ll have built up an impressive body of evidence supporting the theory that the world is full of evil idiots.
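A hedged toy model of that feedback loop (the hostility rate, learning rule, and probabilities below are my own illustrative assumptions, not parameters from the studies discussed here): an agent who defects preemptively in proportion to its suspicion ends up manufacturing most of the conflict it then counts as evidence.

```python
import random

# Toy model of a self-confirming hostility belief. All numbers are
# illustrative assumptions, not taken from the cited studies.

TRUE_HOSTILE_RATE = 0.2  # actual fraction of genuinely hostile partners

def run_sessions(initial_belief, preemptive, sessions=200, learning_rate=0.1):
    """Track a naive running estimate of "how hostile are my partners?".

    If `preemptive` is True, the agent defects with probability equal to its
    current belief, and a defected-against partner retaliates, so the agent's
    own behavior feeds back into the "evidence" it updates on.
    """
    belief = initial_belief
    for _ in range(sessions):
        partner_hostile = random.random() < TRUE_HOSTILE_RATE
        i_defect = preemptive and random.random() < belief
        conflict = partner_hostile or i_defect
        belief += learning_rate * ((1.0 if conflict else 0.0) - belief)
    return belief

print(round(run_sessions(0.3, preemptive=False), 2))  # hovers near the true 0.2
print(round(run_sessions(0.3, preemptive=True), 2))   # typically drifts toward 1.0
```

The only difference between the two runs is whether the belief gets to act on the world before the next observation arrives.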

In 1974, Braver confirmed that the perceived intentions of partners in game-theoretic contexts were frequently more important for predicting behavior than a subject’s personal payoff matrix, and in 1978 Goldman tried pre-sorting participants into predicted cooperators and defectors and found (basically as predicted) that defectors tended not to notice opportunities to cooperate even when those opportunities actually existed.

Consider the tragedy here: people can update on the evidence all they want, but initial social hypotheses can still channel them into social dynamics where they generate the evidence necessary to confirm their prior beliefs, even when those beliefs lead to suboptimal results. This is especially worrisome in light of the way the first few pieces of evidence can be acquired based on guesses growing out of marginally related environmental noise in the moments before learning “formally starts”.

Summarizing The Concept And Its Implications

First, here are four broad observations about internal information cascades:

  1. Internal information cascades happen within a single mind, by definition, with no intermediate step requiring the agreement of others. I was tempted to just call them single-agent information cascades. They probably even happen in the minds of solitary agents, as with skill specialization for a survivor marooned on a desert island who repeats their relatively random early successes in food acquisition. I have a pet hypothesis that isolation-induced, unusual internal information cascades around academic subjects are part of the story for autodidacts who break into academic respectability, like Ramanujan and Jane Jacobs.

  2. The beliefs that result from internal information cascades may be either good or bad for an individual or for the group they are a part of. In the broader scheme of things (taking into account the finite channel capacity of humans trying to learn, and the trivial ease of barter) it would probably be objectively bad for everyone in the world to attempt to acquire precisely the same set of beliefs about everything. Nonetheless, at least one example already exists (the triangle hypothesis mentioned above) of misanthropic beliefs cascading into anti-social behavioral implications that reinforce the misanthropy. To be precise, it is the misanthropy that almost certainly deserves the clear negative emotional valence, not the cascade as such.

  3. Internal information cascades put one in a curious state of “Bayesian sin” because one’s prior beliefs “contaminate” the evidence stream from which one forms posterior beliefs. There is a sense in which they may be inescapable for people running on meat brains, because our “prior beliefs” are in some sense a part of the “external world” that we face moment to moment. Perhaps there are people who can update their entire belief network instantaneously? Perhaps computer-based minds will be able to do this in the future? But I don’t seem to have this ability. And despite the inescapability of this “Bayesian sin”, the curiosity is that such cascades can sometimes be beneficial, which is a weird result from a sinful situation...

  4. Beliefs formed in the course of internal information cascades appear to have very weird “universality” properties. Generally “rationality”, “reason”, and “science” are supposed to have the property of universality because, in theory, people should all converge to the same conclusions by application of “reason”. However, using a pragmatic interpretation of “the universe” as the generator of sense data that one can feasibly access, it may be the case that some pairs of people find themselves facing “incommensurate data sources”. This would happen when someone is incapable of acting “as if” they believed what another person believes because the inferential distance is too great. The radically different sensory environments of two people deep into their own internal information cascades may pose substantial barriers to coordination and communication, especially if those barriers are not recognized and addressed by at least one of the parties.

Second, in terms of behavioral implications, I honestly don’t have any properly validated insights for how to act in light of internal information cascades. Experimentally tested advice for applying these ideas in predictably beneficial ways would be great, but I’m just not aware of any. Here are some practical “suggestions” to take with a grain of salt:

  1. Pay a lot of attention to epistemic reversibility. Before I perform an experiment like (1) taking a drug, (2) hanging out with a “community of belief”, or (3) taking a major public stand on an issue, I try to look further into the process to see how hard it is for people to get out later on. A version of rationality that can teach one to regret the adoption of that version of rationality is one that has my respect, and that I’m interested in trying out. If none of your apostates are awesome, maybe your curriculum is bad? For a practical application, my part in this comment sequence was aimed at testing Phillip Eby’s self-help techniques for reversibility before I jumped into experimenting with them.

  2. Try to start with social hypotheses that explain situations by attributing inadequacy and vice to “me” and wisdom and virtue to “others” (but keep an exit option in your back pocket in case you’re wrong). The triangle hypothesis was my first vivid exposure (I think I learned about it in 2003 or so) to the idea that internal information cascades can massively disrupt opportunities to cooperate with people, and I have personally found it helpful to keep in mind. While hypotheses that are self-insulting and other-complimenting are not necessarily the place that “perfectly calibrated priors” would always start out, I’m not personally trying to be perfectly calibrated at every possible moment. My bigger-picture goal is to have a life I’ll look back on, from the near and far future, as rewarding and positive taken as an integrated whole. The idea is that if my initial working hypothesis about being at fault and able to learn from someone is wrong, then the downside isn’t that bad and I’ll be motivated to change, but if I’m right then I may avoid falling into the “epistemic pit” of a low-value internal information cascade. I suspect that many traditional moral injunctions work partly to help here, keeping people out of dynamics from which they are unlikely to escape without substantial effort, and in which they may become trapped because they won’t even notice that their situation could be otherwise.

  3. Internal information cascades generally make the world that you’re dealing with more tractable. For example, if you think that everyone around you is constantly trying to rip you off and/or sue you, you can take advantage of this “homogeneity in the world” by retaining a good lawyer for the kind of lawsuits you find yourself in over and over. I was once engaged in a “walk and talk” and the guy I was with asked out of the blue whether I had noticed anything weird about the people we’d passed. Apparently, a lot of people had been smiling at us, but people don’t smile at strangers (or at my conversational partner?) very much. I make a point of trying to smile, make eye contact, and wave any time I pass someone on a sparsely populated street, but I hadn’t realized that this caused me to be out of calibration with respect to the generically perceivable incidence of friendly strangers. I was tempted to not even mention it here? Practicing skills to gain expertise in neighborliness seems like an obvious application of the principle, but maybe it is complicated.

  4. Another use for the concept involves “world biasing actions” as a useful way of modeling “dark-side epistemology”. My little smile-and-waves to proximate strangers were one sort of world biasing action. Threatening someone that you’ll call your lawyer is another sort. In both cases they are actions that are potentially part of internal information cascades. I think one reason (in the causal explanation sense) that people become emotionally committed to dark-side epistemology is that some nominally false beliefs really do imply personally helpful world biasing actions. When people can see these outcomes without mechanistically understanding where the positive results come from, they may (rightly) think that loss of “the belief” might remove life benefits that they are, in actual fact, deriving from the belief. My guess is that a “strict light side epistemologist” would argue that false beliefs are false beliefs are false beliefs, and the correct thing to do is (1) engage in positive world biasing actions anyway, (2) while refraining from negative world biasing actions, (3) knowing full well that reality probably isn’t the way the actions “seem to assume”. Personally, I think strict light side epistemology gives humans more credit for mindfulness than we really have. On the one hand I don’t want to believe crazy stuff, but on the other hand I don’t want to have to remember five facts and derive elaborate consequences from them just to say hi to someone. I have found it helpful to sometimes just call the theory that endorses my preferred plan my “operating hypothesis” without worrying about the priors or what I “really believe”. It’s easier for me to think about theories held at a distance, and their implications, than to “truly believe” one thing and “do” another. In any case, internal information cascades help me think about the questions of “practically useful beliefs” and world biasing actions in a more mechanistic and conceptually productive way.

Notes:

¹ While hunting down the link about bee specialization, I found an anomalous species of ants where nothing seemed to be gained from specialization. I’m not sure what to make of this, but it seemed irresponsible to suppress the surprise.