I’ve come to disagree with all of these perspectives.
I’m not sure whether your model actually differs substantially from mine. :-) Or at least not from the version of my model articulated in “Subagents, akrasia, and coherence in humans”. Compare you:
I’ve previously quoted Stephan Guyenet on the motivational system of lampreys (a simple fish used as a model organism). Guyenet describes various brain regions making “bids” to the basal ganglia, using dopamine as the “currency”—whichever brain region makes the highest bid gets to determine the lamprey’s next action. “If there’s a predator nearby”, he writes, “the flee-predator region will put in a very strong bid to the striatum”.
The economic metaphor here is cute, but the predictive coding community uses a different one: they describe it as representing the “confidence” or “level of evidence” for a specific calculation. So an alternate way to think about lampreys is that the flee-predator region is saying “I have VERY VERY strong evidence that fleeing a predator would be the best thing to do right now.” Other regions submit their own evidence for their preferred tasks, and the basal ganglia weighs the evidence using Bayes and flees the predator.
with me:
One model (e.g. Redgrave 2007, McHaffie 2005) is that the basal ganglia receives inputs from many different brain systems; each of those systems can send different “bids” supporting or opposing a specific course of action to the basal ganglia. A bid submitted by one subsystem may, through looped connections going back from the basal ganglia, inhibit other subsystems, until one of the proposed actions becomes sufficiently dominant to be taken. [...]
Some subsystems having concerns (e.g. immediate survival) which are ranked more highly than others (e.g. creative exploration) means that the decision-making process ends up carrying out an implicit expected utility calculation. The strengths of bids submitted by different systems do not just reflect the probability that those subsystems put on an action being the most beneficial. There are also different mechanisms giving the bids from different subsystems varying amounts of weight, depending on how important the concerns represented by that subsystem happen to be in that situation. This ends up doing something like weighting the probabilities by utility, with the kinds of utility calculations that are chosen by evolution and culture in a way to maximize genetic fitness on average. Protectors, of course, are subsystems whose bids are weighted particularly strongly, since the system puts high utility on avoiding the kinds of outcomes they are trying to avoid.
The original question which motivated this section was: why are we sometimes incapable of adopting a new habit or abandoning an old one, despite knowing that to be a good idea? And the answer is: because we don’t know that such a change would be a good idea. Rather, some subsystems think that it would be a good idea, but other subsystems remain unconvinced. Thus the system’s overall judgment is that the old behavior should be maintained.
You don’t explicitly talk about the utility side, just the probability, but if the flee-predator region says its proposed course of action is “the best thing to do right now”, then that sounds like there’s some kind of utility calculation also going on. On the other hand, I didn’t think of the “dopamine represents the strength of the bid” hypothesis, but combining that with my model doesn’t produce any issues as far as I can see.
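To spell out what an implicit expected-utility calculation of this kind could look like computationally, here is a minimal sketch (the subsystem names and numbers are hypothetical, mine rather than either post's): each bid is a subsystem's probability estimate scaled by a situational weight standing in for how much the system cares about that subsystem's concerns right now, and picking the strongest bid then amounts to choosing by probability times utility.

```python
# Minimal sketch with made-up numbers: each subsystem submits a bid equal to
# P(its proposed action is best) times a situational weight (a stand-in for how
# much utility the overall system places on that subsystem's concerns right now).
# Selecting the highest bid is then an implicit expected-utility calculation.

bids = {
    # subsystem: (probability its proposed action is best, situational utility weight)
    "flee_predator":        (0.30, 10.0),  # survival concerns weighted heavily
    "continue_foraging":    (0.60, 1.0),
    "creative_exploration": (0.80, 0.2),
}

def bid_strength(prob, weight):
    """Strength of a bid: probability scaled by the situational utility weight."""
    return prob * weight

winner = max(bids, key=lambda name: bid_strength(*bids[name]))
print(winner)  # -> flee_predator: weaker evidence, but much higher stakes
```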
Expanding a bit on this correspondence: I think a key idea Scott is missing in the post is that a lot of things are mathematically identical to “agents”, “markets”, etc. These are not exclusive categories; the brain using an internal market does not, for example, mean it’s not using Bayes’ rule. Internal markets are a way to implement things like (Bayesian) maximum a posteriori estimates; they’re a very general algorithmic technique, often found in the guise of Lagrange multipliers (historically called “shadow prices” for good reason) or intermediates in backpropagation. Similar considerations apply to “agents”. See also the correspondence between prediction markets of Kelly bettors and Bayesian updating.
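To make that last correspondence concrete, here is a minimal sketch of the standard result (a generic textbook setup, not code from either post): traders with fixed beliefs each bet their full Kelly fraction on a binary outcome every round, and their wealth shares then evolve exactly like a Bayesian posterior over “which trader’s model is right”, with the initial wealth distribution playing the role of the prior.

```python
# Sketch of the Kelly-bettor / Bayes correspondence (standard result, generic
# setup): binary outcomes, each trader allocates its wealth across "yes"/"no"
# in proportion to its own probability (the Kelly allocation). Wealth shares
# then track the Bayesian posterior over traders, with initial wealth as prior.
import numpy as np

rng = np.random.default_rng(0)
beliefs = np.array([0.2, 0.5, 0.8])   # each trader's fixed P(outcome = 1)
wealth = np.array([1.0, 1.0, 1.0])    # equal starting wealth = uniform prior
posterior = wealth / wealth.sum()     # explicit Bayesian posterior over traders

true_p = 0.8
for _ in range(50):
    outcome = rng.random() < true_p
    likelihood = beliefs if outcome else 1 - beliefs

    # Market: the price is the wealth-weighted belief; Kelly payoffs rescale each
    # trader's wealth by likelihood / price, which conserves total wealth.
    price = (wealth * beliefs).sum() / wealth.sum()
    wealth = wealth * likelihood / (price if outcome else 1 - price)

    # Bayes: multiply the prior by the likelihood and renormalize.
    posterior = posterior * likelihood
    posterior /= posterior.sum()

print(wealth / wealth.sum())  # wealth shares...
print(posterior)              # ...match the Bayesian posterior
```

The division by the market price is just a normalization that keeps total wealth constant; the relative wealth updates are exactly Bayes’ rule, which is the correspondence in question.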
Also, you:

I think this theory matches my internal experience when I’m struggling to exert willpower. My intellectual/logical brain processes have some evidence for doing something (“knowing how the education system works, it’s important to do homework so I can get into a good college and get the job I want”). My reinforcement-learner/instinctual brain processes have some opposing argument (“doing your homework has never felt reinforcing in the past, but playing computer games has felt really reinforcing!”). These two processes fight it out. If one of them gets stronger (for example, my teacher says I have to do the homework tomorrow or fail the class) it will have more “evidence” for its view and win out.
It also explains an otherwise odd feature of willpower: sufficient evidence doesn’t necessarily make you do something, but overwhelming evidence sometimes does. For example, many alcoholics know that they need to quit alcohol, but find they can’t. They only succeed after they “hit bottom”, ie things go so bad that the evidence against using alcohol gets “beyond a reasonable doubt”. Alcoholism involves some imbalance in brain regions such that the reinforcing effect of alcohol is abnormally strong. The reinforcement system is always more convinced in favor of alcohol than the intellectual system is convinced against it—until the intellectual evidence becomes disproportionately strong even more than the degree to which the reinforcement system is disproportionately strong.
Me, later in my post:
Note those last sentences: besides the subsystems making their own predictions, there might also be a meta-learning system keeping track of which other subsystems tend to make the most accurate predictions in each situation, giving extra weight to the bids of the subsystem which has tended to perform the best in that situation. We’ll come back to that in future posts.
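Purely as an illustration of that meta-learning idea, here is a minimal sketch using standard weighted-experts bookkeeping (my construction, not a claim about the actual neural mechanism): keep a weight per subsystem, shrink it when that subsystem's predictions turn out inaccurate, and use the surviving weights to scale future bids.

```python
# Minimal weighted-experts sketch (illustrative only): a meta-level tracker keeps
# a weight per subsystem, shrinks it in proportion to prediction error, and the
# normalized weights determine how much say each subsystem gets in future bids.

subsystem_weights = {"habit": 1.0, "planner": 1.0}
LEARNING_RATE = 0.5

def update_weights(predictions, observed):
    """Multiplicative-weights update: shrink each subsystem's weight
    according to its squared prediction error on the observed outcome."""
    for name, predicted in predictions.items():
        error = (predicted - observed) ** 2
        subsystem_weights[name] *= (1 - LEARNING_RATE) ** error

# Hypothetical episode: both subsystems predict how rewarding an action will be,
# and the outcome turns out much closer to the planner's prediction.
predictions = {"habit": 0.9, "planner": 0.3}
update_weights(predictions, observed=0.2)

total = sum(subsystem_weights.values())
print({name: w / total for name, w in subsystem_weights.items()})
# The planner, having predicted more accurately, now gets more weight on its bids.
```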
This seems compatible with my experience in that I feel like it’s possible for me to change even entrenched habits relatively quickly—assuming that the new habit really is unambiguously better. In that case, while I might forget and lapse back into the old habit a few times, there’s still a rapid feedback loop which quickly indicates that the goal-directed system is simply right about the new habit being better.
Or the behavior in question might be sufficiently complex, and I sufficiently inexperienced at it, that the goal-directed (default planning) subagent has always mostly remained in control of it. In that case change is again easy, since there is no strong habitual pattern to override.
In contrast, in cases where it’s hard to establish a new behavior, there tends to be some kind of genuine uncertainty:
The benefits of the old behavior have been validated in the form of direct experience (e.g. unhealthy food that tastes good has in fact tasted good each time), whereas the benefits of the new behavior come from a less trusted information source which is harder to validate (e.g. I’ve read scientific studies about the long-term health risks of this food).
Immediate vs. long-term rewards: the more remote the rewards, the larger the risk that they will for some reason never materialize.
High vs. low variance: sometimes when I’m bored, looking at my phone produces genuinely better results than letting my thoughts wander. E.g. I might see an interesting article or discussion, which gives me novel ideas or insights that I would not otherwise have had. Basically, looking at my phone usually produces worse results than not looking at it—but sometimes it also produces much better ones than the alternative.
Situational variables affecting the value of the behaviors: looking at my phone can be a way to escape uncomfortable thoughts or sensations, for which purpose it’s often excellent. This then also tends to reinforce the behavior of looking at the phone when I’m otherwise in the same situation, but without uncomfortable sensations that I’d like to escape.
When there is significant uncertainty, the brain seems to fall back to those responses which have worked the best in the past—which seems like a reasonable approach, given that intelligence involves hitting tiny targets in a huge search space, so most novel responses are likely to be wrong.
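As a toy illustration of that fallback (made-up numbers, and only one simple way of modeling it): if each behavior's chance of a good outcome is estimated from past experience and then scored pessimistically to penalize uncertainty, a mediocre but well-validated habit keeps beating a promising but barely-tried alternative until enough new evidence has accumulated.

```python
# Toy model of falling back on the well-tested behavior under uncertainty:
# estimate each behavior's chance of a good outcome with a Beta posterior and
# score it pessimistically (posterior mean minus one standard deviation).

def pessimistic_score(good, bad):
    """Beta(1 + good, 1 + bad) posterior mean minus one standard deviation."""
    a, b = 1 + good, 1 + bad
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean - var ** 0.5

# Old habit: mediocre, but validated by lots of direct experience.
old_habit = pessimistic_score(good=55, bad=45)
# New behavior: looks better on paper, but has barely been tried.
new_behavior = pessimistic_score(good=1, bad=1)

print(old_habit, new_behavior)  # the old habit wins until more evidence arrives
```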