I suspect that for some of the things you want to use PP for, I would rather use my machine-learning model of meditation. The basic idea is that we are something like a model-based RL agent, but (pathologically) have some control over our attention mechanism. We can learn which kinds of attention patterns are more useful. But we can also get our attention patterns into self-reinforcing loops, where we attend to the things which reinforce those attention patterns, and not to the things which punish them.
For example, when drinking too much, we might resist thinking about how we’ll hate ourselves tomorrow. This attention pattern is self-reinforcing, because it lets us drink more (yay!), while refusing to spend the attention needed to propagate the negative consequences which might stop that behavior (and which would also harm the attention pattern). All our hurting tomorrow won’t weaken the pattern very effectively, because the pattern isn’t very active tomorrow, when the punishment actually arrives. (RL works by propagating expected pain/pleasure shortly after we do things. It can achieve things on long time horizons because the expected pain/pleasure includes expectations on long time horizons, but the actual learning which updates an action only happens soon after we take that action.)
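Here is a toy sketch of the drinking example, just to make the mechanism concrete. Everything in it is an assumption I made up for illustration: the function name `learned_value`, the reward numbers, the learning rate, and the `residual_trace` parameter, which stands in for "the pattern isn't very active tomorrow". It is not a serious model of RL in the brain.

```python
def learned_value(attend: bool, nights: int = 50, lr: float = 0.2,
                  pleasure: float = 1.0, hangover: float = -3.0,
                  residual_trace: float = 0.05) -> float:
    """Learned value of 'have another drink' after `nights` evenings.

    attend: whether the agent thinks about tomorrow while drinking.
    residual_trace: (assumed) fraction of the action's eligibility for
        learning that is still left by the next morning.
    """
    value = 0.0
    for _ in range(nights):
        # Evening: the action was just taken, so it is fully eligible for
        # learning. The expected hangover only enters the update if the
        # agent actually attends to it.
        expected = pleasure + (hangover if attend else 0.0)
        value += lr * (expected - value)

        # Morning: the hangover arrives. It is a surprise only if it was
        # not anticipated, and by now the action is barely eligible, so
        # the correction is scaled down by residual_trace.
        surprise = 0.0 if attend else hangover
        value += lr * residual_trace * surprise
    return value


avoiding = learned_value(attend=False)
attending = learned_value(attend=True)
print(f"avoids thinking about tomorrow: {avoiding:+.2f}")
print(f"thinks about tomorrow:          {attending:+.2f}")
```

With these made-up numbers, the agent that avoids the thought settles on a positive value for drinking (roughly +0.85), while the one that attends settles near -2. The real hangover does push the avoider's value down a little each morning, but far too weakly to flip its sign.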
Wishful thinking works by avoiding painful thoughts. This is a self-reinforcing attention pattern for the same reason: if we avoid painful thoughts, we in particular avoid propagating the negative consequences of avoiding painful thoughts. Avoiding painful thoughts feels useful in the moment, because pain is pain. But this causes us to leave that important paperwork in the desk drawer for months, building up the problem, making us avoid it all the more. The more successful we are at not noticing it, the less the negative consequences propagate to the attention pattern which is creating the whole problem.
I have a weaker story for confirmation bias. Naturally, confirming a theory feels good, and getting disconfirmation feels bad. (This is not because we experience the basic neural feedback of perceptual PP as pain/pleasure, which would make us seek predictability and avoid predictive error—I don’t think that’s true, as I’ve discussed at length. Rather, this is more of a social thing. It feels bad to be proven wrong, because that often has negative consequences, especially in the ancestral environment.)
So attention patterns (and behavior patterns) which lead to being proven right will be reinforced. This is effectively one of those pathological self-reinforcing attention patterns, since it avoids its own disconfirmation, and hence avoids propagating the consequences which would weaken it.
I would predict confirmation bias is strongest when we have every social incentive to prove ourselves right.
However, I doubt my story is the full story of confirmation bias. It doesn’t really explain performance on tasks like the one where you have to flip over cards to check whether “every vowel has an even number on the other side”.
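For concreteness, here is the logic of that card task as a small sketch. The four example cards and the helper name `must_flip` are my own choices for illustration. The only cards worth flipping are the ones that could hide a counterexample: a visible vowel (it might have an odd number behind it) and a visible odd number (it might have a vowel behind it). People often pick the even number instead of the odd one, which is the part my story doesn't obviously account for.

```python
VOWELS = set("AEIOU")


def must_flip(face: str) -> bool:
    """True if the visible face could hide a counterexample to
    'every vowel has an even number on the other side'."""
    if face.isalpha():
        # A vowel could have an odd number on the back.
        return face.upper() in VOWELS
    if face.isdigit():
        # An odd number could have a vowel on the back.
        return int(face) % 2 == 1
    raise ValueError(f"expected a letter or a digit, got {face!r}")


cards = ["A", "K", "4", "7"]  # hypothetical visible faces
to_flip = [card for card in cards if must_flip(card)]
print(to_flip)  # ['A', '7']
```

Flipping the "4" can only ever confirm the rule: whatever is on the back, the rule survives. That makes choosing it look like seeking confirmation, even with no social stakes involved.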
In any case, my theory is very much a just-so story which I contrived. Take it with a heap of salt.