Goodhart’s Law inside the human mind
Some time back, I saw a tweet from somebody that read:
Much of social psychology seems to be premised on the bizarre assumption that what people really care about is not real-world outcomes but the state of their own mind: self-esteem, a positive self-image, dissonance reduction, feelings of control, reducing uncertainty, etc.
I’ve certainly seen versions of the same myself. Maybe the most poignant example comes from this book review, which suggested that gambling addicts get hooked on a sense of control—even though someone who’s hooked on gambling to the point of ruining their life clearly isn’t in much control of anything:
The primary objective that machine gambling addicts have is not to win, but to stay in the zone. The zone is a state that suspends real life, and reduces the world to the screen and the buttons of the machine. Entering the zone is easiest when gamblers can get into a rhythm. Anything that disrupts the rhythm becomes an annoyance. This is true even when the disruption is winning the game. Many gamblers talk about how winning the game brings them out of the zone, and they actually dislike winning for that reason. For some gamblers, the very act of pressing buttons to play the game disrupts the rhythm. These gamblers use autoplay modes on games that offer them, and jerryrig an autoplay mode on machines that don’t by jamming something into buttons to keep them pressed. They don’t want to chase a win or pick their lucky numbers, they want to disappear into the zone. [...]
The book is full of heartbreaking stories about what gamblers endure on their path to extinction. They sacrifice their bodies, their time, and their relationships. Sharon, for example, spent four days at a casino, trying to lose all her money to reach extinction. At the end of this ordeal, she came home to sleep, but she found three nickels in her bedroom. The thought of not having spent all her money bothered her so much that she drove back to a casino immediately to lose those last three nickels. [...]
I used to think that gambling addicts “lost control” when they gambled excessively. But the addicts in the book use machines as a way to gain control in their lives. In front of a machine, the world is simple: they place bets and lose a little bit of money on each turn. The gamblers are in control of this machine world. It is the world away from machines where the prospect of losing control in frightening ways looms. Away from the machines, life is long and full of terrors.
This certainly sounds odd and destructive—seeking control in a way that destroys one’s life.
But it’s also one that resonates with me—it feels like it describes the relationship I’ve often had with social media. Picking up my phone and checking Facebook can give me a sense of autonomy, like I’m choosing to leave the current situation and momentarily visit another world. This feels like it’s the case even when the phone-checking becomes compulsive; a part of me craves that feeling of control so much that it gets out of control.
While I’ve never smoked, I understand that many smokers describe their relationship with cigarettes similarly. Smoking offers a socially acceptable reason to step away from a dull conversation or meeting for a moment. Briefly, you can tune out and regain a sense of control.
There’s something peculiar about this. The initial tweet mentioned the “bizarre” assumption in social psychology that people prioritize internal mental states over actual control (or any other attribute the feelings seem to track). It’s strange that a gambling addict would chase the feeling of control even when it leads their life to spiral out of control.
However, I believe it’s relatively simple to explain. First, there’s a brain subsystem that calculates a variable, such as “to what extent do I feel in control,” based on external factors. Then, other subsystems learn various strategies to regulate that variable and maintain a desired range.
In ideal circumstances, this would be beneficial—the sense of control would correlate with actual control over the environment. When something reduces control, the regulating subsystems act to restore the sense of control. For example, losing your job might make you feel less in control of your finances, so you search for another job until you feel in control again.
Unfortunately, subsystems within the brain are just as subject to Goodhart’s law—“when a measure becomes a target, it ceases to be a good measure”—as anything else is. The feeling of control is a proxy for actual control, but an imperfect one. When it becomes an optimization target, it becomes an even worse proxy. If part of you feels like your life is spiraling out of control, the regulator subsystem may get stuck in a local maximum where it clings to anything that slightly increases the sense of control—even if that overall causes a continued loss of control.
While playing the slot machine, you’re distracted from distressing thoughts, and the predictable nature of the game elevates your “sense of control” variable. As soon as the game ends or you win, unpredictability returns, and you’re reminded of the unpleasant aspects of life, causing the sense of control to drop. The regulator predicts that playing another game will raise the sense of control. If you act on this prediction, it may be quickly confirmed, reinforcing the pattern of “play the slot machines when distressed, since that will bring the sense of control up”.
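To make the loop concrete, here is a toy sketch of the regulator described above. Everything in it is an assumption made for illustration: the 0 to 100 scale, the two strategies and their numbers, the names `Strategy`, `greedy_for_feeling`, `grounded_in_outcome` and `simulate`, and the simplification that the feeling falls back to match actual control once a distraction ends. It is not a model of real brains, only a way to see how optimizing the proxy can drive the real quantity down.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Strategy:
    name: str
    felt_boost: int     # immediate rise in the *feeling* of control (0-100 scale)
    actual_change: int  # effect on actual control over one's life


# Hypothetical numbers, chosen only to illustrate the shape of the loop.
STRATEGIES = [
    Strategy("look for a job", felt_boost=5, actual_change=10),
    Strategy("play the slot machine", felt_boost=30, actual_change=-10),
]


def greedy_for_feeling(options: list[Strategy]) -> Strategy:
    """Goodharted regulator: pick whatever raises the feeling fastest."""
    return max(options, key=lambda s: s.felt_boost)


def grounded_in_outcome(options: list[Strategy]) -> Strategy:
    """Comparison regulator: pick whatever improves actual control most."""
    return max(options, key=lambda s: s.actual_change)


def simulate(
    pick: Callable[[list[Strategy]], Strategy],
    steps: int = 5,
    actual: int = 50,
    target: int = 60,
) -> list[tuple[str, int, int]]:
    """Return (action, felt control right after acting, actual control) per step."""
    history = []
    for _ in range(steps):
        # Assumption: once the distraction ends, the feeling tracks reality again.
        felt = actual
        if felt >= target:
            history.append(("rest", felt, actual))
            continue
        chosen = pick(STRATEGIES)
        peak_felt = felt + chosen.felt_boost
        actual = max(0, actual + chosen.actual_change)
        history.append((chosen.name, peak_felt, actual))
    return history
```

With these made-up numbers, `simulate(greedy_for_feeling)` chooses the slot machine on every step: each play delivers a momentary boost, while actual control falls from 50 to 0. `simulate(grounded_in_outcome)` looks for a job once, reaches the target, and then rests.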
The original tweeter acknowledged that it makes sense for there to be feelings that track external states. He seemed to think that explanatory theories only went off the rails when they proposed that humans sought to primarily regulate internal psychological states—after all, the evolutionary purpose of the regulation system can’t be purely internal, it has to have its roots in the external world.
But it seems very reasonable to me that we might first evolve systems for the purpose of regulating external states which then take a life of their own, developing something like inner optimizers that put intrinsic value on the regulation of internal variables. If that’s the case, then it makes perfect sense to explain human behavior in terms of humans attempting to regulate their internal variables—even when those variables have become uncorrelated from external reality, and even if the variables originally evolved to track external states.
Behavior by feeling regulation
I suspect that there’s a sense in which it could be said that our brains and bodies are doing nothing but trying to keep various feelings within desired parameters. Even though it has its failure modes, it’s a pretty good system overall. And it can be useful to start paying attention to the various kinds of different feelings that our minds are tracking.
The original tweet mentioned a number of feelings that people have proposed the mind might be optimizing for: control, internal dissonance, self-esteem, self-image, and uncertainty. Here are a few more:
The sense of balance. By this, I mean just the thing that you have when you’re e.g. walking, that tells you whether you’re well-balanced or whether you just slipped and are about to fall. If you are a healthy, able-bodied adult, you don’t need to pay attention to it. You have internalized the skill of walking enough that your sense of balance is automatically kept within the right range to maintain upright motion. But if you slip and lose that sense, then those regulator subsystems will quickly kick in and try to bring it back up.
The sense of singing right. An experienced singer may have a sense of what singing a song right feels like physically, even if they can’t actually hear themselves singing. There can be various concert venues whose acoustics are such that it’s impossible for the singer to hear themselves, but they can still sing perfectly, just by the right internal feel.
The sense of ‘everything is done’. Maybe you are finishing a day of work or packing your things for vacation. You might have a feeling of “there was still something that I needed to do/pack”, which you are hoping to turn into a feeling of “I have done everything that I need for now and can relax now”. You might do that by consulting a list of things that needed to be done/packed, reviewing what you have done/packed so far, or just continuing to do or pack things until you can’t think of anything else and just intuitively feel like you have everything.
And while the previous discussion may have made it sound like it’s a bad thing to try to strive for a “sense of control”, often it can be beneficial. Rossin shares the following:
I used to think of myself as someone who was very spontaneous and did not like to plan or organize things any more or any sooner than absolutely necessary. I thought that was just the kind of person I am and getting overly organized would just feel wrong.
But I felt a lot of aberrant bouts of anxiety. I probably could have figured out the problem through standard Focusing but I was having trouble with the negative feeling. And I found it easier to focus on positive feelings, so I began to apply Focusing to when I felt happy. And a common trend that emerged from good felt senses was a feeling of being in control of my life. And it turned out that this feeling of being in control came from having planned to do something I wanted to do and having done it. I would not have noticed that experiences of having planned well made me feel so good through normal analysis because that was just completely contrary to my self-image. But by Focusing on what made me have good feelings, I was able to shift my self-image to be more accurate. I like having detailed plans. Who would have thought? Certainly not me.
Once I realized that my self-image of enjoying disorganization was actually the opposite of what actually made me happy I was able to begin methodically organizing and scheduling my life. Since then, those unexplained bouts of anxiety have vanished and I feel happier more of the time.
Here’s a more general description of how we learn to do anything, argued for on the basis of academic studies in Anders Ericsson’s Peak and more anecdotally in The Art of Learning by Josh Waitzkin (International Master in chess and 2004 world champion in Taiji Push Hands).
When we practice any skill, we get feedback that tells us when we’re doing well at it, and our brain then associates a feeling of “doing well” with the kinds of states where we’re doing well. Initially, that sense is only relatively rough. A beginner at playing piano might recognize they’re doing well when they play a melody approximately right, but not notice more subtle mistakes in it.
As we learn to get to that state, our ability to identify more fine-grained measures of success develops. An intermediate piano player may be able to notice subtler errors that they are making and learn what the finger motions associated with playing just right feel like. We then get better and better, as learning to hit each more fine-grained feeling of “doing well” allows us to develop even more detailed feelings of doing well.
A way I would phrase this is that we learn:
To have particular kinds of feelings (felt senses) that represent something (control, balance, singing right, playing the piano right, everything being done)
A range of intensity that we should keep that felt sense in, in some given context (either trying to make sure we have some positive feeling, or that we avoid some negative feeling)
Various strategies for keeping it within that range
Returning to our earlier examples, these might look something like:
Feeling of control
Range to keep it in: Variable; depending on the person’s learned expectations of how much control they think they have (see locus of control)
Strategies: Looking for sources of income when you don’t have one (financial control), making detailed plans and schedules, playing slot machines, looking at social media, smoking
Feeling of balance
Range to keep it in: One that feels “upright” when walking or standing, one that doesn’t when lying down
Strategies: The body doing automatic balance adjustments, trying to correct position or grab a hold of something if you slip
Feeling of singing right
Range to keep it in: Try to keep it as close to the “right shape of the song” as possible
Strategies: Adjusting the physical state of the vocal cords, breath, and the like to be within the right range to produce the right feeling
Feeling of everything being done
Range to keep it in: Try to get the feeling as strong as possible
Strategies: Doing things that you remember you should do, explicitly deciding to leave some things undone/for later to clear them off the set of things that need to be done now, consulting a to-do list to check whether you have done everything
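The three-part structure above (a felt sense, a context-dependent range, and a set of strategies) can be written down as a small record type. This is only a sketch: the class name `FeltSense`, the method `needs_regulation`, and the numeric 0 to 1 intensity scale are all hypothetical, introduced here to show how the same feeling can have different desired ranges in different contexts.

```python
from dataclasses import dataclass


@dataclass
class FeltSense:
    represents: str                         # what the feeling stands for
    ranges: dict[str, tuple[float, float]]  # context -> (low, high) desired intensity
    strategies: list[str]                   # learned ways of getting back into range

    def needs_regulation(self, context: str, level: float) -> bool:
        """True when the current intensity is outside the desired range."""
        low, high = self.ranges[context]
        return not (low <= level <= high)


# The balance example from the list above, with made-up numbers for "uprightness".
balance = FeltSense(
    represents="balance",
    ranges={"walking": (0.8, 1.0), "lying down": (0.0, 0.2)},
    strategies=[
        "automatic balance adjustments",
        "correct position or grab hold of something",
    ],
)
```

For instance, `balance.needs_regulation("walking", 0.4)` is true (you have slipped), while the same low reading is unremarkable if the range for the current context is a different one.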
At the same time, there are various ways in which this can go wrong.
Pain as the unit of effort. alkjash writes about a failure mode where pain is the unit of effort. Here, a person has learned that they are only trying if they put themselves in as much pain as they can handle. The feeling that some part of them comes to optimize for is misery, even as this becomes uncorrelated with the goal of actually doing well. An anecdote from the post:
As a child, I spent most of my evenings studying mathematics under some amount of supervision from my mother. While studying, if I expressed discomfort or fatigue, my mother would bring me a snack or drink and tell me to stretch or take a break. I think she took it as a sign that I was trying my best. If on the other hand I was smiling or joyful for extended periods of time, she took that as a sign that I had effort to spare and increased the hours I was supposed to study each day. To this day there’s a gremlin on my shoulder that whispers, “If you’re happy, you’re not trying your best.”
A fear of being accused of something. A pattern that I’ve seen come up in parts work is that a child might be accused of some wrongdoing that they didn’t actually commit. If this happens more than once, their mind may develop the prediction that “being unexpectedly accused of something” is something that might happen at any time, or in particular kinds of circumstances. This creates a feeling of discomfort and uncertainty, and a desire to reduce that feeling of discomfort. A strategy that their mind might then hit upon is to actually do something that they’ll be blamed for.
Actually having something bad happen to you can often feel less bad than living in constant uncertainty about when it will happen. And getting accused of something can temporarily bring the anticipation of being accused down to zero. Thus, intentionally doing bad things serves as an effective strategy to keep the anticipation within the desired range… even as it creates a self-fulfilling prophecy. Other people will come to expect that the person will consistently misbehave, making them even more likely to accuse the person even of wrongdoings that the person didn’t actually commit.
Thus, in more ways than just one, trying to eliminate the feeling of an impending accusation actually causes more accusations.
General self-sabotage. Some people might have experienced something bad happening when they thought that things were going well, and developed the general expectation of “things are so good, something is about to go wrong”. Similar to the example above, they might then learn to intentionally sabotage themselves in order to realize that expectation of failure. Or it might be the case that there is something frightening about success (maybe you’ve learned that success causes more expectations to be put on you, while being incompetent means that you’ll get extra support, making the feeling of failure correlated with safety), so that your mind actually comes to optimize for a feeling of failure.
In an old comment on a post on self-sabotage, pjeby mentions a subtle sense of reward associated with getting a feeling of failure:
My wife and I both described the “Bruce effect” sensation as being more like a sense of recognition or rightness—like confirmation of something that you expected, something that’s just the way the world works. That, upon successfully losing, it’s like, “yep, this is where I’m supposed to be”. Not enjoyment… more like satisfaction… though that’s still too strong. Closure, maybe? Relief? It’s a brief and subtle reward, not a conscious pleasure.
Bringing yourself down to make your social status clearer. Another self-sabotage-like pattern is that if someone is feeling uncertainty about their social status and wants to bring that uncertainty down, they may disparage and degrade themselves to others to establish that they are low status. Maybe they have learned that if others perceive their status as being unclear, those others may act to take the person down until the person’s low status has been established. So a part of the person learns that they should keep their feeling of status down in order to be safe, and they learn a strategy of pre-emptively bringing it down. This may then happen even in environments where it’s not necessary, and one’s sense of social status is uncorrelated with being safe (or maybe it would even be safer to have higher social status).
“Stuck on” anxiety responses. According to some models of trauma, when people feel anxious without clearly knowing why, the cause commonly lies in distress responses that have gotten “stuck on”. Something distressing once happened to the person, and the feeling of distress triggered a strategy of trying to get away from the situation.
However, for whatever reason, their mind/body has failed to properly register that the original threat is gone, so it keeps generating distressing feelings. The person may then e.g. engage in a strategy of pursuing addictive activities that numb the feeling (essentially engaging in a type of flight response intended to bring the distressing feeling down to zero). While this helps alleviate the discomfort temporarily, it doesn’t actually help the response get unstuck.
In these cases, the feeling of threat activates escape responses even though it has become uncorrelated from the actual threat it was a response to.
Additional failure modes left as an exercise for the reader. I’m sure you can come up with plenty; post them in the comments!
Contradictory failure modes
It was beautiful
and I fled from there.
There was nature
and I ran away from it.
There was my love
and I left them.
All was well there
and I couldn’t stand it.
– Pentti Saaritsa
(translated by DeepL and me)
There are also failure modes where the person has feelings that trigger opposite strategies:
Pursuit/distancing in relationships. Someone may have learned that the feeling of being alone and without a romantic relationship is painful and something to avoid, and developed strategies to get into a relationship that are triggered whenever they are single. At the same time, they may also have had the experience of feeling suffocated and constrained in relationships. This may cause a strategy that triggers whenever the person feels at all at risk of being constrained in the relationship and causes them to distance themselves from the other person.
Janina Fisher offers a vivid description of this in Healing the Fragmented Selves of Trauma Survivors:
Aaron described the reasons for which he had come: “I start out by getting attached to women very quickly—I immediately think they’re the ‘one.’ I’m all over them, can’t see them enough … until they start to get serious or there’s a commitment. Then I suddenly start to see everything I didn’t see before, everything that’s wrong with them. I start feeling trapped with someone who’s not right for me—I want to leave, but I feel guilty—or afraid they’ll leave me. I’m stuck. I can’t relax and be happy, but I can’t get out of it either.”
Aaron was describing an internal struggle between parts: between an attachment-seeking part that quickly connected to any attractive woman who treated him warmly and a hypervigilant, hypercritical fight part that reacted to every less-than-optimal quality she possessed as a sign of trouble. His flight part, triggered by the alarms of the fight part, then would start to feel trapped with what felt like the “wrong person,” generating impulses to get out—an action that his submit and cry for help parts couldn’t allow. [...] Without a language to differentiate each part and bring it to his awareness, he ruminated constantly: should he leave? Or should he stay? Was she enough? Or should he get out now? Often, suicide seemed to him the most logical solution to this painful dilemma, yet at the same time “he” dreamed of having a family with children and a loving and lovely wife.
Wanting to be with people vs. wanting to be alone. A similar phenomenon may pop up in a non-romantic context, where a person desires connection and goes to social events—but then feels uncomfortable in them, and wants to reduce that feeling of discomfort by leaving right away. A short story I once read described this evocatively as “I remember wondering why I always first want to belong, but then always immediately want to leave”. 
Playing the slot machine vs. not playing the slot machine. In the case of playing the slot machine, it may be that another part of the mind notices that this behavior is actually self-destructive and generates a feeling of distress about being stuck in self-destructive behavior. This distress may then make the person less inclined to play, even as the lack of control when not playing drives them back towards playing.
In the worst-case scenario, the distress produced by this self-destructive pattern may serve to further entrench the pattern. A part aimed at reducing the distress from gambling might recognize that when the person becomes distracted by the slot machines, they temporarily forget about other concerns, including their own distress over constant gambling. Thus, it might end up also driving the person to gamble more.
What to do about it?
So, if you recognize yourself running into some of these failure modes—what can you actually do about it?
All of them involve some subconscious learning about what kinds of feelings to have and what kinds of strategies to employ in response to those feelings. As such, various memory reconsolidation-based practices can be employed to identify and change those learnings.
As usual, Internal Family Systems is one of my personal favorite methodologies. Practicing Gendlin’s Focusing is also useful for developing a higher awareness of the various felt senses happening in your mind. Core Transformation is a parts work technique explicitly aimed at processing different feelings and refactoring them.
A prompt that may sometimes work for stuck-on anxiety responses is the following:
(It may be easier to do this exercise in a situation where the anxiety is not acutely activated, but you remember it well enough to remind yourself of what it feels like when it is triggered.)
See if you can get curious about what the response is trying to do. For example, is it trying to run away, is it trying to hide, is it trying to keep you motionless?
Ask yourself—if the response could do that just the way that it wanted to do, what would that be like? For example, maybe the desire to run away has a sense of somewhere safe where it would like to run to. Maybe the desire to hide has a sense that by digging itself somewhere underground, it could get away from threats. Maybe the desire to stay motionless feels like if it could freeze completely enough, it could become invisible, and any danger would move past it.
You might have the thought that the imagined outcome feels impossible, irrational or counterproductive. In that case, note that the objection is a valid one, and then put it aside. You’re not evaluating the overall realism of the plan or committing to actually doing this in a real situation, you’re trying to get into contact with the emotional schema which does think that this makes sense.
Invite the reaction to complete, just the way that it wants to complete. You can imagine actually running away to that sense of safety, actually digging yourself to somewhere underground, actually staying motionless enough to become invisible.
If it feels like this is doing something, stay with the thing-you’ve-imagined until it feels like the reaction has resolved.
In Creating a Truly Formidable Art, Valentine talks about internal noise. I interpret him to be talking about two different kinds of noise:
Emotional noise: something like feelings of discomfort or stuck-on anxiety responses that give rise to various, often Goodharty strategies to avoid them.
Cognitive noise: thoughts that make up the strategies themselves, including the strategy of having lots of thoughts that distract from the underlying feeling.
He has some suggestions of what to do about that in the section about Becoming the Art. And I don’t think it’s a coincidence that he talks about these practices as anti-Goodhart moves, nor that he talks about developing a highly honed feeling of the Art.
I find it hard to see what’s going on in me when everything is loud inside and thoughts are slamming into one another and creating turbulence while other thoughts are running in the background influencing me unseen. It doesn’t matter how accurate some or even all of those thoughts are: I still can’t do much intentionally with all that clutter. I’m just reacting.
But if I can come to inner silence, I can see and hear what’s going on in me very clearly.
In my words: if you are full of different feelings that various parts of your mind are urgently reacting to (as well as the reactions themselves), it doesn’t leave any space to actually see what’s happening, or what the different strategies are responses to. If you can calm down some of that activity, it becomes possible to perceive things better and to actually let your entire mind-system participate in choosing your actions.
Practices such as meditation, IFS, or Acceptance and Commitment Therapy can also help in unblending / cognitively defusing from various feelings—that is, coming to experience e.g. a feeling of danger as a feeling of danger, instead of an objective fact of being in danger.
In general, I would say that the right approach is not to start distrusting your feelings too much. It’s good to have some healthy skepticism, but probably all of our thought and action is driven by learned senses of what “good thought” and “good action” should feel like. (If you develop a distrust of your feelings, then that feeling of distrust is also a feeling.)
Rather, I would suggest opening up to feelings. Becoming familiar with them, understanding where they come from and what they are trying to do, and allowing them to become updated with new evidence and feedback. (As well as applying more therapeutic approaches to ones that seem to resist updating.) In the best case, you can straighten out some of the more Goodharty loops and let your actions become more grounded in reality and in what you actually care about.
Example from The Art of Learning, p. 121.
Siellä oli kaunista
ja minä karkasin sieltä.
Siellä oli luontoa ja minä
pakenin sen luota.
Siellä oli rakkauteni
ja minä jätin sen.
Siellä oli kaikki hyvin
enkä minä kestänyt sitä.
Orig. “Muistan miettineeni, miksi minä aina ensin haluan kuulua ja sitten kuitenkin haluan heti pois.”