From the perspective of someone with a nail stuck in their head, the world does not look like there’s a nail stuck in their head which they could easily remove in order to improve their life in ways in which they want it to be improved. [...] They’re best modeled not as agents who are being willfully obstinate, but as people helplessly trapped in the cognitive equivalents of malfunctioning motorized exoskeletons.
I think this is false. It’s like the old “dragon in the garage” parable: the woman is too good at systematically denying the things which would actually help to not have a working model somewhere in there. It’s very much a case of “some part of the person knows what they need to do to fix their situation but they can’t bring themselves to admit it and do it”, and that does not look from-the-inside like “a confused muddle, a constantly shifting dreamscape”.
Yes, it is probably true that “some part of them knows what they need to do.” This does not mean all of their options are clearly laid out before them and they constantly make the conscious and informed decision that you would be able to make in their situation, and think “yes, I will choose the obviously worse option, because I’m just that self-destructive and lazy.”
It means something more like “they are trapped in a cognitive whirlpool of suffering, and the set of options in their head is not enough to swim out of it.” Importantly, a complete sense of empathy must be recursive, where you recognize that the mental motions you would easily make to fix the situation (or to fix the inability to fix the situation, etc.) are not available to them.
If this feels too exculpatory: imagine your friend now has a device built into their head that gives them an electric shock every time they try to do math. The device also has an ejection mechanism, but they also get a shock every time they think about the device or how to remove it. (For whatever reason it’s impossible for anyone else to forcibly remove the device from your friend’s head.) Not only that, but thinking about “building up the willpower to withstand electric shocks” also gives them an electric shock!
Seeing a person in this situation would make me feel deeply sad, not disgusted. The part of them that wants to do math and the device in their head are in direct conflict, and at least in the current equilibrium there is no way for them to come to an agreement. Not only will they be unable to reap the many benefits of doing math, they will get a bunch of needless shocks every time they encounter a situation where math is needed—they will attempt to do math, get a shock, think about the stupid device, get another shock, think about the device again, get another shock… If they want to avoid being shocked, the most “rational” option available to them is to avoid math entirely, which is itself a pretty terrible and sad solution.
Having previously argued the other side of this, I’ll now say: I think the next question is “what useful thing is John’s disgust doing?”. It’s probably within John’s action space to (perhaps effortfully) switch from feeling disgust to feeling sadness here for these reasons.
Realistically, this is not near the top of John’s priorities regardless, but if I were John and if this were reasonably cheap, my crux would be “does making this change cost me something important/loadbearing”. (I guess in the worlds where it’s cheap to change aesthetics here, it’s probably not very costly, and if it’s expensive it’s because this is woven through a lot of other important decisionmaking processes for John)
((I’d bet it’s at least theoretically achievable to make that switch without John losing other things he cares about except the rewiring-time, but, nontrivial))
I think a lot of people automatically connect empathic-kindness to a ‘this is fine’ stance. I see a lot of it in how people phrase things in the comments of this post, and I notice it in myself because I, well, empathize with John: I have similar feelings at times, even if seemingly not as strong.
So it can feel risky to get rid of that, because in a way it is part of how I keep my standards up: I desire/require more from people, I dream for both myself and them to be better, and some amount of disquiet or even disgust is a useful tool there. I’m still polite, but it serves as a fuel.
It is certainly possible to get by without that. However, when I look at various people I respect who have high standards, they seem to have some degree of this, though perhaps they don’t conceptualize it as related to empathy; and when I look at others, I see them lowering their standards and becoming more wishy-washy over time due to pure ~positive-tinged empathy.
Sadness at their faltering is a more passive drive in a lot of ways; disgust helps both in pushing oneself to improve and, in my experience, in convincing friends of mine to try for more. Though, of course, I am going to be helpful and friendly even as I find their faltering disquieting.
So it feels like deliberately switching in such a way risks the part of the mind that maintains its own standards.
and if it’s expensive it’s because this is woven through a lot of other important decisionmaking processes for John
That is a very interesting claim! Why do you believe it? My experience is that my aesthetics are part of my preferences—not chosen by me, almost impossible to change. I don’t feel disgust, but I don’t think I could switch easily if I decided to, in the same way that I can’t decide mechs are cool, or dragons are uncool.
It’s like the old “dragon in the garage” parable: the woman is too good at systematically denying the things which would actually help to not have a working model somewhere in there
@Caleb Biddulph’s reply seems right to me. Another tack:
I think you’re still imagining too coherent an agent. Yes, perhaps there is a slice through her mind that contains a working model which, if that model were dropped into the mind of a more coherent agent, could be used to easily comprehend and fix the situation. But this slice doesn’t necessarily have executive conscious control at any given moment, and if it ever does, it isn’t necessarily the same slice that contains her baseline/reflectively endorsed personality.
E.g., perhaps, at any given moment, only part of that model is visible to the conscious mind, like a 3D object sliding through a 2D plane, and the person can’t really take in the whole of it at once, realize how ridiculous they’re being, and act on it rationally. Or perhaps the thought of confronting the problem causes overwhelming distress due to malfunctioning emotional circuitry, and so do the thoughts of fixing that circuitry, in a way that recurses on itself indefinitely/in the style of TD learning. Or something else that’s just as messy.
Human brains haven’t solved self-interpretability, and human minds aren’t arranged into even the approximate shape of coherent agents by default. Just because there’s a module in there somewhere which implements a model of X doesn’t mean the person can casually reach in and do things with that module.
Edit: After reading your other responses, yeah, “not modeling them as creatures particularly similar to yourself” might just be the correct approach. I, uh, also don’t find people with weak metacognitive skills particularly relatable.
To expand on that...
In my mental ontology, there’s a set of specific concepts and mental motions associated with accountability: viewing people as being responsible for their actions, being disappointed in or impressed by their choices, modeling the assignment of blame/credit as meaningful operations. Implicitly, this requires modeling other people as agents: types of systems which are usefully modeled as having control over their actions. To me, this is a prerequisite for being able to truly connect with someone.
When you apply the not-that-coherent-an-agent lens, you do lose that. Because, like, which parts of that person’s cognition should you interpret as the agent making choices, and which as parts of the malfunctioning exoskeleton the agent has no control over? You can make some decision about that, but this is usually pretty arbitrary. If someone is best modeled like this, they’re not well-modeled as an agent, and holding them accountable is a category error. They’re a type of system that does what it does.
You can still invoke the social rituals of “blame” and “responsibility” if you expect that to change their behavior, but the mental experience of doing so is very different. It’s more like calculating the nudges you need to make to prompt the desired mechanistic behavior, rather than interfacing with a fellow person. In the latter case, you can sort of relax, communicate in a way focused on transferring information rather than on the form of communication, and trust them to make correct inferences. In the former case, you need to keep precise track of tone/wording/aesthetics/etc., and it’s less “communication” and more “optimization”.
I really dislike thinking of people in this way, and I try to adopt the viewing-them-as-a-person frame whenever it’s at all possible. But the other frame does unfortunately seem to be useful in many cases. Trying to do otherwise often feels like reaching out for someone’s hand and finding nothing there.
If this is what you meant by viewing others as cats, yeah, that tracks.