I think it’s important to bring the distinction I initially missed, about what you mean by moral agency, into this conversation. From your comment in the other post:
I think this misses the distinction I’d consider relevant for moral agency.
I can put a marble on a ramp and it will roll down. But I have to set up the ramp and place the marble; it makes no sense for me to e.g. sign a contract with a marble and expect it to make itself roll down a ramp. The marble has no agency.
Likewise, I can stick a nonagentic human in a social environment where the default thing everyone does is take certain courses and graduate in four years, and the human will probably do that. I can condition a child with rewards and punishments to behave a certain way, and the child will probably do so. Like the marble, both of these are cases where the environment is set up in such a way that the desired outcome is the default outcome, without the candidate “agent” having to do any particular search or optimization to make the outcome happen.
What takes agency—moral agency—is making non-default things happen. (At least, that’s my current best articulation.) Mathematically, I’d frame this in terms of counterfactuals: credit assignment mostly makes sense in the context of comparison to counterfactual outcomes. Moral agency (insofar as it makes sense at all in a physically-reductive universe) is all about thinking of a thing as being capable of counterfactual impact.
This seems defensible, but nonstandard. Under a definition of “moral agency” that relies on counterfactual credit assignment, a lot of things that would normally be considered the actions of a moral agent acting goodly, wouldn’t count. (Unless I’m misunderstanding, in which case, please correct that misunderstanding.)
Examples:
1. I have an opportunity to cheat on my spouse, or do something else society clearly codes as wrong. I choose not to.
Standard interpretation: Good choice, have a cookie. You are the sort of person I can collaborate with.
Counterfactual credit analysis: Doesn’t seem very agentic. Most of the credit here goes to the society around you and your parents, who taught you through various forms of reinforcement to decide in that way in that kind of situation. Maybe you get a little credit for actually doing the expected thing when the opportunity arose, but very little. You’re basically a cat. A good cat rather than a bad cat, I guess?
2. I want there to be fewer people dying of things they don’t need to die of. So I read up on GiveWell’s stuff, and donate a large amount to each of their top recommended charities.
Standard interpretation: Again, good job; not many people do this at present, and it’s obviously helpful on the object level for people to behave in this way.
Counterfactual credit analysis: ~No points awarded. Everyone with lots of money is aware of GiveWell now, their top charities are not funding-constrained (at least, that’s what I understood to be the case a few years ago; don’t rely on this statement as fact without double-checking), and if you didn’t donate, someone else would.
3. As a kid, my brother dies of cancer, so I vow to do what I can to make sure that happens to fewer people. (This didn’t happen to me, but it did happen to a friend.) I go study hard for decades and become a doctor specializing in the kind of cancer my brother died of (my friend did not do this, but he went a significant way down that path). Through various medical means, my actions directly prevent many deaths during my career.
Standard interpretation: Mission accomplished? You did what you set out to do, stayed true to the goals of your childhood self, and should look back on your life with happiness and pride.
Counterfactual credit analysis: The 80,000 Hours career-path analysis, which basically said “the counterfactual impact of becoming a doctor is low, try and do something neglected”, is where the concept of counterfactual thinking clicked for me. Few points awarded; this clearly falls within the “just being a marble doing the expected thing” category of life-choices.
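The counterfactual-credit framing running through these three analyses can be sketched as a toy calculation. This is my own illustrative model, not anything from the original comment; the function name and numbers are invented for the example: credit goes only to the difference between what happened and what would have happened without you.

```python
# Toy model of counterfactual credit assignment (illustrative only).
def counterfactual_credit(outcome_with_agent: float, outcome_without_agent: float) -> float:
    """Credit = what actually happened minus what would have happened anyway."""
    return outcome_with_agent - outcome_without_agent

# Example 2 (GiveWell): if the charity gets fully funded either way,
# another donor fills the gap, so counterfactual credit is ~0.
credit_donor = counterfactual_credit(outcome_with_agent=100.0, outcome_without_agent=100.0)

# A hypothetical neglected intervention: the outcome happens only because you acted.
credit_neglected = counterfactual_credit(outcome_with_agent=100.0, outcome_without_agent=0.0)

print(credit_donor)      # 0.0
print(credit_neglected)  # 100.0
```

The point of the toy model is that the “standard interpretation” grades the left argument alone, while the counterfactual analysis grades the difference, which is why replaceable good deeds score near zero.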
I think disgust at people who aren’t agentic in terms of thinking about and optimizing for their counterfactual impact is the wrong move.
Background information that informs this view: Most people (not most people here, but most people generally) will respond to “you’re not really agentic and may get a disgust reaction from me unless you’re optimizing for counterfactual impact” with either a blank, confused stare, because they’re not familiar with the relevant concepts, or something like “you mean like speculating about how things would be different if the Nazis won WW2? What does that have to do with whether I should cheat on my wife, or get credit for not doing so?”
In this situation, most people are like the woman with the nail in her head, except that she doesn’t know she has a nail in her head, and she isn’t going to be defensive about it unless you start telling her she’s a terrible/stupid person, or a cat you’re better than and can’t work with, for having a nail in her head and not doing anything about it. The standard person’s reaction to counterfactual reasoning, once it’s explained why it’s relevant, might be “well, that definitely changes some of my life-plans”, which would be like the woman going “hey look, you’re right, there is a nail in there, thanks!” Although consistently updating how you think, so that you incorporate and apply a new concept in all the areas where it’s relevant rather than just in its original context, is also a skill that needs to be taught to many people, not something that happens automatically.
People who follow standard rules in standard situations are perfectly good collaborators in those situations. Also, reacting with disgust to those people regardless of their behaviour destroys the incentive structure which makes them good collaborators. People who don’t cheat on their spouse and do become oncologists should get a cookie, even though those are standard things to do with a low counterfactual impact. And there is a material difference between someone who is a doctor and someone who is a nonfunctional alcoholic, in terms of how grown-up and reliable they are, and that difference should be recognized, rather than putting them both in the same bucket as a cat.
Another counterpoint: On my understanding of how physics works (which admittedly may be wrong in important ways), we don’t have free will, there are no counterfactual universes, and we are all 0% actually-agentic. Your thoughts at moment t are the result of the various physical forces operating on the molecules of your brain, which strictly depend on the state of the system at t-1, and so on backward to the beginning of time. Pushed to the limit, “how much credit for this outcome goes to an agentic individual, vs. to environmental influences and marbles doing what marbles do?” has the correct answer “0% to the individual”. Any choice of how much agency to grant someone in your mind seems arbitrary, unless you believe the universe is doing something other than unfolding according to physical laws, and our choices can in actual fact change the future. “Agency” is a useful component of a model of the interaction between humans and animals, not the territory of base reality. And deciding that “only agency requiring familiarity with concepts most people don’t have counts” seems like a choice that will have systematically negative effects.
With all that said… I agree with approximately all of this, from Thane, below:
The main impact is on the ability to coordinate with/trust/relax around that person. If they’re well-modeled as an agent, you can, to wit, model them as a game-theoretic agent: as someone who is going to pay attention to the relevant parts of any given situation and continually make choices within it that are consistent with the pursuit of some goal. They may make mistakes, but those would be well-modeled as the foibles of being a bounded agent.
On the other hand, people who can’t be modeled as agents (in a given context) can’t be expected to behave in this way. They may make decisions based on irrelevant parts of the situations, act in inconsistent ways, and can’t be trusted not to go careening in some random direction in response to random stimuli. Sort of like, ahem, an LLM.
Note that I think it isn’t a binary “There Are Two Types of People” thing.
What you in your post mean when you say “grown up” in this sentence:
Sometimes they mean they want to be treated as moral agents (i.e. treated as a grown-up, rather than a child or a cat).
Seems to me similar to what Thane is pointing at with “game-theoretic agent”, and what Harry in HPMOR would call a sane adult.
And, you get to choose who you will treat as an adult. If someone wants you to empathize with them, just do so; it’s an almost-costless action that many people value. But if they want you to treat them as an adult, it’s fine to say “sorry, there’s a nail in your head, and you should know it’s there and want to remove it, but apparently you don’t, and under the circumstances I find it difficult to take you seriously”. Not in those exact words, as that will quite often be perceived as an attack or as extremely rude, but the message “there are requirements/standards if you want me to treat you as an adult, here is what they are, and you’re not currently meeting them” is useful information for people who want you to treat them as adults. Basically, the message “in order to be treated as a Very Serious Adult, you have to have your life and your self in a certain degree of order” is a standard one; even people who want to be treated with more respect than you’re currently giving them may accept it.
Sorry about the long posts, but I’m thinking and trying to model how things look from your perspective. EDIT TO ADD: Epistemic status: Speculation.
Hypothesis: You’re like, 30-40 points higher IQ than me, which would put you around 60 points above average (ballpark figures in each case).
If true, that would explain some things. There’s a certain intelligence level that I round down to “basically not intelligent”, and a certain intelligence level that I round up to “too smart for me to really understand the things they find intellectually engaging, unless they try really hard to dumb things down, or I spend hours where they spent minutes, so I can’t really have a conversation with them about it”. In either case, it’s hard for me to see distinctions among people too far from my own intelligence level in either direction. And the same is true for everyone, from what I’ve read: the barrier to mutual understanding seems to kick in around 30-40 IQ points. I understand that for people at a certain low IQ level, “this person went to community college” = “this person is really smart”, with the same reaction to “this person has a doctorate in physics” or “this person is the President”. I can talk to and connect with people who are around average, as well as people who are pretty smart, while finding it hard to really put myself in the shoes of someone who’s significantly below average in intelligence. And there are people I tag with “too smart for me to really understand”, although relatively few, and I can still understand the parts of them that, ahem, aren’t particularly intelligent :D.
I picture what the world would look like if I were smarter, such that concepts which took some prodding or prompting for me to get (but I did get them) had just seemed obvious from age 5, the way utilitarianism did for me before I knew other people had thought of it and it had a name. Apparently this is something most people are only introduced to in university? Anyway, picturing what the world would look like if I moved up the intelligence scale, the thoughts that output sound like your posts. Most people are basically cats, if you expect to be treated like an adult you have to be trying to have a counterfactual impact. And my model of you as someone well outside the normal intelligence range predicts that my earlier post saying “counterfactual thinking isn’t something most people get without being taught” would get a response like “yes, exactly, most people are basically cats, and I’ve just downgraded my estimate of your intelligence”. The first part of that is a similar error to “a community college graduate and a top-level physicist are basically the same”.
The more carefully-worded version of “counterfactual thinking isn’t something most people get without being taught” would be something like “counterfactual thinking isn’t something most people do without being taught, except in rare and fairly stereotypical circumstances, like ‘I was just almost in a car accident’ or ‘what would my life be like today if I had stayed with my first love?’”. I mean, yes, they do basic counterfactuals like “if I eat the cake I will get fat; if I don’t eat the cake I won’t get fat” (which, I note, cats do not), but thinking about higher-order effects, like “if I buy the cake, that has this effect on the overall economy, and the world as a whole looks different six months from now in these subtle ways”, is a thought-pattern most people have to be taught—but can be taught.
If your situation is that you can’t differentiate between average-intelligence people, below-average-intelligence people, and cats, because they all just don’t get things that seem obvious to you, and once they don’t get one obvious thing you worry about what other obvious things they will or won’t get and they just become unpredictable beings you don’t understand very well… then probably my encouragement to treat more people less like cats isn’t going to work for you.
Anyway, picturing what the world would look like if I moved up the intelligence scale, the thoughts that output sound like your posts. Most people are basically cats, if you expect to be treated like an adult you have to be trying to have a counterfactual impact.
This tangentially reminded me of this quote about John von Neumann by Edward Teller, himself a bright chap (father of the hydrogen bomb and all that):
von Neumann would carry on a conversation with my 3-year-old son, and the two of them would talk as equals, and I sometimes wondered if he used the same principle when he talked to the rest of us.
That said, in John Wentworth’s case, moral agency/ambition/tsuyoku naritai seems more key than intelligence; cf. what he said earlier:
What made it hurt wasn’t that they were stupid; this was a college where the median student got a perfect score on their math SATs, they were plenty smart. They just… hadn’t put in the effort. … The disappointment came from seeing what they could have been, and seeing that they didn’t even try for it. …
I think a core factor here is something like ambition or growth mindset. When I have shortcomings, I view them as shortcomings to be fixed or at least mitigated, not as part of my identity or as a subject for sympathy. On the positive side, I have goals and am constantly growing to better achieve them. Tsuyoku naritai. I see people who lack that attitude, who don’t even really want to grow stronger, and when empathy causes the suspension of disbelief to drop… that’s when I feel disgust or disappointment in my so-called fellow humans. Because if I were in their shoes, I would feel disgust or disappointment in myself.
So I think you’re misdiagnosing.

You could be right, and thanks for the feedback. It’s a low-probability speculation, and that quote is evidence against it.
There’s a difference between disappointment and disgust, and “can only have fun with people when he treats them as non-agents” is very different from how I think about people, and it is in my nature to try and figure out people who think very differently from me. So far I haven’t got a mental model that fits John’s outputs well in their entirety. My mind is still working on it in the background.