Hmm. In all your examples, Albert goes against “goodness” and ends up with less “yumminess” as a result. But my point was about a different kind of situation: some hypothetical Albert goes against “goodness” and actually ends up with more “yumminess”, but someone else ends up with less. What do you think about such situations?
I would ask Albert: do you generally find it yummy when other people get more yumminess? Do you usually feel like shit when you screw over someone else? For most people, the answers to these are “yes”. Most people do not actually like screwing over other people, most of the time (though there are of course exceptions).
Insofar as Albert is a sociopath, or is in one of those moods where he really does want to screw over someone else… I would usually say “Look man, I want you to pursue your best life and fulfill your values, so I wish you luck. But also I’m going to try to stop you, because I want the same for other people too, and I want higher-order nice things like high-trust communities.” One does not argue against the utility function, as the saying goes.
Most people do not actually like screwing over other people
I think this is very culturally dependent. For example, wars of conquest were considered glorious in most places and times, and that’s pretty much the ultimate form of screwing over other people. Or, for another example, the first orphanages were built by early Christians; before that, orphans were usually disposed of. Or recall how common slavery and serfdom have been throughout history.
Basically my view is that human nature without indoctrination into “goodness” is quite nasty by default. Empathy is indeed a feeling we have, and we can feel it deeply (...sometimes). But we ended up with this feeling mainly due to indoctrination into “goodness” over generations. We wouldn’t have nearly as much empathy if that indoctrination hadn’t happened, and it probably wouldn’t stay long term if that indoctrination went away.
I do want to say that stuff is a true part of one’s values once it triggers the feelings of yumminess/yearning/etc, regardless of whether memes were involved in installing the values along the way. I want to distinguish that from the case where people “tie themselves in knots”, trying to act like they value something or telling themselves that they value something when the feelings are not in fact there, because they’ve been told they “should” value the thing.
So yeah, some of our actual values are installed culturally/memetically, and that doesn’t automatically make them bad or fake values. I’m on board with that, so long as the underlying feelings of yumminess/yearning/etc actually show up.
We can throw out the other junk of the memetic egregore Goodness without abandoning the stuff people actually feel good about.
But why do you think that people’s feelings of “yumminess” track the reality of whether an action is cooperate/cooperate? I’ve explained that it hasn’t been true throughout most of history: people have been able to feel “yummy” about very defecting actions. Maybe today the two coincide unusually well, but then that demands an explanation.
I think it’s just not true. There are too many ways to defect and end up better off, and people are too good at rationalizing why it’s ok for them specifically to take one of those ways. That’s why we need an evolving mechanism of social indoctrination, “goodness”, to make people choose the cooperative action even when it doesn’t feel “yummy” to them in the moment.
But why do you think that people’s feelings of “yumminess” track the reality of whether an action is cooperate/cooperate?
I don’t think that’s the right question here?
Let me turn it around: you say “That’s why we need an evolving mechanism of social indoctrination, “goodness”, to make people choose the cooperative action even when it doesn’t feel “yummy” to them in the moment.”. But, like, the memetic egregore “Goodness” clearly does not track that in a robust generalizable way, any more than people’s feelings of yumminess do. The egregore is under lots of different selection pressures besides just “get people to not defect”, and the egregore has indoctrinated people in different things over time. So why are you attached to the whole egregore, rather than wanting to jettison the bulk of the egregore and focus directly on getting people to not defect? Why do you think that the memetic egregore Goodness tracks the reality of whether an action is cooperate/cooperate?
But, like, the memetic egregore “Goodness” clearly does not track that in a robust generalizable way, any more than people’s feelings of yumminess do.
I feel you’re overstating the “any more” part, or at least it doesn’t match my experience. My feelings of “goodness” often track what would be good for other people, while my feelings of “yumminess” mostly track what would be good for me. Though of course there are exceptions to both.
So why are you attached to the whole egregore, rather than wanting to jettison the bulk of the egregore and focus directly on getting people to not defect?
This can be understood two ways. 1) A moral argument: “We shouldn’t have so much extra stuff in the morality we’re blasting in everyone’s ears; it should focus more on the golden rule / unselfishness.” That’s fine, everyone can propose changes to morality, go for it. 2) “Everyone should stop listening to morality radio and follow their feelings instead.” Ok, but if nobody listens to the radio, by what mechanism do you get other people to not defect? Plenty of people are happy to defect when going by their feelings; I think I’ve shown that sufficiently. Do you use police? Money? The radio was pretty useful for that, actually, so I’m not with you on this.
Insofar as Albert is a sociopath, or is in one of those moods where he really does want to screw over someone else… I would usually say “Look man, I want you to pursue your best life and fulfill your values, so I wish you luck. But also I’m going to try to stop you, because I want the same for other people too, and I want higher-order nice things like high-trust communities.” One does not argue against the utility function, as the saying goes.
This seems incoherent to me? I’d like it if all the sociopaths were duped by society into not pursuing their values; that’s great for my values, and because they’re evil I’d rather they not pursue their best life. However, I still support distinguishing between goodness and human values, for the same general-purpose reasons why, even if it’s possible in principle to use some piece of information for evil, it’s still often better to spread & talk about that information than not.
More generally, I think people are too quick to reach for the phrase “One does not argue against the utility function, as the saying goes.” Yes, you can’t argue against the utility function, but if someone has a bad utility function and is unaware of what that utility function is, I’m not going to enlighten them about it (unless I think that, once aware, they’d be happy to cooperate with me on furthering both our goals, but sociopaths are not known for such behavior). That’s part of stopping them.
I’m quite confident my preferences are coherent here, it’s one of the parts of my values I’m most familiar with.
There’s both an instrumentalish and a terminalish component. The terminalish component is roughly a really strong preference not to try to mislead people about their own values; that in particular is just incredibly deeply wrong for me to do, according to my own values. The instrumentalish component is… very similar to the thing where people say “well, we need to be a little hyperbolic or misleading, or conceal our true intent, in order to spread our political message successfully”, and then, over and over again, that type of reasoning leads people to metaphorically smack themselves in the face. It’s a massive own goal; it just does not work.
Indeed, you could make a very reasonable argument that the entire reason AI might be dangerous is that once it can automate away the entire economy, for example, defection no longer has any cost and has massive benefits (at least conditional on no alignment in values).
The basic reason you can’t easily defect against social systems and gain massive amounts of utility is a combination of three things: humans can’t reliably evade enforcement, for logistical reasons; people can reliably detect defection in small groups, thanks to reputation/honor systems; and individual humans are far, far less powerful, even selfishly, than humans cooperating.
This of course breaks once AGI/ASI is invented, but John Wentworth’s post doesn’t need to apply to post-AGI/ASI worlds.
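As a toy illustration of the reputation mechanism described above, here is a minimal sketch of an iterated prisoner's dilemma in a small group where defection is publicly visible. Everything in it (the payoff numbers, the strategies, the agent names) is an illustrative assumption of mine, not anything from the thread; it just shows the qualitative claim that a lone defector, once detected, ends up behind the cooperators.

```python
# Toy iterated prisoner's dilemma with public reputation: in a small group
# where defection is reliably detected, an always-defector falls behind.
# Payoffs and strategies are illustrative assumptions.
from itertools import combinations

PAYOFF = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def play(agents, rounds=20):
    """agents: name -> strategy; a strategy maps 'is opponent a known
    defector?' to a move. Returns total score per agent."""
    known_defectors = set()   # public reputation
    score = {name: 0 for name in agents}
    for _ in range(rounds):
        for a, b in combinations(agents, 2):
            move_a = agents[a](b in known_defectors)
            move_b = agents[b](a in known_defectors)
            score[a] += PAYOFF[(move_a, move_b)]
            score[b] += PAYOFF[(move_b, move_a)]
            # Defecting damages your reputation, unless you were merely
            # punishing an already-known defector ("standing" rule).
            if move_a == "D" and b not in known_defectors:
                known_defectors.add(a)
            if move_b == "D" and a not in known_defectors:
                known_defectors.add(b)
    return score

reciprocator = lambda opp_is_defector: "D" if opp_is_defector else "C"
always_defect = lambda opp_is_defector: "D"

scores = play({"ann": reciprocator, "bob": reciprocator,
               "cat": reciprocator, "dave": always_defect})
print(scores)
# dave wins his very first encounters, then gets punished for the rest of
# the game and ends far behind the three cooperators.
```

The "standing" rule in the comment matters: without it, the cooperators who punish the defector would themselves be flagged as defectors, and cooperation would unravel. That fragility is part of why small groups with shared reputation do this so much better than anonymous crowds.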