There is no choice a supermajority of agents would make instrumentally.
For example: I am a happiness maximizer. If I could press a button that would give my future self free energy, this would result in my future self making people happy, so I would press it. If a happiness minimizer were given the same choice — a button that would give its future self free energy — the result would be its future self making people happy, so it would not press the button.
It’s only generally instrumentally useful when it’s future you that deals with it. That’s how things commonly work, but there’s nothing fundamental about it.
You realize that word just means “more than a majority”, right? Usually two-thirds, but it seems like “majority” fits your argument just fine.
I’m also assuming you’re referring to the space of all possible agents, rather than the space of all actual agents, because goodness, there’s LOTS that 90+% of all humans agree on.
Yeah.
My point is that if you flip your utility function, it flips your actions. It just doesn’t seem that way because if you flip both your utility function and future!you’s utility function, then a lot of actions stay the same.
Not necessarily. Suppose there are two switches next to each other: switch A and switch B. If you leave switch A alone, I’ll write it “a”; if you flip it, “A” — and likewise “b”/“B” for switch B. The payoffs look like this:
ab: +1
aB: −1
Ab: +3
AB: −3
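The decision logic in this example can be sketched in a few lines of Python (the payoff dictionary and helper function are my own illustration, not from the thread): an agent with the opposite utility function makes the opposite choice about switch A.

```python
# Payoffs from the example above: lowercase = switch left alone, uppercase = flipped.
payoff = {("a", "b"): 1, ("a", "B"): -1, ("A", "b"): 3, ("A", "B"): -3}

def best_a_choice(utility, b_state):
    """Return the switch-A setting that maximizes the agent's utility,
    holding switch B's state fixed."""
    return max(["a", "A"], key=lambda a_state: utility(payoff[(a_state, b_state)]))

# With the payoffs as utility and B left alone, flipping A is best (+3 > +1).
print(best_a_choice(lambda x: x, "b"))   # "A"
# The agent with the opposite (negated) utility function leaves A alone.
print(best_a_choice(lambda x: -x, "b"))  # "a"
```

Negating the utility function flips the binary choice in every case: for each state of switch B, the maximizer and the minimizer pick opposite settings of switch A.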
I mean, it flips a binary action. If you replace the “you” deciding whether or not to flip switch A with one that has the opposite utility function, its choice flips, so that future-you does less damage when you flip B.
Ah, I see. Interesting distinction to make, but I don’t think it was intended by Roko.