In its fully general form, a utility function is defined over possible outcomes, so if “lots of paper-clips but no staples” makes you unhappy and “lots of staples but no paper-clips” makes you just as unhappy while “lots of paper-clips and lots of staples” makes you very happy, then we can just say you assign 0 utility to the former two and 1 utility to the third.
It may be that some agents have ‘fungible’ utility, where one unit of X has the same value regardless of how much you already have or how much of anything else you have, but these utility functions form a tiny fraction of all possible functions, so if you don’t think your own function is of this type then it probably isn’t.
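A minimal sketch of the paper-clip/staple example, in case it helps to see it written down. The function names are made up for illustration, and the value 0 for the outcome "neither paper-clips nor staples" is an assumption that the comment does not state. The check uses the fact that, on a 2x2 grid of outcomes, a utility function splits into "value of clips plus value of staples" exactly when the cross difference is zero.

```python
def conjunctive_utility(has_clips: bool, has_staples: bool) -> float:
    """Utility from the comment: 1 only when both goods are present.

    The value for (False, False) is an assumption; the comment only
    specifies the other three outcomes.
    """
    return 1.0 if (has_clips and has_staples) else 0.0


def fungible_utility(has_clips: bool, has_staples: bool) -> float:
    """A hypothetical 'fungible' agent: each good adds a fixed amount."""
    return 0.5 * has_clips + 0.5 * has_staples


def is_additive(u) -> bool:
    """True if u(x, y) can be written as f(x) + g(y) on the 2x2 grid."""
    cross = u(True, True) - u(True, False) - u(False, True) + u(False, False)
    return cross == 0.0


print(is_additive(fungible_utility))     # additive by construction
print(is_additive(conjunctive_utility))  # the staples only matter if you have clips
```

Note that the result for the conjunctive agent depends on the assumed value for having nothing at all, which is one more reason to define utility over whole outcomes rather than over individual goods.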
these utility functions form a tiny fraction of all possible functions, so if you don’t think your own function is of this type then it probably isn’t.
I generally agree with your conclusions in this comment, but I don’t think that this is correct reasoning. The fact that a certain type of utility function is a really small fraction of all possible utility functions is not strong evidence that P(your utility function is not of this type | you don’t think it is) is high, because there may be a certain tendency in human utility functions towards a certain type of function, even though functions of that type occupy an infinitesimally small fraction of function-space.
Though note that utility is often fungible in small quantities, since I can always trade one good for another at market value (this fails when the quantities involved are large enough that the market can’t absorb them without affecting price, or if the object in question is difficult to liquidate, e.g. time or knowledge).
Yes, but one has to be very careful. For humans, scope-insensitivity usually occurs at ranges where the goods are still fungible. In the studies that Eliezer presents in that post, the issue is slightly different; here there are so many copies of a good X that adding or removing, say, 1000 of them does not affect the value of a single copy of X.
For instance, there are probably billions of birds in existence; if we would pay $80 to save 2,000 birds when there are 1,000,000,000 of them, then we would probably also pay $80 to save 2,000 birds when there are 999,998,000 of them. Repeating this argument a few times would mean that we should be willing to pay $800 to save 20,000 birds, as opposed to the still $80 reported in the survey.
(For this argument to work entirely, we have to also argue that $800 is a small portion of a person’s total wealth, which is true in most first world countries.)
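The arithmetic behind the bird argument can be sketched as follows. The $80 and the bird counts come from the comment; the function names and the assumption that the per-batch value stays exactly constant near the starting population are mine, purely for illustration.

```python
POPULATION = 1_000_000_000  # birds in existence, figure from the comment
BATCH = 2_000               # birds saved per step
PRICE_PER_BATCH = 80        # dollars someone would pay to save one batch


def batch_value(population_before: int) -> int:
    """Assumed: losing a few thousand birds out of a billion does not
    change what saving the next 2,000 is worth."""
    return PRICE_PER_BATCH


def total_value(birds_saved: int) -> int:
    """Add up the value of each batch, one batch at a time."""
    total = 0
    population = POPULATION
    for _ in range(birds_saved // BATCH):
        total += batch_value(population)
        population -= BATCH  # the next batch is valued in a slightly smaller population
    return total


print(total_value(2_000))   # one batch
print(total_value(20_000))  # ten batches, ten times the price
```

The survey answers stay at roughly $80 for both quantities, which is what makes them look inconsistent with the step-by-step valuation.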
...some agents have ‘fungible’ utility, where one unit of X has the same value regardless of how much you already have or how much of anything else you have...
What about the human desire for positive bodily sensations? Given what we currently know about physics, it should be much more efficient to cause them unconditionally than to realize them as a result of some actual achievement. Humans value such fictitious sensations, see movies or daydreams. So the value of such sensations is non-negligible. If we can create them effectively enough to outweigh the utility we assign to their natural realization, then isn’t it rational to choose to indulge in unconditional satisfaction?
If only one of your values can be realized an unlimited number of times, then it only needs to yield one unit of utility per realization to outweigh all other values, as long as its realization is cost-effective enough. As far as I know, the utility from realizing that one value is no different from the utility you can earn from any of your other values; all that counts is the amount of utility you expect.
I do understand your argument, but I just explained why this need not be the case. My utility function does not have to assign a constant value to pleasant fictitious experiences. It does not need to explicitly assign any value to PFEs, only to outcomes. It may be possible to deduce from these outcomes a single unique value assigned to PFEs, but there’s no reason why this has to be the case.
For instance, maybe my value for PFEs can’t be realized an unlimited number of times because the more PFEs I have and the fewer real experiences I have, the more I value real experiences and the less I value PFEs. Even if watching a movie was the best part of my day, it does not mean I want to spend my whole day watching movies.
Not all functions are linear, or even analytic. Some are just pairs of numbers.
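To make the movie example concrete, here is a rough sketch with a utility function that has diminishing returns in both kinds of experience. The square-root shape, the weights and the 16-hour day are all assumptions chosen for illustration, not anything claimed in the comment.

```python
import math

WAKING_HOURS = 16  # assumed length of a waking day


def day_utility(movie_hours: int) -> float:
    """Utility of a day split between movies (a PFE) and real experiences.

    The first movie hour is worth more than the first real hour (weight
    1.5 vs 1.0), but each extra hour of either kind is worth less than
    the one before it.
    """
    real_hours = WAKING_HOURS - movie_hours
    return 1.5 * math.sqrt(movie_hours) + 1.0 * math.sqrt(real_hours)


# Search every whole-hour split of the day for the best one.
best_movie_hours = max(range(WAKING_HOURS + 1), key=day_utility)

print(best_movie_hours)
print(day_utility(best_movie_hours), day_utility(WAKING_HOURS))
```

Under these assumptions the best day is a mix: movies are the single most valuable activity per hour at the start, yet a day of nothing but movies scores lower than the mixed day.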
I do understand you as well. But I don’t see how some people here seem to be able to make value statements about certain activities, e.g. playing the lottery is stupid. Or it is rational to try to mitigate risks from AI. I am still clueless how this can be justified if utility isn’t objectively grounded, e.g. in units of bodily sensation. If I am able to arbitrarily assign utility to world states then I could as well assign utility to universes where I survive the Singularity without doing anything to mitigate it, enough to outweigh any others. In other words, I can do what I want and be rational as long as I am not epistemically confused about the consequences of my actions.
If that is the case, why are there problems like Pascal’s mugging or infinite ethics? If utility maximization does not lead to focusing on a few values that promise large amounts of utility, then there seem to be no problems. Just because I would save my loved ones doesn’t mean that I want to spend the whole day saving infinitely many people.
In other words, I can do what I want and be rational as long as I am not epistemically confused about the consequences of my actions.
So what.
There are much more important things than being rational, at least to me. The world, for one. If all you really want to do is sit at home all day basking in your own rationality, then there’s little I can do to argue that you aren’t, but I would hope there’s more to you than that (if there isn’t, feel free to tell me and we can end this discussion).
I’m not sure I can honestly say that I place absolutely no terminal value on rationality, but most of the reason I am pursuing it is its supposed usefulness in achieving everything else.
When we say playing the lottery is stupid, we assume that you don’t want to lose money, and when we say mitigating existential risk is rational we assume that you don’t want the world to end. Generally humans aren’t so very different that these assumptions aren’t mostly justified.
Just because I would save my loved ones doesn’t mean that I want to spend the whole day saving infinitely many people.
Some people take this very approach; they call it ‘bounded utility’.
I don’t agree with them because it seems to me like along the dimension of human life my utility function really is linear, or at least I would like it to be, but that’s just me.
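For what it's worth, the difference between the two positions can be sketched numerically. The exponential form of the bounded function, its scale, and the mugger's numbers below are all assumptions picked for illustration; ‘bounded utility’ only requires that the function has some upper limit.

```python
import math


def linear_utility(lives_saved: float) -> float:
    """Every additional life counts the same, without limit."""
    return float(lives_saved)


def bounded_utility(lives_saved: float, scale: float = 1000.0) -> float:
    """One possible bounded shape: rises towards 1 but never exceeds it."""
    return 1.0 - math.exp(-lives_saved / scale)


def expected_utility(u, probability: float, lives_saved: float) -> float:
    """Expected utility of a gamble that saves lives with some probability."""
    return probability * u(lives_saved)


# Hypothetical Pascal's-mugging style offer versus a certain, small good deed.
mugger = (1e-20, 1e30)   # (probability, lives saved)
sure_thing = (1.0, 1.0)

for u in (linear_utility, bounded_utility):
    prefers_mugger = expected_utility(u, *mugger) > expected_utility(u, *sure_thing)
    print(u.__name__, "prefers the mugger's offer:", prefers_mugger)
```

With the linear function the tiny probability is swamped by the huge payoff; with the bounded one the payoff can never exceed 1, so the offer's expected utility can never exceed its tiny probability.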
The general principle I’m trying to get at is to find what you actually want, as opposed to what is convenient, mathematically elegant or philosophically defensible, and make that your utility function. If you do this then expected utility should never lead you astray.
What I am trying to fathom is the difference between 1.) assigning utility arbitrarily (no objective grounding) 2.) grounding utility in units of bodily sensations 3.) grounding utility in units of human well-being (i.e. number of conscious beings whose lives are worth living).
1.) Retains complex values but might be inconsistent (e.g., how do you assign utility to novel goals that can’t be judged in terms of previous goals (e.g., hunter gatherer who learns about category theory)).
2.) You assign most utility to universes that maximize desirable bodily sensations (likely effect: wireheading (reduction of complex values to a narrow set of values (e.g., the well-being of other humans is abandoned because bodily sensations do not grow linearly with the number of beings that are saved))).
3.) Same as #2 except that complex values are now reduced to the well-being of other humans.
As you see, my problem is that to me, as a complete layman, expected utility maximization seems to lead to the reduction of complex values once it is measured in some objectively verifiable physical fact. In other words, as long as utility is dimensionless, it seems to be an inconsistent measure; if you add a dimension, it leads to the destruction of complex values.
The downvoting of the OP seems to suggest that some people seem to suspect that I am not honest, but I am really interested to learn more about this and how I am wrong. I am not trying to claim some insight here but merely ask people for help who understand a lot more about it than me. I am also not selfish, as some people seem to think? I care strongly about other humans and even lower animals.
how do you assign utility to novel goals that can’t be judged in terms of previous goals
Don’t think in terms of choosing what value to assign, think in terms of figuring out what value your utility function already assigns to it (your utility function is a mathematical object that always has existed and always will exist).
So the answer is that you can’t be expected to know yet what value your utility function assigns to goals you haven’t thought of, and this doesn’t matter too much since uncertainty about your utility function can just be treated like any other uncertainty.
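A rough sketch of what treating it "like any other uncertainty" could look like: keep several candidate utility functions, put a credence on each, and average. The hypotheses, actions, credences and utility numbers are all invented for illustration.

```python
# Credence that each candidate is my real utility function (assumed numbers).
credences = {
    "likes_math": 0.4,
    "indifferent_to_math": 0.6,
}

# What each candidate utility function says about each action (assumed numbers).
candidate_utilities = {
    "likes_math": {"study_category_theory": 10.0, "go_hunting": 3.0},
    "indifferent_to_math": {"study_category_theory": 0.0, "go_hunting": 3.0},
}


def expected_utility(action: str) -> float:
    """Average the candidates' verdicts, weighted by credence."""
    return sum(
        credence * candidate_utilities[hypothesis][action]
        for hypothesis, credence in credences.items()
    )


actions = ["study_category_theory", "go_hunting"]
best_action = max(actions, key=expected_utility)

for action in actions:
    print(action, expected_utility(action))
print("best:", best_action)
```

Trying the novel goal also gives evidence about which candidate is right, so in a fuller model the credences would be updated afterwards like any other belief.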
The downvoting of the OP seems to suggest that some people seem to suspect that I am not honest, but I am really interested to learn more about this and how I am wrong.
For the record, I voted the OP up, because it made me think and in particular made me realise my utility function wasn’t additive or even approximately additive, which I had been unsure of before.
Don’t think in terms of choosing what value to assign, think in terms of figuring out what value your utility function already assigns to it...
I don’t think that is possible. Consider the difference between a hunter-gatherer, who cares about his hunting success and about becoming the new clan chief, and a member of LessWrong who wants to determine if a “sufficiently large randomized Conway board could turn out to converge to a barren ‘all off’ state.”
The utility of success in hunting down animals or in proving abstract conjectures about cellular automata is largely determined by factors such as your education, culture and environmental circumstances. The same hunter-gatherer who cared to kill a lot of animals, to get the best ladies in his clan, might under different circumstances have turned out to be a vegetarian mathematician solely caring about his understanding of the nature of reality. Both sets of values are to some extent mutually exclusive or at least disjoint. Yet both sets of values are what the person wants, given the circumstances. Change the circumstances dramatically and you change the person’s values.
You might conclude that what the hunter-gatherer really wants is to solve abstract mathematical problems, he just doesn’t know about that. But there is no set of values that a person really wants. Humans are largely defined by the circumstances they reside in. If you already knew a movie, you wouldn’t watch it. To be able to get your meat from the supermarket changes the value of hunting.
If “we knew more, thought faster, were more the people we wished we were, and had grown up closer together” then we would stop desiring what we had learnt, wish to think even faster, become even more different people, and get bored of and rise up from the people similar to us.
^ What he said.
If utility isn’t fungible for large quantities, does that mean that it is rational to be scope-insensitive?