Utility functions are a really bad match for human preferences, and one of the major premises we accept is wrong.
Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can’t model all that crap with one number.
The study of affective synchrony shows that humans have simultaneously-active positive and negative affect systems. At extreme levels in either system, the other is shut down, but the rest of the time, they can support or oppose each other. (And in positions of opposition, we experience conflict and indecision.)
Meanwhile, the activation of these systems is influenced by current state/context/priming, as well as the envisioned future. So unless your attempt at modeling a utility function includes terms for all these things, you’re sunk.
(Personally, this is where I think the idea of CEV has its biggest challenge: I know of no theoretical reason why humans must have convergent or consistent utility functions as individuals, let alone as a species.)
Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can’t model all that crap with one number.
I don’t really see why not (at least without further argument).
Relativity and contextuality introduce additional arguments into the utility function; they don’t imply that the output can’t be scalar. Lots of people incorporate relative and contextual concerns into scalar utility all the time.
Semi-independent positive and negative axes only prevent you from using scalar utility if you think they’re incommensurable. If you can assign weights to the positive and negative axes, then you can aggregate them into a single utility index. (How accurately you can do this is a separate question.)
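(To make the aggregation move concrete, here is a minimal sketch, with made-up weights and scores, of how separate positive and negative axes collapse into one scalar index once you are willing to put an exchange rate on them. None of these numbers come from anyone in this thread.)

```python
# Toy sketch: collapsing separate positive/negative axes into a single index.
# The options, scores, and weights below are invented purely for illustration.

options = {
    # option: (positive-axis score, negative-axis score)
    "take the new job":   (8.0, 5.0),
    "stay where you are": (4.0, 1.0),
}

W_POS = 1.0  # weight on the positive axis
W_NEG = 0.7  # exchange rate: one unit of "negative" costs 0.7 units of "positive"

def scalar_utility(pos: float, neg: float) -> float:
    """Aggregate the two axes into one number, assuming they are commensurable."""
    return W_POS * pos - W_NEG * neg

for name, (pos, neg) in options.items():
    print(f"{name}: {scalar_utility(pos, neg):+.2f}")
```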
Of course, if you do think there are fundamentally incommensurable values, then scalar utility runs into trouble.* Amartya Sen and others have done interesting work looking at plural/vector utility and how one might go about using it. (I guess if we’re sufficiently bad at aggregating different types of value, such methods might even work better in practice than scalar utility.)
* I’m sceptical; though less sceptical than I used to be. Most claims of incommensurability strike me as stemming from unwillingness to make trade-offs rather than inability to make trade-offs, but maybe there are some things that really are fundamentally incomparable.
I was pretty convinced of commensurability and thought cognitive biases would just introduce noise, but my lack of success, and apparently everyone else’s in this thread, changed my mind quite significantly.
Not knowing how to commensurate things doesn’t imply they’re incommensurable (though obviously, the fact that people have difficulty with this sort of thing is interesting in its own right).
As a (slight) aside, I’m still unclear about what you think would count as “success” here.
It’s not a hard implication, but it’s pretty strong evidence against the existence of traditional utility functions.
A success would be a list of events or states of reality and their weights, such that you’re pretty convinced that your preferences are reasonably consistent with this list, so that you know how many hours of standing in queues losing 5kg is worth, and how much money having one thousand extra readers of your blog is worth.
It doesn’t sound like much, but I completely fail as soon as it goes outside a very narrow domain; I’m surprised by this failure, and I’m surprised that others fail at this too.
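(For concreteness, the kind of artifact being asked for might look something like the sketch below: everything priced in a single common unit, with the “success” test being whether you would actually endorse the trades the table implies. The numbers are placeholders, not anyone’s real weights.)

```python
# Hypothetical personal exchange-rate table, with everything priced in
# "hours of standing in queues". All numbers are placeholders.

value_in_queue_hours = {
    "lose 5 kg":               40.0,
    "1000 extra blog readers": 15.0,
    "earn $100":                5.0,
}

def implied_rate(a: str, b: str) -> float:
    """How many units of b one unit of a is worth, according to the table."""
    return value_in_queue_hours[a] / value_in_queue_hours[b]

# The consistency check is introspective: do you actually accept these trades?
print(f"losing 5 kg ~ {value_in_queue_hours['lose 5 kg']:.0f} hours of queueing")
print(f"losing 5 kg ~ ${100 * implied_rate('lose 5 kg', 'earn $100'):.0f}")
print(f"1000 readers ~ ${100 * implied_rate('1000 extra blog readers', 'earn $100'):.0f}")
```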
I’m surprised at your surprise. Even granting that humans could possibly be innately reflectively self-consistent, there’s a huge curse-of-dimensionality problem in specifying the damn thing. ETA: The problem with the dimensionality is that interactions between the dimensions abound; ceteris paribus assumptions can’t get you very far at all.

I was expecting noise, and maybe a few iterations before reaching satisfying results, but it seems we cannot even get that much, and it surprises me.
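(A back-of-the-envelope illustration of the dimensionality worry above, with invented numbers: if the dimensions of a situation were separable, you could score each level of each dimension independently, but once they interact you need a value for every combination.)

```python
# Rough sketch of why interactions blow up the specification problem.
# Both numbers below are invented; real situations have far more dimensions.

dimensions = 10           # e.g. health, money, status, hunger, company, deadline, ...
levels_per_dimension = 5  # a coarse discretisation of each dimension

separable_entries = dimensions * levels_per_dimension      # ceteris paribus world
interacting_entries = levels_per_dimension ** dimensions   # everything interacts

print(f"entries to specify if dimensions are separable: {separable_entries}")
print(f"entries to specify if they all interact:        {interacting_entries:,}")
# 50 versus 9,765,625
```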
Most claims of incommensurability strike me as stemming from unwillingness to make trade-offs rather than inability to make trade-offs, but maybe there are some things that really are fundamentally incomparable.
My point was that even if you can make a tradeoff, you’re likely to have at least some disutility for Omega making that tradeoff for you, rather than letting you make the tradeoff yourself.
My own personal observation, though, is that people don’t usually make good tradeoffs by weighing and combining the utility and disutility of each of their options; they’re happier (and their lives are generally more functional) when they work to maximize utility and then satisfice disutility, in that order.
Our hardware doesn’t do well at cross-comparison, but it can handle, “Which of these do I like best?”, followed by “What am I willing to trade off to get what I like best?” (It can also handle the reverse, but that road leads to a dysfunctional and ever-shrinking “comfort zone”.)
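(A loose sketch of that two-pass procedure as I read it: rank by attraction first, then take the most attractive option whose worst case you can live with, rather than netting the two against each other. The options, scores, and threshold are invented.)

```python
# Two-pass sketch: maximize on the positive axis first, then satisfice on the downside.
# The options, scores, and threshold below are invented for illustration.

options = {
    # option: (how much I like it, worst-case downside)
    "start a business":  (9.0, 7.0),
    "take the safe job": (6.0, 2.0),
    "do nothing":        (2.0, 1.0),
}

ACCEPTABLE_DOWNSIDE = 8.0  # satisficing threshold, not a weight to trade off against

def choose(opts, max_downside):
    # Pass 1: rank purely by attraction ("which of these do I like best?").
    ranked = sorted(opts, key=lambda name: opts[name][0], reverse=True)
    # Pass 2: take the best-liked option whose worst case is still acceptable
    # ("what am I willing to trade off to get what I like best?").
    for name in ranked:
        if opts[name][1] <= max_downside:
            return name
    return None  # nothing clears the bar

print(choose(options, ACCEPTABLE_DOWNSIDE))  # -> start a business
```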
I assume that this is because the two affect systems were intended for approach and avoidance of predators, prey, and mates, rather than making rational tradeoffs between a wide array of future options.
Each system is quite capable of ranking threats or opportunities within its own value system, but there doesn’t seem to be a register or readout in the system that can hold a “pleasure minus pain” value. What appears to happen instead is that the conscious mind can decide to switch off the negative input, if there’s an inner consensus that the worst-case downside is still manageable.
This mechanism, however, appears to operate on only one goal at a time; it doesn’t seem to work to try to cram all your options into it at once.
In the aggregate, these mechanisms would be really difficult to model, since the disutility/worst-case scenario check often depends on the examination of more than one possible future and contemplating possible mitigations, risks, etc.
I guess what I’m trying to say is that not only are goal and context important, there’s also a cross-time or over-time input component as well. I don’t really see anything that allows a person’s preferences to be absolute, because the “tradeoff” part is something that can happen consciously or unconsciously, and is very sensitive to the steps undertaken to make the tradeoff. But despite this sensitivity, the emotional affect of having made the choice is the same—we defend it, because we own it.
In contrast, a rational weighing of ratios and scores can easily produce a different felt-sensation about the decision: one of not really having decided at all!
If a person “decides” based only on the objective/numerical criteria (even if this includes scoring and weighing his or her emotional responses!), this ownership/territory mechanism does not kick in, with resulting negative consequences for that person’s persistence and commitment.
For example, if you “decide” to go on a diet because you’re 20 pounds overweight, you may stop eating healthily (or at least cease to do so consistently) as you approach your desired weight.
Now, that’s not to say you can’t weigh all the objective information, and then make a decision that’s not conditional upon those facts, or is conditional upon them only at the point in time you originally received them. I’m just saying that if you simply weigh up the facts and “let the facts decide”, you are begging for an akrasia problem.
This is why, btw, effective decision makers and successful people tend to talk about “listening to all the input first, and then making their own decision”.
It’s because they need to make it theirs—and there’s no way for the math to do that, because the calculation has to run on the right brain hardware first. And the mathematical part of your brain ain’t it.
OK, there’s a lot of food for thought in there, and I can’t possibly hope to clarify everything I’d ideally like to, but what I think you’re saying is:
1. it’s theoretically possible to think about utility as a single number; but
2. it’s nonetheless a bad idea to do so, because (a) we’re not very good at it, and (b) thinking about things mathematically means we won’t “own” the decision, and therefore leads to akrasia problems
(FWIW, I was only claiming 1.) I’m fairly sympathetic to 2(a), although I would have thought we could get better at it with the right training. I can see how 2(b) could be a problem, but I guess I’m not really sure (i) that akrasia is always an issue, and (ii) why (assuming we could overcome 2(a)) we couldn’t decide mathematically, and then figure out how to “own” the decision afterwards. (This seems to have worked for me, at least; and stopping to do the math has sometimes stopped me “owning” the wrong decision, which can be worse than half-heartedly following through on the right one.)
P.S.
My point was that even if you can make a tradeoff, you’re likely to have at least some disutility for Omega making that tradeoff for you, rather than letting you make the tradeoff yourself.
I didn’t think anyone was suggesting Omega should make the trade-off. I certainly wasn’t.
(FWIW, I was only claiming 1.) I’m fairly sympathetic to 2(a), although I would have thought we could get better at it with the right training. I can see how 2(b) could be a problem, but I guess I’m not really sure (i) that akrasia is always an issue, and (ii) why (assuming we could overcome 2(a)) we couldn’t decide mathematically, and then figure out how to “own” the decision afterwards.
To own it, you’d need to not mathematically decide; the math could only ever be a factor in your decision. There’s an enormous gap between, “the math says do this, so I guess I’ll do that”, and “after considering the math, I have decided to do this.” The felt-experience of those two things is very different, and it’s not merely an issue of using different words.
Regarding getting better at making decisions off of mathematics, I think perhaps you miss my point. For humans, the process by which decision-making is done has consequences for how it’s implemented, and for the person’s experience and satisfaction regarding the decision itself. See more below...
(This seems to have worked for me, at least; and stopping to do the math has sometimes stopped me “owning” the wrong decision, which can be worse than half-heartedly following through on the right one.)
I’d like to see an actual, non-contrived example of that. Mostly, my experience is that people are generally better off with a 50% plan executed 100% than a 100% plan executed 50%. It’s a bit of a cliche—one that I also used to be skeptical/cynical about—but it’s a cliche because it’s true. (Note also that in the absence of catastrophic failure, the primary downside of a bad plan is that you learn something, and you still usually make some progress towards your goals.)
It’s one of those places where in theory there’s no difference between theory and practice, but in practice there is. We just think differently when we’re considering something from when we’re committed to it—our brains just highlight different perceptions and memories for our attention, so much so that it seems like all sorts of fortunate coincidences are coming our way.
Our conscious thought process in System 2 is unchanged, but something on the System 1 level operates differently with respect to a decision that’s passed through the full process.
I used to be skeptical about this, before I grasped the system 1/system 2 distinction (which I used to call the “you” (S2) vs. “yourself” (S1) distinction). I assumed that I could make a better plan before deciding to do something or taking any action, and refused to believe otherwise. Now I try to plan just enough to get S1 buy-in, and start taking action so I can get feedback sooner.
the math could only ever be a factor in your decision.
Sure. I don’t think this is inconsistent with what I was suggesting, which was really just that the math could start the process off.
For humans, the process by which decision-making is done has consequences for how it’s implemented, and for the person’s experience and satisfaction regarding the decision itself.
All of which I agree with; but again, I don’t see how this rules out learning to use math better.
Mostly, my experience is that people are generally better off with a 50% plan executed 100% than a 100% plan executed 50%.
Fair enough. The examples I’m thinking of typically involve “owned” decisions that are more accurately characterised as 0% plans (i.e. do nothing) or -X% plans (i.e. do things that are actively counterproductive).
Now I try to plan just enough to get S1 buy-in, and start taking action so I can get feedback sooner.
How do you decide what to get S1 to buy in to?
What do you do in situations where feedback comes too late (long-term investments with distant payoffs) or never (e.g. ethical decisions where the world will never let you know whether you’re right or not)?
P.S. Yes, I’m avoiding the concrete example request. I actually have a few, but they’d take longer to write up than I have time available at the moment, and involve things I’m not sure I’m entirely comfortable sharing.
I already explained: you select options by comparing their positive traits. The devil is in the details, of course, but as you might imagine I do entire training CDs on this stuff. I’ve also written a few blog articles about this in the past.
What do you do in situations where feedback comes too late (long-term investments with distant payoffs) or never (e.g. ethical decisions where the world will never let you know whether you’re right or not)?
I don’t understand the question. If you’re asking how I’d know whether I made the best possible decision, I wouldn’t. Maximizers do very badly at long-term happiness, so I’ve taught myself to be a satisficer. I assume that the decision to invest something for the long term is better than investing nothing, and that regarding an ethical decision I will know by the consequences and my regrets or lack thereof whether I’ve done the “right thing”… and I probably won’t have to wait very long for that feedback.
..I’m not really sure… why [] we couldn’t decide mathematically, and then figure out how to “own” the decision afterwards.
There’s an enormous gap between, “the math says do this, so I guess I’ll do that”, and “after considering the math, I have decided to do this.” The felt-experience of those two things is very different, and it’s not merely an issue of using different words.
One can imagine a person who has committed emotionally to the maxim “shut up and multiply (when at all possible)” and made it an integral part of their identity. For such an individual, the commitment precedes the act of doing the math, and the enormous gap referred to above does not exist.
For such an individual, the commitment precedes the act of doing the math, and the enormous gap referred to above does not exist.
If such an individual existed, they would still have the same problem of shifting decisions, unless they also included a commitment to not recalculate before a certain point.
Consider, e.g. Newcomb’s problem. If you do the calculation before, you should one-box. But doing the calculation at the actual time, means you should two-box.
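(As a rough illustration of the two calculations being contrasted here, using the standard $1,000 / $1,000,000 payoffs and an assumed 99%-accurate predictor; this just shows why the two framings point in opposite directions, not which one is right.)

```python
# Newcomb's problem, run two ways. Box A always holds $1,000; Box B holds
# $1,000,000 iff the predictor foresaw one-boxing. Predictor accuracy assumed 0.99.

ACCURACY = 0.99

# "Before": your choice (or disposition) is treated as correlated with the prediction.
ev_one_box = ACCURACY * 1_000_000
ev_two_box = (1 - ACCURACY) * 1_000_000 + 1_000
print(f"ex ante: one-box EV ${ev_one_box:,.0f} vs two-box EV ${ev_two_box:,.0f}")

# "At the actual time": the boxes are already filled, so for either possible
# content of Box B, taking both boxes pays exactly $1,000 more.
for box_b in (0, 1_000_000):
    print(f"if Box B holds ${box_b:,}: one-box ${box_b:,}, two-box ${box_b + 1_000:,}")
```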
So, to stick to their commitments, human beings need to precommit to not revisiting the math, which is a big part of my point here.
Your hypothetical committed-to-the-math person is not committed to their “decisions”, they are committed to doing what the math says to do. This algorithm will not produce the same results as actual commitment will, when run on human hardware.
To put it more specifically, this person will not get the perceptual benefits of a committed decision for decisions which are not processed through the machinery I described earlier. They will be perceptually tuned to the math, not the situation, for example, and will not have the same level of motivation, due to a lack of personal stake in their decision.
In theory there’s no difference between theory and practice, but in practice there is. This is because System 2 is very bad at intuitively predicting System 1′s behavior, as we don’t have a built-in reflective model of our own decision-making and motivation machinery. Thus, we don’t know (and can’t tell) how bad our theories are without comparing decision-making strategies across different people.
Consider, e.g. Newcomb’s problem. If you do the calculation before, you should one-box. But doing the calculation at the actual time, means you should two-box.
This is incorrect. You are doing something very wrong if changing the time when you perform a calculation changes the result. That’s an important issue in decision theory: being reflectively consistent.
This is incorrect. You are doing something very wrong if changing the time when you perform a calculation changes the result. That’s an important issue in decision theory: being reflectively consistent.
That’s the major point I’m making: that humans are NOT reflectively consistent without precommitment… and that the precommitment in question must be concretely specified, with the degree of concreteness and specificity required being proportional to the degree of “temptation” involved.
That may usually be the case, but this is not a law. Certain people could conceivably precommit to being reflectively consistent, to follow the results of calculations whenever the calculations are available.
Certain people could conceivably precommit to being reflectively consistent, to follow the results of calculations whenever the calculations are available.
Of course they could. And they would not get as good results from either an experiential or practical perspective as the person who explicitly committed to actual, concrete results, for the reasons previously explained.
The brain makes happen what you decide to have happen, at the level of abstraction you specify. If you decide in the abstract to be a good person, you will only be a good person in the abstract.
In the same way, if you “precommit to reflective consistency”, then reflective consistency is all that you will get.
It is more useful to commit to obtaining specific, concrete, desired results, since you will then obtain specific, concrete assistance from your brain for achieving those results, rather than merely abstract, general assistance.
Edit to add: In particular, note that a precommitment to reflective consistency does not rule out the possibility of one’s exercising selective attention and rationalization as to which calculations to perform or observe. This sort of “commit to being a certain kind of person” thing tends to produce hypocrisy in practice, when used in the abstract. So much so, in fact, that it seems to be an “intentionally” evolved mechanism for self-deception and hypocrisy. (Which is why I consider it a particularly heinous form of error to try to use it to escape the need for concrete commitments—the only thing I know of that saves one from hypocrisy!)

I can’t understand you.
A person who decides to be “a good person” will selectively perceive those acts that make them a “good person”, and largely fail to perceive those that do not, regardless of the proportions of these events, or whether these events are actually good in their effects. They will also be more likely to perceive as good anything that they already want to do or that benefits them, and they will find ways to consider it a higher good to refrain from doing anything they’d rather not do in the first place.
Similarly, a person who decides to be “reflectively consistent” will not only selectively perceive their acts of reflective consistency; they will also fail to observe the lopsided way in which they apply the concept, and they will not notice how their “reflective consistency” is not, in itself, achieving any other results or benefits for themselves or others.
Brains operate on the level of abstraction you give them, so the more abstract the goal, the less connected to reality the results will be, and the more wiggle room there will be for motivated reasoning and selective perception.
So in theory you can precommit to reflective consistency, but in practice you will only get an illusion of reflective consistency.
(Edit to add: If you’re still confused by this, it’s probably because you’re thinking about thinking, and I’m talking about actual behavior.)
I can’t speak for Vladimir, but from my perspective, this is much clearer now. Thanks!
(ETA: FWIW, while most of your comments on this post leave me with a sense that you have useful information to share, I’ve also found them somewhat frustrating, in that I really struggle to figure out exactly what it is. I don’t know if this is your writing style, my slow-wittedness, or just the fact that there’s a lot of inferential distance between us; but I just thought it might be useful for you to know.)
FWIW, while most of your comments on this post leave me with a sense that you have useful information to share, I’ve also found them somewhat frustrating, in that I really struggle to figure out exactly what it is.
Since I’m trying to rapidly summarize a segment of what Robert Fritz took a couple of books to get across to me (“The Path of Least Resistance” and “Creating”), inferential distance is likely a factor.
It’s mostly his model of decision-making and commitment that I’m describing, with a few added twists of mine regarding the ranking bit, and the “worst that could happen” part, as well as links from it to the System 1/2 model. (And of course I’ve been talking about Fritz’s idea of the ideal-belief-reality conflict in other threads, and that relates here as well.)
Basically, our conversation went like this:
You: People can’t be reflectively consistent.
Me: Yes they can, sometimes.
You: Of course they can.
Me: I’m confused.
You: Of course people can be reflectively consistent. But only in the dreamland. If you are still confused, it’s probably because you are still thinking about the dreamland, while I’m talking about reality.
I think pjeby’s point was that reflective consistency is a way of thinking—so if you commit to thinking in a reflectively consistent way, you will think in that way when you think, but you may still wind up not acting according to that kind of thoughts every time you would want to, because you’re not entirely likely to notice that you need to think them in the first place.
Reflective consistency is not about a way of thinking. Decision theory, considered in the simplest case, talks about properties of actions, including future actions, while ignoring properties of the algorithm generating the actions.
Basically, our conversation went like this:
You: People can’t be reflectively consistent.
Me: Yes they can, sometimes.
You: Of course they can.
Me: I’m confused.
No, it went like this:
Me: People can't be reflectively consistent
You: But they can precommit to be
Me: But that won't *actually make them so*
You: But they could precommit to acting as if they were
Me: Of course they can, but it still won't actually make them so.
See also Abraham Lincoln’s, “If you call a tail a leg, how many legs does a dog have? Four, because calling a tail a leg doesn’t make it so.”
See also Abraham Lincoln’s, “If you call a tail a leg, how many legs does a dog have? Four, because calling a tail a leg doesn’t make it so.”
This is a diversion, but this has always struck me as a stupid answer to an even stupider question. I don’t really understand why people think it’s supposed to reveal some deep wisdom.
This is a diversion, but this has always struck me as a stupid answer to an even stupider question. I don’t really understand why people think it’s supposed to reveal some deep wisdom.
That’s Zen for you. ;-)
Seriously, the point (for me, anyhow) is that System 2 thinking routinely tries to call a tail a leg, and I think there’s a strong argument to be made that it’s an important part of what system 2 reasoning “evolved for”.
Huh? Reflective consistency is a property of behavior. If you behave as if you are reflectively consistent, you are.
And I am saying that a single precommitment to behaving in a reflectively consistent way, will not result in you actually behaving in the same way as you would if you individually committed to all of the specific decisions recommended by your abstract decision theory. Your perceptions and motivation will differ, and therefore your actual actions will differ.
People try to precommit in this fashion all the time, by adopting time management or organizational systems that purport to provide them with a consistent decision theory over some subdomain of decisions. They hope to then simply commit to that system, and thereby somehow escape the need for making (and committing to) the individual decisions. This doesn’t usually work very well, for reasons that have nothing to do with which decision theory they are attempting to adopt.
In my original comment, I specified that I only consider the situations “where the calculations are available”, that is you know (theoretically!) exactly what to do to be reflectively consistent in such situations and don’t need to achieve great artistic feats to pull that off.
You need to qualify what you are asserting, otherwise everything looks gray.
I’m asserting that people don’t actually do what they “decide” to do on the abstract level of System 2, unless certain System 1 processes are engaged with respect to the concrete, “near” aspects of the situation where the behavior is to be executed, and that merely precommitting to follow a certain decision theory is not a substitute for the actual, concrete, System 1 commitment processes involved.
Now, could you commit to following a certain behavior under certain circumstances that included the steps needed to also obtain System 1 commitment for the decision?
That I do not know. I think maybe you could. It would depend, I think, on how concretely you could define the circumstances when these steps would be taken… and doing that in a way that was both concrete and comprehensive would likely be difficult, which is why I’m not so sure about its feasibility.
Your model of human behavior doesn’t look in the least realistic to me, with its prohibition of reason, and requirements for difficult rituals of baptising reason into action.
Your model of human behavior doesn’t look in the least realistic to me, with its prohibition of reason, and requirements for difficult rituals of baptising reason into action.
Well, I suppose all the experiments that have been done on construal theory, and how concrete vs. abstract construal affects action and procrastination must be unrealistic, too, since that is a major piece of what I’m talking about here.
(If people were generally good at turning their reasoning into action, akrasia wouldn’t be such a hot topic here and in the rest of the world.)
Akrasia happens, but it’s not a universal mode. I object to you implying that akrasia is inevitable.
I never said it was inevitable. I said it happens when there are conflicts, and you haven’t really decided what to do about those conflicts, with enough detail and specificity for System 1 to automatically make the “right” choice in context. If you want different results, it’s up to you to specify them for yourself.
Newcomb’s problem is a bad example to use here, because it depends on which math the person has committed to, e.g., Eliezer claims to have worked out a general analysis that justifies one-boxing...
They will be perceptually tuned to the math, not the situation, for example, and will not have the same level of motivation, due to a lack of personal stake in their decision.
The personal stake I envision is defending their concept of their own identity. “I will do this because that’s the kind of person I am.”
The personal stake I envision is defending their concept of their own identity. “I will do this because that’s the kind of person I am.”
Then their perception will be attuned to what kind of person they are, instead of the result. You can’t cheat your brain—it tunes in on whatever you’ve decided your “territory” is, whatever you “own”. This is not a generalized abstraction, but a concrete one.
You know how, once you buy a car, you start seeing that model everywhere? That’s an example of the principle at work. Notice that it’s not that you start noticing cars in general, you notice cars that look like yours. When you “own” a decision, you notice things specifically connected with that particular decision or goal, not “things that match a mathematical model of decision-making”. The hardware just isn’t built for that.
You also still seem to be ignoring the part where, if your decisions are made solely on the basis of any external data, then your decision is conditional and can change when the circumstances do, which is a bad idea if your real goal or intent is unconditional.
I’ve already mentioned how a conditional decision based on one’s weight leads to stop-and-start dieting, but another good example is when somebody decides to start an exercise program when they’re feeling well and happy, without considering what will happen on the days they’re running late or feeling depressed. The default response in such cases may be to give up the previous decision, since the conditions it was made under “no longer apply”.
What I’m saying is, it doesn’t matter what conditions you base a decision on: if it is based solely on conditions, and not on actually going through the emotional decision process to un-conditionalize it, then you don’t actually have a commitment to the course of action. You just have a conditional decision to engage in that course, until conditions change.
And the practical difference between a commitment and a conditional decision is huge, when it comes to one’s personal and individual goals.
Thank you for this interesting discussion. Although I posed the “emotionally committed to math” case as a specific hypothetical, many of the things you’ve written in response apply more generally, so I’ve got a lot more material to incorporate into my understanding of the pjeby model of cognition. (I know that’s a misnomer, but since you’re my main source for this material, that’s how I think of it.) I’m going to have to go over this exchange more thoroughly after I get some sleep.
Of course, there are presumably situations where one’s decision should change with the conditions. (I do get that there’s a trade-off between retaining the ability to change with the right conditions and opening yourself up to changing with the wrong conditions though.)
Of course, there are presumably situations where one’s decision should change with the conditions. (I do get that there’s a trade-off between retaining the ability to change with the right conditions and opening yourself up to changing with the wrong conditions though.)
The trade-off optimum is usually in making decisions aimed at producing concrete results, while leaving one’s self largely free to determine how to achieve those results. But again, the level of required specificity is determined by the degree of conflict you can expect to arise (temptations and frustrations).
This is similar to one problem Austrian-school economists have with conventional economics. They think the details of transactions are extremely important, and that too much information is lost when they are aggregated into GDP and the like; more information than the weak utility of the aggregates can justify.
Re: Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can’t model all that crap with one number.
That is not a coherent criticism of utilitarianism. Do you understand what it is that you are criticising?
That is not a coherent criticism of utilitarianism. Do you understand what it is that you are criticising?
Yes, I do… and it’s not utilitarianism. ;-)
What I’m criticizing is the built-in System 2 motivation-comprehending model whose function is predicting the actions of others, but which usually fails when applied to self, because it doesn’t model all of the relevant System 1 features.
If you try to build a human-values-friendly AI, or decide what would be of benefit to a person (or people), and you base it on System 2′s model, you will get mistakes, because System 2′s map of System 1 is flawed, in the same way that Newtonian physics is flawed for predicting near-light-speed mechanics: it leaves out important terms.
Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can’t model all that crap with one number.
Of course you can.
It just won’t be a very good model.
What do you think would work better as a simplified model of utility, then? It seems you think that having orthogonal utility and disutility values would be a start.
Personally, this is where I think the idea of CEV has its biggest challenge: I know of no theoretical reason why humans must have convergent or consistent utility functions as individuals, let alone as a species.
It’s been a while since I looked at CEV, but I thought the “coherent” part was meant to account for this. It assumes we have some relatively widespread, fairly unambiguous preferences, which may be easier to see in the light of that tired old example, paperclipping the light cone. If CEV outputs a null utility function, that would seem to imply that human preferences are completely symmetrically distributed, which seems hard to believe.
If CEV outputs a null utility function, that would seem to imply that human preferences are completely symmetrically distributed, which seems hard to believe.
If by “null utility function”, you mean one that says, “don’t DO anything”, then do note that it would not require that we all have balanced preferences, depending on how you do the combination.
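(A toy illustration of that “depending on how you do the combination” point: a rule that treats any sufficiently strong individual objection as a veto can output “do nothing” even when preferences are nowhere near symmetrically distributed. The rule and the numbers are mine, not anything from CEV.)

```python
# One way a combined "do nothing" can fall out of lopsided preferences:
# give strong individual objections veto power instead of netting them out.

preferences = {
    # person: valuation of some proposed intervention (clearly not symmetric)
    "alice": +3.0,
    "bob":   +2.0,
    "carol": +2.5,
    "dave": -10.0,  # one person strongly objects
}

VETO_THRESHOLD = -5.0

def combined_verdict(prefs, veto_threshold=VETO_THRESHOLD):
    """Sum the valuations, but endorse inaction if anyone objects strongly enough."""
    if min(prefs.values()) <= veto_threshold:
        return 0.0  # the combined function says: don't do it
    return sum(prefs.values())

print(combined_verdict(preferences))  # -> 0.0, despite most people favouring action
```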
A global utility function that creates more pleasure for me by creating pain for you would probably not be very useful. Heck, a function that creates pleasure for me by creating pain for me might not be useful. Pain and pleasure are not readily subtractable from each other on real human hardware, and when one is required to subtract them by forces outside one’s individual control, there is an additional disutility incurred.
These things being the case, a truly “Friendly” AI might well decide to limit itself to squashing unfriendly AIs and otherwise refusing to meddle in human affairs.
These things being the case, a truly “Friendly” AI might well decide to limit itself to squashing unfriendly AIs and otherwise refusing to meddle in human affairs.
I wouldn’t be particularly surprised by this outcome.