A novel approach to axiomatic decision theory
In the standard approach to axiomatic Bayesian decision theory, an agent (a decision maker) doesn’t prefer Act #1 to Act #2 because the expected utility of Act #1 exceeds that of Act #2. Instead, the agent states its preferences over a set of risky acts, and if these stated preferences are consistent with a certain set of axioms (e.g. the VNM axioms, or the Savage axioms), it can be proven that the agent’s decisions can be described as if the agent were assigning probabilities and utilities to outcomes and then maximizing expected utility. (Let’s call this the ex post approach.)
Peterson (2004) introduces a different approach, which he calls the ex ante approach. In many ways, this approach is more intuitive. The agent assigns probabilities and utilities directly to outcomes (not acts), and these assignments are used to generate preferences over acts. Using this approach, Peterson claims to have shown that the principle of expected utility maximization can be derived from just four axioms.
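To make the ex ante recipe concrete, here is a minimal sketch in Python. It assumes the agent has already written down a utility for each outcome and a probability for each outcome of each act. The outcome names, the numbers, and the helpers `expected_utility` and `rank_acts` are all hypothetical illustrations, not anything taken from Peterson's papers.

```python
from typing import Dict, List, Tuple

# An act is modeled as a list of (probability, outcome) pairs.
Act = List[Tuple[float, str]]


def expected_utility(act: Act, utility: Dict[str, float]) -> float:
    """Probability-weighted sum of the utilities of an act's outcomes."""
    total_p = sum(p for p, _ in act)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * utility[outcome] for p, outcome in act)


def rank_acts(acts: Dict[str, Act], utility: Dict[str, float]) -> List[str]:
    """Order act names from highest to lowest expected utility."""
    return sorted(
        acts,
        key=lambda name: expected_utility(acts[name], utility),
        reverse=True,
    )


# Hypothetical inputs: utilities are assigned to outcomes, not to acts.
utility = {"good": 10.0, "ok": 4.0, "bad": 0.0}
acts = {
    "risky": [(0.5, "good"), (0.5, "bad")],
    "safe": [(1.0, "ok")],
}

# Preferences over acts are generated from the assignments above.
ranking = rank_acts(acts, utility)
print(ranking)
```

The point of the sketch is only the direction of the computation: the inputs are assignments to outcomes, and the ordering over acts is an output. In the ex post approach the ordering over acts would be the input.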
As Peterson (2009:75,77) explains:
The aim of the axiomatization [in the ex ante approach] is to show that the utility of an act equals the expected utility of its outcomes.
...The axioms… entail that the utility of an act equals the expected utility of its outcomes. Or, put in slightly different words, the act that has the highest utility (is most attractive) will also have the highest expected utility, and vice versa. This appears to be a strong reason for letting the expected utility principle guide one’s choices in decision under risk.
Jensen (2012:428) calls the ex ante approach “controversial,” but I can’t find any actual published rebuttals to Peterson (2004), so maybe Jensen just means that Peterson’s result is “new and not yet percolated to the broad community.”
Peterson (2008) explores the ex ante approach in more detail, under the unfortunate title of “non-Bayesian decision theory.” (No, Peterson doesn’t reject Bayesianism.) Cozic (2011) is a review of Peterson (2008) that may offer the quickest entry point into the subject of ex ante axiomatic decision theory.
Peterson (2009:210) illustrates the controversy nicely:
...even if [this] discussion may appear a bit theoretical… the controversy over [ex post and ex ante approaches] is likely to have important practical implications. For example, a forty-year-old woman seeking advice about whether to, say, divorce her husband, is likely to get very different answers from the [two approaches]. The [ex post approach] will advise the woman to first figure out what her preferences are over a very large set of risky acts, including the one she is thinking about performing, and then just make sure that all preferences are consistent with certain structural requirements. Then, as long as none of the structural requirements is violated, the woman is free to do whatever she likes, no matter what her beliefs and desires actually are… The [ex ante approach] will [instead] advise the woman to first assign numerical utilities and probabilities to her desires and beliefs, and then aggregate them into a decision by applying the principle of maximizing expected utility.
I’m not a decision theory expert, so I’d be very curious to hear what LW’s decision theorists think of the axiomatization in Peterson (2004) — whether it works, and how significant it is.
Off-topic, but I cannot resist sharing Peterson’s story of how he became interested in decision theory:
I spent quite a few hours going through Peterson’s 2008 book (online copy available here) to see if there were any interesting ideas, and found the time largely wasted. (This was my initial intuition, but I thought I’d take a closer look since Luke emailed me directly to ask me to comment.) It would take even more time to write up a good critique, so I’ll just point out the most glaring problem: Peterson’s proposal for how to derive a utility function from one’s subjective uncertainty about one’s own choices, as illustrated in this example:
What if we apply this idea to the choice between $20 and $30?
Peterson tries to solve this problem in section 5.3, but his solution makes no sense. From page 90:
So we end up with u($20)=1/3, u($30)=u(photo)=1, u($40)=3. But this utility function now implies that given a choice between $20 and $30, you’d choose $20 with probability 1⁄4, and $30 with probability 3⁄4, contradicting the initial assumption that you’d choose $30 with certainty. I have no idea how Peterson failed to notice this.
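The arithmetic behind that contradiction can be checked in a few lines. The sketch below assumes the ratio rule that the numbers in this comment imply, namely that the odds of choosing A over B equal u(A)/u(B). That rule is reconstructed from the figures here, so treat it as an assumption and not as a quotation from Peterson. The helper name `choice_probability` is hypothetical.

```python
from fractions import Fraction


def choice_probability(u_a: Fraction, u_b: Fraction) -> Fraction:
    """P(choose A over B) under the assumed rule p / (1 - p) = u(A) / u(B)."""
    return u_a / (u_a + u_b)


# Utilities as stated in the comment above.
u = {
    "$20": Fraction(1, 3),
    "$30": Fraction(1),
    "photo": Fraction(1),
    "$40": Fraction(3),
}

p_30_over_20 = choice_probability(u["$30"], u["$20"])
print(p_30_over_20)      # 3/4
print(1 - p_30_over_20)  # 1/4
```

So the derived utilities give $30 a probability of 3/4 against $20, not the probability of 1 that the example started from.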
I’ve only read your comment, not anything by Peterson, so I’m just asking for clarification on what he claims to do:
In your first quote of him, he claims to derive utilities from a certain kind of subjective probability. But does he also claim to make the converse derivation? That is, does he also claim to derive those same subjective probabilities from utilities as you do in your final paragraph? It’s not clear to me that your first quote of him commits him to doing this.
No, he doesn’t commit to doing this, but taking this defense doesn’t really save his idea. Suppose that instead of thinking I’d take $30 over $20 with probability 1, I think I’d make that choice with probability 0.99. Now u($30)/u($20) has to be 99, but u($30)=u(photo) and u(photo)/u($20)=3 still hold, so we can no longer obtain a consistent utility function using Peterson’s proposal. What sense does it make that we can derive a utility function if the probability of taking $30 over $20 is either 1 or 3⁄4, but not anything else? As far as I can tell, there is no reason to expect that our actual beliefs about hypothetical choices like these are such that Peterson’s proposal can output a utility function from them, and he doesn’t address the issue in his book.
It seems that he should account for the fact that this subjective probability will update. For example, you quoted him as saying
But once I know that u(salmon)/u(tuna) = 2, I know that I will choose salmon over tuna. I therefore no longer assign the prior subjective probabilities that led me to this utility-ratio. I assign a new posterior subjective probability: namely, certainty that I will choose salmon. This new subjective probability can no longer be used to derive the utility-ratio u(salmon)/u(tuna) = 2. I have learned the utility-ratio, but, in doing so, I have destroyed the state of affairs that allowed me to learn it. I might remember how I derived the utility-ratio, but I can no longer re-derive it in the same way. I have, as it were, “burned up” some of my prior subjective uncertainty, so I can’t use it any more.
Now suppose that I am so unfortunate as to forget the value of the utility-ratio u(salmon)/u(tuna). However, I still retain the posterior subjective certainty that I choose salmon over tuna. Now how am I going to get that utility-ratio back? I’m going to have to find some other piece of prior subjective uncertainty to “burn”. For example, I might notice some prior uncertainty about whether I would choose salmon over $5 and about whether I would choose tuna over $5. Then I could proceed as Peterson describes in the photo example.
So maybe Peterson’s proposal can be saved by distinguishing between prior and posterior subjective probabilities for my choices in this way. Prior probabilities would be required to be consistent in the following sense: If
my prior odds of choosing A over B are 1:1, and
my prior odds of choosing B over C are 3:1,
then
my prior odds of choosing A over C have to be 3:1.
Thus, in the photo example, the prior probability of taking $30 over $20 has to be 3⁄4, given the other probabilities. It’s not allowed to be 0.99. But the posterior probability is allowed to be 1, provided that I’ve already “burned up” some piece of prior subjective uncertainty to arrive at that certainty. In this way, perhaps, it makes sense to say that “the probability of taking $30 over $20 is either 1 or 3⁄4, but not anything else”.
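The consistency requirement on prior odds is just multiplication: the odds of A over C must be the product of the odds of A over B and of B over C. A minimal Python sketch of that check follows. The function names are hypothetical, and it assumes prior probabilities strictly between 0 and 1 (a probability of exactly 1 has no finite odds, which fits the idea that certainty only appears as a posterior).

```python
from fractions import Fraction


def odds(p: Fraction) -> Fraction:
    """Convert a probability of choosing X over Y into odds X:Y."""
    return p / (1 - p)


def implied_probability(p_ab: Fraction, p_bc: Fraction) -> Fraction:
    """Probability of choosing A over C forced by chaining the two odds."""
    chained = odds(p_ab) * odds(p_bc)
    return chained / (1 + chained)


def is_consistent(p_ab: Fraction, p_bc: Fraction, p_ac: Fraction) -> bool:
    """True if the three prior probabilities satisfy the chaining rule."""
    return implied_probability(p_ab, p_bc) == p_ac


# Photo example: A = $30, B = photo, C = $20.
p_ab = Fraction(1, 2)  # prior odds 1:1
p_bc = Fraction(3, 4)  # prior odds 3:1

print(implied_probability(p_ab, p_bc))                # 3/4
print(is_consistent(p_ab, p_bc, Fraction(3, 4)))      # True
print(is_consistent(p_ab, p_bc, Fraction(99, 100)))   # False
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding.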
I wrote,
But, on reflection, the possibility of forgetting knowledge is probably a can of worms best left unopened. For, one could ask what would happen if I remembered that I was highly confident that I would choose salmon over tuna, but I forgot that I was absolutely certain about this. It would then be hard to see how to avoid inconsistent utility functions, as you describe.
Perhaps it’s better to suppose that you’ve shown, by some unspecified means, that u(salmon) > u(tuna), but that you did so without computing the exact utility-ratio. Then you become certain that you choose salmon over tuna, but you no longer have the prior subjective uncertainty that you need to compute the ratio u(salmon)/u(tuna) directly. That’s the kind of case where you might be able to find some other piece of prior subjective uncertainty, as I describe in the above paragraph.
Prediction from reading this post before I delved into the paper: the controversy is going to be about psychology, not decision theory. (After delving into the paper: I’m going to go with ‘prediction confirmed.’)
So, he uses six axioms. How do they map onto Howard’s 5 that I used? It looks like his 0 is essentially “you can state the problem,” his 1 and 2 are my choice, his 3 and 4 don’t seem to have mirrors, and his 5 is my equivalence. I find it a little worrisome that three of my axioms (probability, order, and substitution) don’t appear to show up in his, except possibly in 0. They’re clearly present in his analysis, but they feel like things that should be axioms instead of just taken for granted, and it’s not clear to me why he needed to raise 3 and 4 to the level of axioms.
It’s also not clear to me why he puts such emphasis on the independence and “sure-thing” principles. The “sure-thing” principle is widely held to only apply to a certain class of utility functions / “sure things”, and there’s not a good reason to expect people should or do have those utility functions. (Outcomes, properly understood, are the entire future, and so a game in which I flip a coin and you win or lose $100 can be different from a game in which I flip a coin and give you either $0 or $200, because you’re $100 richer in the second game.) Similarly, independence only holds for normatively correct probability functions, and so if you allow normatively incorrect probability functions you have to throw out independence.
(The difference between independence and “sure-thing” is that “sure-thing” applies a transformation to all of the outcomes, which may not be the same transformation to all of the utilities of those outcomes, and independence applies a transformation to all of the probabilities, which will be the same transformation to all of the act utilities for normatively correct probability functions.)
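A small numerical illustration of the coin-flip point, in Python. It assumes a logarithmic utility over total wealth and a starting wealth of $1000. Both are my assumptions, chosen only because log utility is a familiar nonlinear example. Adding the same $100 to every outcome then shifts the utilities by different amounts.

```python
import math


def log_utility(wealth: float) -> float:
    """Assumed utility of total wealth (illustrative choice, not from the thread)."""
    return math.log(wealth)


baseline = 1000.0  # hypothetical starting wealth

# Final wealth for each side of the coin in the two games.
game_a = [baseline - 100, baseline + 100]  # win or lose $100
game_b = [baseline + 0, baseline + 200]    # receive $0 or $200

# Game B is game A with $100 added to every outcome.
# The corresponding change in utility differs from outcome to outcome.
utility_shifts = [log_utility(b) - log_utility(a) for a, b in zip(game_a, game_b)]
print(utility_shifts)
```

With a utility that is linear in money the two shifts would be equal, which is the special case where shifting every outcome by the same amount leaves all comparisons untouched.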
For the controversy, I’ll quote him directly:
The probability complaint is uninteresting. The “how do we measure utilities?” complaint is serious but a little involved to discuss.
Basically, there are three branches of rationality: epistemic, instrumental, and terminal (I’m not sure I like “terminal rationality” as a name; please suggest alternatives): thinking about uncertainties and probabilities, thinking about actions, and thinking about outcomes.
Peterson’s root complaint is that traditional decision theory is silent on the valuable parts of terminal rationality: it only ensures that your elicited preferences are consistent and then uses them. If they’re consistent but insane, the expected utility maximization won’t throw up any error flags (for example, Dantzig’s diet optimization, which prescribed 500 gallons of vinegar a day), because checking for sanity is not its job.
But pointing that complaint at decision theory seems mistaken, because it’s a question of how you build the utility function. The traditional approach he describes in lukeprog’s post above (the woman considering divorce) uses casuistry, expecting that people can elicit their preferences well about individual cases and then extrapolate. I think the approach he prefers is deconstruction- isolate the different desires that are relevant to the outcomes in question, construct (potentially nonlinear) tradeoffs between them, then order the outcomes, then figure out the optimal action. The first checks primarily for internal consistency; the second checks primarily for reflective equilibrium. But both can do each, and they can be used as complements instead of substitutes.
TL;DR: There does not appear to be meat to the controversy over axioms, and if there is then Peterson’s axioms strike me as worse than Howard’s and possibly worse than vNM or Savage. There is meat to the controversy over discovering utility functions, but I don’t think the 2004 paper you link is a valuable addition to that controversy. Compare it to the chapter on choice under certainty from Thinking and Deciding.
I’m confused by several parts of your reply. I’ll select just two of them for discussion. Perhaps you’ll have the motivation to try to un-confuse me.
I don’t understand your points about the axioms he uses. He uses the axioms required to derive the useful results he aimed for, given his approach to formalizing decision problems, and no more than that. Do you reject the plausibility of one of his axioms, or do you disagree that his results follow from his axioms?
I don’t think Peterson denies the usefulness of traditional axiomatic decision theory for checking the consistency of one’s preferences; he’s just saying that it would also be nice to have a decision theory that can tell you what you should choose given what you believe and what you value. Indeed, this is what many/most people actually do when trying to make decisions “rationally,” but this norm wasn’t justified with an axiomatic approach until Peterson (as far as I can tell).
Can you give me two examples of useful results he derives from the axioms? That’ll help me target my response. (I should note that the commentary in the grandparent is targeted at the 2004 paper in the context of the other things you’ve quoted on this page; if there’s relevant material in one of the other links I probably missed it.)
Agreed. In this comment I want to differentiate between “decision theory” and a component of it, “expected utility theory” (I didn’t differentiate between them in the grandparent). The first studies how to make decisions, and the second studies a particular mathematical technique to isolate the highest scoring of a set of alternative actions. My claim is that expected utility theory is and should be silent on the design of human-appropriate utility functions, but that decision theory should include a component focused on the design of human-appropriate utility functions. That component will be primarily researched by psychologists- what makes humans happy, what do humans want, how do we align those, what common mistakes do humans make, what intuitions do humans have and when are those useful, and so on.
Peterson’s axioms look to me like trying to shoehorn human-appropriate utility functions into expected utility theory, which doesn’t seem to augment the math of calculating expected utilities or augment the actual design of human-appropriate utility functions. As far as I can tell, that field is too young to profit from an axiomatic approach.
But I said “profit” from axioms and you said “justified” with axioms, and those are different things. It’s not clear to me that Peterson’s axioms are useful at justifying the use of expected utility theory, and my hesitance hinges on the phrase “given what you believe and what you value” from the parent. That means that Peterson’s decision theory takes your beliefs and values as inputs and outputs decisions, which is exactly what traditional decision theory does, and so they look the same to me (and if they’re different, I think it’s because Peterson made his worse, not better). The underlying problem as I see it is that beliefs and values are not given; they have to be extracted, and traditional decision theory underestimated the difficulty of that extraction.
(Side note: decision theory underestimating the difficulty and decision theorists underestimating the difficulty are very different things. Indeed, it’s likely that decision theorists realized the problem was very hard, and so left it to the reader so they wouldn’t have to do it!)
Then the question is how much Peterson 2004 helps its readers extract their beliefs and values. As far as I can tell, there’s very little normative or prescriptive content.
What do you mean by “the design of human-appropriate utility functions”?
Actually, let me show you a section of Peterson (2009), which is an updated and (I think) clearer presentation of his axiomatic ex ante approach. It is a bit informal, but is mercifully succinct. (The longer, formal presentation is in Peterson 2008). Here is a PDF I made of the relevant section of Peterson (2009). It’s a bit blurry, but it’s readable.
A utility function that accurately reflects the beliefs and values of the human it’s designed for. Someone looking for guidance would get assistance in discovering what their beliefs and values about the situation are, rather than just math help and a consistency check. Similarly, someone could accidentally write a utility function that drowns them in vinegar, and it would be nice if the decision-making apparatus noticed and didn’t.
That’s my interpretation of “he’s just saying that it would also be nice to have a decision theory that can tell you what you should choose given what you believe and what you value.”
This looks like it boils down to “the utility of an act is the weighted sum of the utility of its consequences.” It’s not clear to me what good formulating it like that does, and I don’t like that axiom 4 from the 2009 version looks circular. (You’re allowed to adjust the utility of different equiprobable outcomes so long as the total utility of the act is preserved. But, uh, aren’t we trying to prove that we can calculate the utility of an act with multiple possible outcome utilities, and haven’t we only assumed that it works for acts with only one possible outcome utility?)
Was Thm 4.1 an example of a useful result?
Suggestions for replacing “terminal”:
“outcome”; “ultimate” (or “ultima” if you prefer Latin); “intrinsic” (to draw a contrast with “instrumental”); “telikos” (transliteration of the Greek for final, ultimate, terminal, or last); “endpoint”; “consequence”; “effect”; “impact”; “goal.”
Do any of those sound better to you?
Edit—slight change owing to formatting.
I like “goal”, and think I like “value” even more. Value rationality?
I’m not sure about “value” in this context. The term “value” could attach to either acts or outcomes, I think. So, if the goal is to distinguish rationality that cares about acts first from rationality that cares about outcomes first, then “value” doesn’t seem to do a very good job. Does that sound right to you, or am I missing something about the distinction that you want to draw?
That’s convinced me that “goal” is clearer than “value.”
Peterson’s way of formally representing a decision problem also seems more helpful to me than the ways proposed by Jeffrey and Savage. Peterson (2008) explains:
Hmm, if I’ve understood this correctly, it’s the way I’ve always thought about decision theory for as long as I’ve had a concept of expected utility maximisation. Which makes me think I must have missed some important aspect of the ex post version.
The objection to the standard approach is put nicely in chapter 10 of Peterson (2009):
I don’t currently view the two objections in the second-to-last paragraph above as necessitating a non-Bayesian decision theory, but the original concern that Bayesian decision theory doesn’t technically offer any action-guidance is serious, and the primary motivation for my interest in Peterson’s ex ante approach.
It’s not obvious to me that the ex ante approach would offer more action-guidance for FAI. Our preferences over acts seem easier to observe than our internal utilities over outcomes. An extrapolation effort might use both kinds of data, of course.
For the moment I was just thinking of the ex ante approach in the context of offering action guidance to humans. The ex post approach can’t offer any direct advice for what to do because an agent that can state its preferences over acts already knows what to do. What I want to do is state how much I value different outcomes and what probability distributions I have over states of affairs, and have a decision theory tell me which action I can take to maximize my expected utility. It seems that Peterson’s ex ante approach is the only approach that can provide this for me.
The final page of Peterson (2009) is also quotable: