On the purely mathematical side, I have an issue with the theorem as stated: it says:
\forall P, if P is Pareto optimal, then \exists c_1, ..., c_n such that P is a maximum of \sum_{i=1}^{n} c_i \times v_i.
Which is wildly different from:
\exists c_1, ..., c_n such that, \forall P, if P is Pareto optimal, then P is a maximum of \sum_{i=1}^{n} c_i \times v_i.
As the theorem is stated, you don’t have a utility function with fixed coefficients that you can use in every situation; rather, for every situation you can find a set of coefficients that will work, which is not what being an optimizer means.
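To make the quantifier issue concrete, here is a minimal numeric sketch in Python (the three policies and their payoffs are toy numbers of my own, not anything from the post):

```python
# Two individuals, three policies; v[p] = (v_1(p), v_2(p)). Toy numbers.
policies = {
    "a": (3.0, 0.0),  # best for individual 1
    "b": (0.0, 3.0),  # best for individual 2
    "c": (2.0, 2.0),  # a compromise, also on the Pareto frontier
}

def dominates(x, y):
    """x Pareto-dominates y: at least as good for everyone, better for someone."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

pareto = [p for p, v in policies.items()
          if not any(dominates(w, v) for w in policies.values())]
print(pareto)  # ['a', 'b', 'c']: all three are Pareto optimal

def maximizers(c):
    """The set of policies maximizing sum_i c_i * v_i under weighting c."""
    scores = {p: sum(ci * vi for ci, vi in zip(c, v)) for p, v in policies.items()}
    best = max(scores.values())
    return {p for p, s in scores.items() if s == best}

# "forall P, exists c": each Pareto-optimal policy has its own weighting...
print(maximizers((1.0, 0.1)))  # {'a'}
print(maximizers((0.1, 1.0)))  # {'b'}
print(maximizers((1.0, 1.0)))  # {'c'}

# ...but "exists c, forall P" fails: none of these weightings is maximized
# by all three Pareto-optimal policies at once.
for c in [(1.0, 0.1), (0.1, 1.0), (1.0, 1.0), (2.0, 1.0)]:
    assert maximizers(c) != {"a", "b", "c"}
```

The set of coefficients that “works” genuinely depends on which Pareto-optimal policy you are looking at.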
See my reply to Wei Dai’s comment. If you have a prior over which situations you will face, and if you’re able to make precommitments and we ignore computational difficulties, then there is only one situation. If you could decide now which decision rule you’ll use in the future, then in a sense that would be the last decision you ever make. And a decision rule that’s optimal with respect to a particular utility function is one that makes every subsequent decision using that same utility function.
From the vantage point of an agent with a prior today, the best thing it can do is adopt a utility function and precommit to maximizing it from now on no matter what. I hope that’s more clear.
As the theorem is stated, you don’t have a utility function with fixed coefficients that you can use in every situation; rather, for every situation you can find a set of coefficients that will work, which is not what being an optimizer means.
P is the space of situations*, and the v_i are the individual preferences over situations.
The actual theorem says that, for any particular situation and any particular individual, you can find a personal weighting that aggregates their preferences over that situation, and this method is guaranteed to return Pareto optimal solutions. (Choosing between Pareto optimal solutions is up to you, and is done by your choice of weightings.)
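A quick sanity check of the “guaranteed to return Pareto optimal solutions” direction, as a sketch with made-up random policies (my own toy setup, not the post’s):

```python
import random

def dominates(x, y):
    """x Pareto-dominates y: at least as good for everyone, better for someone."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

random.seed(0)
# Ten random policies, each scored by three individuals' preferences v_1..v_3.
policies = [tuple(random.uniform(0.0, 1.0) for _ in range(3)) for _ in range(10)]

for _ in range(100):
    # Any strictly positive personal weighting will do.
    c = [random.uniform(0.01, 1.0) for _ in range(3)]
    best = max(policies, key=lambda v: sum(ci * vi for ci, vi in zip(c, v)))
    # The maximizer is never Pareto-dominated by another policy.
    assert not any(dominates(other, best) for other in policies)
print("every positively-weighted maximum was Pareto optimal")
```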
Your second version says that, for any particular situation, there exists a magic weighting which will aggregate the preferences of any possible individual into a function that is simultaneously maximized by all the Pareto optimal solutions any agent produces.
Of course, there is such a magic weighting. It is the vector of all zeros, because every point in P maximizes that function, and so the Pareto optimal points will as well.
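Checking that degenerate case on the same kind of toy example as above:

```python
# With the all-zeros weighting, every policy scores 0, so every point in P
# (Pareto optimal or not) trivially maximizes the weighted sum.
policies = {"a": (3.0, 0.0), "b": (0.0, 3.0), "c": (2.0, 2.0)}
c = (0.0, 0.0)
scores = {p: sum(ci * vi for ci, vi in zip(c, v)) for p, v in policies.items()}
print(scores)  # {'a': 0.0, 'b': 0.0, 'c': 0.0}
print({p for p, s in scores.items() if s == max(scores.values())})  # all of them
```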
* Well, strictly, it’s the space of “policies,” which is a combination of what will happen to the agent and how they will respond to it, which we’re describing elsewhere as a “world history.”
Hmm, yes, indeed I got the P and v_i backwards, sorry.
The argument still holds, but with the other inversion between the \forall and the \exists:
\exists c_1, ..., c_n such that, \forall P, if P is Pareto optimal, then P is a maximum of \sum_{i=1}^{n} c_i \times v_i.
Having a utility function means the weighting (the c_i) can vary between individuals, but not between situations. If for each situation (“world history”, more exactly) you choose a different set of coefficients, it’s no longer a utility function, and you can justify just about anything that way, simply by choosing the coefficients you want.
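As a sketch of the worry, with the same kind of toy numbers as above: if you’re allowed to re-pick the coefficients in each situation, any point on the (convex) frontier can be made to look optimal after the fact:

```python
# Brute-force search for a weighting under which `target` is a maximum of
# c_1*v_1 + c_2*v_2 over the frontier. Toy two-individual example.
frontier = [(3.0, 0.0), (2.0, 2.0), (0.0, 3.0)]

def weights_supporting(target, points, grid=50):
    for i in range(grid + 1):
        c = (i / grid, 1 - i / grid)  # scan normalized weightings
        score = lambda v: c[0] * v[0] + c[1] * v[1]
        if all(score(target) >= score(p) for p in points):
            return c
    return None

# Every frontier point gets its own after-the-fact "justification":
for p in frontier:
    print(p, "is 'optimal' under weighting", weights_supporting(p, frontier))
```

So a per-situation choice of the c_i can rationalize any Pareto-optimal outcome, which is exactly why it no longer pins down a single utility function.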
That doesn’t work, because v_i is defined as a mapping from P to the reals; if you change P, then you also change v_i, and so you can’t define them out of order.
I suspect you’re confusing p, the individual policies that an agent could adopt, and P, the complete collection of policies that the agent could adopt.
Another way to express the theorem is that there is a many-to-one mapping from choices of c_i to Pareto optimal policies that maximize that choice of c_i.
[Edit] It’s not strictly many-to-one, since you can choose c_i’s that make you indifferent between multiple Pareto optimal basic policies, but you recapture the many-to-one behavior if you massage your definition of “policy,” and it’s many-to-one for most choices of c_i.
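A sketch of that many-to-one behavior on the same toy frontier as above (my numbers, not anything from the post):

```python
frontier = {"a": (3.0, 0.0), "c": (2.0, 2.0), "b": (0.0, 3.0)}

def maximizers(c):
    """Set of policies maximizing c_1*v_1 + c_2*v_2."""
    scores = {p: c[0] * v[0] + c[1] * v[1] for p, v in frontier.items()}
    best = max(scores.values())
    return {p for p, s in scores.items() if s == best}

# Whole ranges of weightings pick out the same Pareto-optimal policy...
for c in [(9, 1), (4, 1), (3, 1)]:
    print(c, "->", maximizers(c))  # all three map to {'a'}

# ...while a knife-edge weighting produces indifference between two of them.
print((2, 1), "->", maximizers((2, 1)))  # {'a', 'c'}
```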