Ah! I finally get it! Unfortunately I haven’t gotten the math. Let me try to apply it, and you can tell me where (if?) I went wrong.
U = v + (Past Constants) →
U = w + E(v|v->v) - E(w|v->w) + (Past Constants).
Before: U = v + 0, setting (Past Constants) to 0 because we’re in the initial state, with v = 0.1*Oxfam + 1*AMF. With my £10 going entirely to AMF, v = 0.1*0 + 1*10.
Therefore, U = 10 utilitons.
After I met you, you want me to change to a w that weights Oxfam higher, but only with a compensating constant added (the E terms): U’ = w + E(v|v->v) - E(w|v->w), where w = 1*Oxfam + 0.1*AMF.
What we want is for U = U’.
E(v|v->v) = ? I’m guessing this term means, “Let’s say I’m a v maximiser. How much is v?” In that case, E(v|v->v) = 10 utilitons.
E(w|v->w) = ? I’m guessing this term means, “Let’s say I become a w maximiser. How much is w?” In that case, E(w|v->w) = 10 utilitons.
U’ = w + 10 − 10 = w.
Let’s try a different U*, with utility function w* = 1*Oxfam + 10*AMF (it acts the same as a v-maximiser). E(v|v->v) = 10 utilitons. E(w*|v->w*) = 100 utilitons. U* = w* + 10 − 100 = w* − 90.
Trying this out, we will obviously donate the £10 to AMF under both utility functions. U = v = 0.1*Oxfam + 1*AMF = 0.1*0 + 1*10 = 10 utilitons. U* = w* − 90 = 1*Oxfam + 10*AMF − 90 = 0 + 100 − 90 = 10 utilitons.
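To sanity-check that constant, here’s a quick sketch in Python (my own framing; the assumption, as above, is that the only choice is where the £10 goes):

```python
# Linear utilities from the example above.
def v(oxfam, amf):       # original utility: v = 0.1*Oxfam + 1*AMF
    return 0.1 * oxfam + 1 * amf

def w_star(oxfam, amf):  # proposed utility: w* = 1*Oxfam + 10*AMF
    return 1 * oxfam + 10 * amf

# Both a v-maximiser and a w*-maximiser send the whole £10 to AMF.
E_v      = v(0, 10)       # E(v|v->v)   = 10
E_w_star = w_star(0, 10)  # E(w*|v->w*) = 100

# Compensated utility U* = w* + E(v|v->v) - E(w*|v->w*) = w* - 90.
constant = E_v - E_w_star  # -90
print(w_star(0, 10) + constant)  # 10.0, matching U = 10 utilitons
```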
Obviously all these experiments are useless. v = 0.1*Oxfam + 1*AMF is a completely useless utility function. It may as well be 0.314159265*Oxfam + 1*AMF. Let’s try something that actually makes some sense (economically).
Let’s have a simple marginal utility curve (note: these are partial derivatives): dv/dOxfam = 1 - 0.1*Oxfam, dv/dAMF = 10 - AMF. In both cases, donating more than 10 to either charity is plain stupid.
U = v, with v = (Oxfam - 0.05*Oxfam^2) + (10*AMF - 0.5*AMF^2). Maximising U subject to my £10 budget leads to AMF = 100/11 ≈ 9.09, Oxfam = 10/11 ≈ 0.91. v happens to be 555/11 ≈ 50.45.
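As a check on my arithmetic, a sketch with Python’s `fractions` module (the £10 budget split between the two charities is the assumption spelled out explicitly):

```python
from fractions import Fraction as F

# Spend the £10 budget as Oxfam = O, AMF = 10 - O, and equate the marginal
# utilities: 1 - 0.1*O = 10 - (10 - O)  =>  1 = 1.1*O  =>  O = 10/11.
O = F(10, 11)
A = 10 - O  # 100/11, i.e. about 9.09

v = (O - F(1, 20) * O**2) + (10 * A - F(1, 2) * A**2)
print(v)  # 555/11, i.e. about 50.45
```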
(Note: Math is mostly intuitive to me, but when it comes to grokking quadratic curves by applying them to utility curves which I’ve never dabbled with before, let’s just say I have a sizeable headache about now.)
Now you, because you’re so human and you think us simulated AIs can so easily change our utility functions, come over to me and tell me to change v to w = (100*Oxfam - 5*Oxfam^2) + (10*AMF - 0.5*AMF^2). What you’re saying is to increase dw/dOxfam to 100 * dv/dOxfam, while leaving dw/dAMF = dv/dAMF. Again, partial derivatives.
U’ = w + E(v|v->v) - E(w|v->w). Maximising w under the same £10 budget leads to Oxfam = 100/11 ≈ 9.09, AMF = 10/11 ≈ 0.91, the opposite of before, and w = 5550/11 ≈ 504.5. So U’ = w + 555/11 − 5550/11 = w − 4995/11, which still checks out.
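Same check for the new utility function (again a sketch with exact fractions; the shared £10 budget is assumed):

```python
from fractions import Fraction as F

def w(O, A):  # w = (100*Oxfam - 5*Oxfam^2) + (10*AMF - 0.5*AMF^2)
    return (100 * O - 5 * O**2) + (10 * A - F(1, 2) * A**2)

# Equate marginal utilities under the £10 budget:
# 100 - 10*O = 10 - (10 - O)  =>  100 = 11*O  =>  O = 100/11.
O = F(100, 11)
A = 10 - O  # 10/11, i.e. about 0.91

print(w(O, A))                   # 5550/11, i.e. about 504.5
print(F(555, 11) - F(5550, 11))  # -4995/11, the constant in U' = w - 4995/11
```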
Also, I think I finally get the math too, after working this out numerically. It’s basically U = (Something), and any change to the utility function must preserve that (Something): U’ = (Something) is a requirement. So you have your U = v + (Constants), and you set U’ = U, except you have to maximise v or w before determining your new set of (Constants): max(v) + (Constants) = max(w) + (New Constants).
(New Constants) = max(v) - max(w) + (Constants), which are your E(v|v->v) - E(w|v->w) + (Constants) terms, except under different names.
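A brute-force sketch of that “preserve the (Something)” requirement, scanning splits of the £10 over a grid (the 0.01 step size is my own choice):

```python
# Scan Oxfam = s, AMF = 10 - s over a fine grid of the £10 budget.
splits = [i / 100 for i in range(1001)]

def v(s):
    return (s - 0.05 * s**2) + (10 * (10 - s) - 0.5 * (10 - s)**2)

def w(s):
    return (100 * s - 5 * s**2) + (10 * (10 - s) - 0.5 * (10 - s)**2)

max_v = max(v(s) for s in splits)
max_w = max(w(s) for s in splits)

# (New Constants) = max(v) - max(w): adding a constant never moves the
# argmax, so U' = w + (max_v - max_w) peaks at the same value U did.
new_constant = max_v - max_w
assert abs(max(w(s) + new_constant for s in splits) - max_v) < 1e-9
```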
Huh. If only I had thought of max(v) and max(w) from the start… but instead I got confused by the notation.
Question: I don’t understand your Oxfam/AMF example. According to me, if you decided to donate £10 to AMF, I see that Oxfam, which I care about 0.1 times as much as AMF, has lost £1 worth of AMF donation, while AMF has gained £10. If I then decide to follow through with my perfect willingness, and I donate £10 to Oxfam, only then do I have equilibrium, because:
Before: £10 × 0.1 utiliton + £10 × 1 utiliton = 11 utilitons.
After: £10 × 0.1 utiliton + £10 × 1 utiliton = 11 utilitons.
But in the second hypothetical,
After: £11 × 0.1 utiliton + £9 × 1 utiliton = 10.1 utilitons.
Which seems clearly inferior. In fact, even if you offered to switch donations with me, I wouldn’t accept, because I may not trust you to fulfil your end of the deal, resulting in a lower expected utility.
I’m clearly missing some really important point here, but I fail to see how the example is related to utility function updating...