Probabilistic Negotiation

Follow-up to Deterministic Strategies Can Be Sub-optimal

The Ultimatum Game is a simple experiment. Two people have been allocated $10. One person decides how to divide the money, and the other decides whether to Accept that division or to Deny it, in which case both participants get $0. Suppose you are the person whose job it is to choose whether to Accept or Deny an offer. What strategy could you use to maximize your returns?
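To make the payoffs concrete, here is a minimal sketch in Python (the function name and structure are mine, not part of the original setup):

```python
def ultimatum_payoffs(proposer_keeps: float, accepted: bool, pot: float = 10.0):
    """Return (proposer, responder) payoffs for one round of the Ultimatum Game."""
    if accepted:
        return proposer_keeps, pot - proposer_keeps
    return 0.0, 0.0  # a denied offer destroys the entire pot

print(ultimatum_payoffs(6, accepted=True))   # (6, 4.0): the unfair split goes through
print(ultimatum_payoffs(6, accepted=False))  # (0.0, 0.0): both walk away with nothing
```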

Yudkowsky offers the following solution (NB: the original text splits $12, because sci-fi; I have changed the numbers inline/without brackets, so let me know if that offends).

It goes like this:

When somebody offers you a 6:4 split, instead of the 5:5 split that would be fair, you should accept their offer with slightly less than 5/6 probability. Their expected value from offering you 6:4, in this case, is 6 * slightly less than 5/6, or slightly less than 5. This ensures they can’t do any better by offering you an unfair split; but neither do you try to destroy all their expected value in retaliation. It could be an honest mistake, especially if the real situation is any more complicated than the original Ultimatum Game.

If they offer you 7:3, accept with probability slightly-more-less than 5/7, so they do even worse in their own expectation by offering you 7:3 than 6:4.

It’s not about retaliating harder, the harder they hit you with an unfair price—that point gets hammered in pretty hard to the kids, a Watcher steps in to repeat it. The circumstances under which you should ever go around carrying out counterfactual threats in real life are much more fraught and complicated than this, and nobody’s going to learn about them realistically for several years yet. This setup isn’t about retaliation, it’s about what both sides have to do, to turn the problem of dividing the gains, into a matter of fairness; to create the incentive setup whereby both sides don’t expect to do any better by distorting their own estimate of what is ‘fair’.

To be explicit: assume that you have in some way “locked in” some curve, P, which tells you to hit “Accept” with probability P(x) when offered to let your conspirator keep x dollars out of the $10 you are to split. You want to maximize your expected value, as does your conspirator: so, you should positively incentivize your conspirator to give you money.
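In code, the bookkeeping looks something like this. A minimal sketch, using my own notation (the curve P and the helper name are not from the original post); the example curve at the bottom is just the quoted “slightly less than 5/x” rule with an arbitrary epsilon of 0.01:

```python
from typing import Callable

def expected_values(P: Callable[[int], float], pot: int = 10) -> None:
    """Print both sides' expected value for every split where they keep x dollars.

    Your EV is (pot - x) * P(x); theirs is x * P(x).  The curve does its job if
    their EV peaks at the fair split and keeps falling as x grows past it, so
    they cannot expect to gain anything by proposing an unfair division.
    """
    for x in range(pot // 2, pot + 1):
        p = P(x)
        print(f"they keep {x}: P(accept)={p:.3f}  their EV={x * p:.3f}  your EV={(pot - x) * p:.3f}")

# Sanity check with the quoted rule: accept with slightly less than 5/x past the fair split.
expected_values(lambda x: 1.0 if x <= 5 else 5 / x - 0.01)
# their EV: 5.000, 4.940, 4.930, ... so offering 6:4 nets them less than 5, and 7:3 even less.
```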

Consider the following instantiation of this algorithm:
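The specific curve isn’t reproduced above, so purely as an illustrative stand-in (my choice, not necessarily the one the post pictures), take a proportional curve: accept with probability equal to your share divided by the fair share, capped at 1.

```python
def proportional_curve(x: int, pot: int = 10) -> float:
    """Accept with probability (your share) / (fair share), capped at 1 for generous offers."""
    fair = pot / 2
    return min(1.0, (pot - x) / fair)

for x in range(5, 11):
    p = proportional_curve(x)
    print(f"they keep {x}: P(accept)={p:.2f}  their EV={x * p:.2f}  your EV={(10 - x) * p:.2f}")
# their EV: 5.00, 4.80, 4.20, 3.20, 1.80, 0.00; unfair offers get steadily worse for them,
# but a great deal of total value gets burned along the way.
```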

Note that there are many possible choices of P. For now, let’s not examine the “greedy” half of the algorithm (where your conspirator is offering you more than they are taking themselves), and model another instantiation:
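Again, the original curve isn’t shown here; one gentler possibility consistent with the description is to shave their expected value by only a small fixed amount per extra dollar they keep (the 0.05 step below is an arbitrary choice of mine).

```python
def gentle_curve(x: int, pot: int = 10, step: float = 0.05) -> float:
    """Accept so that their EV drops by `step` for each dollar kept past the fair split."""
    fair = pot / 2
    if x <= fair:
        return 1.0
    return (fair - step * (x - fair)) / x

for x in range(5, 11):
    p = gentle_curve(x)
    print(f"they keep {x}: P(accept)={p:.3f}  their EV={x * p:.3f}  your EV={(10 - x) * p:.3f}")
# their EV: 5.000, 4.950, 4.900, 4.850, 4.800, 4.750; still strictly decreasing,
# yet far less value is destroyed than under the proportional curve above.
```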

Note that this maintains a positive incentive for your conspirator to give you more money, while not destroying as much value as the prior algorithm.

I work at a company which does year-end performance reviews. I was promoted last year, and am not a particular “flight risk”. However, I still want to positively incentivize my boss to accurately “rate” me; i.e., if I performed above average, I would like to be given the rating (and raise) for an above-average performance, even if it means increasing the company’s exposure to a more flight-prone but poorer-performing employee. So I published a curve to my boss demonstrating that I would stay with 100% chance if I got the highest rating I could get, would stay with 90% chance if I got an average rating, would stay with 70% chance if I got below average, and would stay with 50% chance if I got put on a performance improvement plan.

This was received well enough, because I run the Risk Analytics team at a FinTech company, so my entire stack is capable of working with uncertainty. In particular, I highlighted that even an average grade (which would put me in the top 70th percentile) would have me staying with 90% chance, which is still a lower attrition risk than the industry average. I ended up getting an average grade, and rolling a 6 on my d10, so I am staying with my company.
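For what it’s worth, the whole scheme fits in a few lines (a sketch; the rating labels are paraphrased and the d10 is just a convenient source of randomness):

```python
import random

# The published retention curve: probability of staying, by year-end rating.
STAY_PROBABILITY = {
    "highest rating": 1.0,
    "average": 0.9,
    "below average": 0.7,
    "performance improvement plan": 0.5,
}

def stay_decision(rating: str) -> bool:
    """Commit to staying with the pre-published probability, decided by a d10 roll."""
    roll = random.randint(1, 10)                  # the d10
    return roll <= 10 * STAY_PROBABILITY[rating]  # e.g. "average": stay on rolls 1 through 9

# An "average" rating plus any roll of 9 or below (like a 6) means staying.
print(stay_decision("average"))
```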

Traditional negotiations work by hemming and hawing. Yudkowsky offers a new solution: publish your flight curve and let your conspirator follow their own incentives. Increase your legibility so that people don’t have to track your subtle indications that you are happy/unhappy with an offer.

Yudkowsky’s newest novel is here: https://www.glowfic.com/posts/4582