Two New Newcomb Variants

Two Newcomb variants to add to the list of examples where optimal choice and optimal policy are diametrically opposed. I don’t think these problems exist anywhere else yet.

4 Boxes Problem

In a game show there are 4 transparent boxes in a row, each of which starts off with $1 inside. Before the human player enters, the show’s 4 superintelligent hosts have a competition: they each have a box assigned to them, and they win if the human chooses their box. To motivate the human they cannot alter their box at all, but they are able to put $100 into any or all of the other 3 boxes.

Our first human is an adherent of CDT. Since the boxes are transparent, and he wants to get money, he obviously chooses the one with the most. It’s not like his decisions can change what the hosts have already done. Putting money into the other boxes would only have hurt the hosts’ chances, so they didn’t do that. Instead, all the boxes have only the original $1. CDT is disappointed, but picks one at random and goes home.

Our second human is an adherent of EDT. She knows if she sees how much is in the boxes, she’ll pick the one with the most, which will end up being only $1. Because of this, she blindfolds herself before walking on stage. She weighs the two left boxes together, to see how much is in them total, and the two on the right as well. “If the left two are heavier I’ll pick one of them at random,” she says, “if the two right I’ll pick one of those, and if the same I’ll pick any at random”. EDT was quite happy when she found out she was going to do this. She’d worried that she wouldn’t get the blindfold on in time and would only get $1, but it seemed to work out. The hosts predicted this, of course, so the two on the left put $100 in each other’s boxes, to increase their own odds, and the two on the right did the same, and EDT picks at random and leaves with $101. [I’m not completely sure EDT can’t do better than this, so corrections with even more elaborate schemes encouraged]

Our third human is an adherent of FDT. She considers the policies she could implement, and decides that she will take one of the boxes with the least money. Not wanting to give their opponents an advantage, the hosts all put $100 in every other box, and FDT leaves with $301.
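The three payouts above are simple enough to tally by hand, but a minimal sketch makes the bookkeeping explicit. The function name `box_totals` and the host numbering (0 to 3, with hosts 0 and 1 as the left pair) are illustrative assumptions, not part of the problem statement.

```python
BASE = 1      # every box starts with $1
BONUS = 100   # amount a host may add to a rival's box

def box_totals(funding):
    """funding[h] is the set of rival boxes that host h adds $100 to."""
    totals = [BASE] * 4
    for host, targets in enumerate(funding):
        # A host is not allowed to alter its own box.
        assert host not in targets, "a host cannot alter its own box"
        for box in targets:
            totals[box] += BONUS
    return totals

# CDT: funding any rival only hurts a host, so nobody funds anything.
cdt = box_totals([set(), set(), set(), set()])
# EDT: each pair (0 with 1, 2 with 3) funds its partner's box.
edt = box_totals([{1}, {0}, {3}, {2}])
# FDT: every host funds all three rival boxes.
fdt = box_totals([{1, 2, 3}, {0, 2, 3}, {0, 1, 3}, {0, 1, 2}])

print(cdt, edt, fdt)
```

Under these assumptions every box ends up equal within each scenario, so whichever box the player takes pays $1, $101, and $301 respectively.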

5 Boxes Problem

This time the game show wants to see who is best at prediction. Each of the 4 superintelligent hosts must try to predict which box the human will pick, and is rewarded based on who has the most successful predictions. To make it fun, each host has $100 and 1 grenade (which kills you if you choose its box) that they can put in any of the boxes, without the other hosts knowing which they’ve picked. Since they know humans dislike dying, there are 5 transparent boxes this time to make sure there’s a way for the human to avoid all grenades. The player can see how much money and how many grenades are in each, and they’re also told which box each host is betting on, and which incentive and grenade were theirs.

Our first human is CDT again. The hosts know he’d never pick a box with a grenade, and otherwise would pick the one with the most money. They each randomly picked a box to incentivise, selected it as their prediction, and boobytrapped a different box to maximise their chances of winning. CDT naturally picks the most valuable box that doesn’t have a grenade, and leaves with expected returns of ~$152. He thinks he did pretty well, all things considered.
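The ~$152 figure can be checked by brute force. The sketch below assumes the boxes start empty in this variant (no starting $1 is mentioned), and that each host independently picks its incentive box uniformly at random and its grenade box uniformly from the other four. The names `HOST_MOVES`, `cdt_payoff`, and `expected_cdt_return` are made up for illustration.

```python
from itertools import product

BOXES = range(5)
# Every legal move for one host: an incentive box and a different grenade box.
HOST_MOVES = [(inc, gren) for inc in BOXES for gren in BOXES if inc != gren]

def cdt_payoff(moves):
    """Money CDT takes home: the richest box that holds no grenade."""
    money = [0] * 5
    grenades = set()
    for inc, gren in moves:
        money[inc] += 100
        grenades.add(gren)
    # With 4 grenades and 5 boxes, at least one box is always safe.
    return max(money[b] for b in BOXES if b not in grenades)

def expected_cdt_return():
    """Average payoff over all 20**4 equally likely joint host moves."""
    outcomes = [cdt_payoff(m) for m in product(HOST_MOVES, repeat=4)]
    return sum(outcomes) / len(outcomes)

print(expected_cdt_return())  # 152.0
```

Enumerating all 160,000 joint moves gives exactly $152 under these assumptions, which agrees with the estimate above.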

EDT is really struggling to come up with a clever blindfold strategy this time, and doesn’t want to blow herself up by mistake. She mournfully abandons schemes to manage the news, looks at the boxes, and picks the best one. Expected return: ~$152.

FDT does have a plan, though it took some thinking. She decides she won’t pick a box that was predicted by a host unless that host has put their grenade in box 5 and their incentive in one of the first two boxes. Among the boxes that remain, she’ll pick the one predicted by the most helpful hosts, and after that the one with the most value to her. She commits to do this even if she has to pick a grenade and kill herself to do so. Since each host guessed she’d set this policy, and had no hope of winning if they didn’t play along, they have already put their grenades in box 5, predicted and incentivised one of the first two, and hoped to get lucky. This gets an expected return of ~$275, which I’m pretty sure is the maximum possible assuming hosts who are zero-sum competing and noncooperative unless positively incentivised.
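The ~$275 figure follows from a short enumeration. The sketch assumes every host complies, each one independently picks box 1 or box 2 with equal probability for both its prediction and its $100, and boxes start empty. FDT then takes whichever of the two boxes has more predictions, which is also the one with more money. The function name `expected_fdt_return` is illustrative only.

```python
from itertools import product

def expected_fdt_return():
    """Average payoff over the 2**4 equally likely ways hosts split between boxes 1 and 2."""
    payoffs = []
    for picks in product((1, 2), repeat=4):
        in_box_1 = picks.count(1)
        in_box_2 = 4 - in_box_1
        # FDT takes the box backed by more hosts; a 2-2 split pays $200 either way.
        payoffs.append(100 * max(in_box_1, in_box_2))
    return sum(payoffs) / len(payoffs)

print(expected_fdt_return())  # 275.0
```

The splits 4-0, 3-1, and 2-2 occur with probabilities 2/16, 8/16, and 6/16, giving (2·400 + 8·300 + 6·200)/16 = $275.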

[It’s a bit of a stretch of the rules, but EDT does have an ideal strategy as well, after this point: Blindfold herself, walk on stage, yell to FDT who is watching from the audience, ask which one FDT would endorse picking, and then pick that box without looking. If your decision theory ever tells you to do this, you probably have a bad decision theory. At least she’s not CDT, who doesn’t see how blindfolds and phoning a friend can cause boxes to have more money in them.]