Hey!
I am trying to approach Newcomb’s paradox the most natural way a programmer can—writing a simulation. I wrote a nice simulation supporting the expected-utility approach (favoring picking 1 box).
However, I am afraid I am baking my assumptions about the system into the code, and I can’t even figure out how I would write a simulation that would support the strategic-dominance strategy (favoring picking both boxes).
If I try to decouple the prediction from the actual choice, to “allow” the player to pick a different box than the predictor predicts… well, then the predictor’s accuracy falls and the premise of the problem is defeated.
Is there any way at all to write a simulation that supports players picking both boxes?
No, there is no way to write a simulation that supports taking both boxes while also upholding the conditions of the scenario.
Even with an imperfect predictor, you would have to make the predictor effectively useless at predicting, performing no better than roughly 0.1% above chance, before two-boxing came out ahead. Even if it predicts some agents well and others poorly, you would need to p-hack the result by ignoring the agents it predicted well in order to get a recommendation to take both boxes.
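For concreteness, here is the arithmetic behind that threshold as a minimal sketch. It assumes a symmetric predictor (equally accurate on one-boxers and two-boxers) whose accuracy is a free parameter, and that the million is placed exactly when one-boxing is predicted; under those assumptions the break-even accuracy works out to 50.05%.

```python
# Expected payoff for a fixed choice against a predictor of accuracy `acc`,
# assuming the $1,000,000 is placed iff the predictor predicts one-boxing.
def expected_value(choice: str, acc: float) -> float:
    if choice == "one-box":
        # The million is there only when the predictor is right about you.
        return acc * 1_000_000
    # Two-boxing: the million is there only when the predictor is wrong.
    return (1 - acc) * 1_000_000 + 1_000

for acc in (0.5, 0.5005, 0.51, 0.9, 0.999):
    print(acc, expected_value("one-box", acc), expected_value("two-box", acc))
# One-boxing pulls ahead as soon as accuracy exceeds 50.05%.
```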
I wonder if there could be a way to prove that anything that cannot be simulated is not possible. I think it should be easier than it seems, because we can grant ourselves a lot of leeway, like any amount of time and an arbitrarily powerful computer. But if we prove that even with a near-infinite amount of compute and unlimited time we can’t simulate a scenario, does that make the scenario impossible?
Not that it would be immediately practical, because we do not have nearly infinite compute or time, but it could be interesting.
The problem isn’t in the simulation part, but in the “supports” part.
You can certainly write a simulation in which an agent decides to take both boxes. By the conditions of the scenario, they get $1000. Does this simulation “support” taking both boxes? No, unless you’re only comparing with alternative actions of not taking a box at all, or burning box B and taking the ashes, or other things that are worse than getting $1000.
However, the scenario states that the agent could take 1 box, and it is a logical consequence of the scenario setup that in the situations where they do, they get $1000000. That’s better than getting $1000 under the assumptions of the scenario, and so a simulation that actually follows the rules of the scenario cannot support taking two boxes.
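To make that concrete, the smallest rule-following simulation I can think of is just a payoff function with a perfect predictor baked in (a sketch, not the OP’s code):

```python
# A simulation that follows the rules: the prediction matches the actual
# choice, and box B is filled exactly when one-boxing is predicted.
def payoff(choice: str) -> int:
    prediction = choice            # a perfect predictor
    box_a = 1_000
    box_b = 1_000_000 if prediction == "one-box" else 0
    return box_b if choice == "one-box" else box_a + box_b

print(payoff("one-box"))   # 1000000
print(payoff("two-box"))   # 1000
```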
Well, you can’t simulate it because the mechanism of prediction is unspecified, as is the mechanism of free will that makes the decision. You just don’t know if, in the thought experiment universe, you actually have an open option to choose.
You can very easily simulate the trivial case (ignore causality and decision theory, assume Omega cheats by changing the values after you decide but before the result is revealed), which leads to one-boxing. Or the scenario-rejecting trivial case of the CDT assumption that your choice has literally no impact on the boxes, which leads to two-boxing.
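That second case is just as short to write down: fix box B’s contents before (and independently of) the choice, and two-boxing dominates for either filling. A sketch under that CDT assumption:

```python
# CDT-style setup: box B's contents are fixed in advance and causally
# independent of the choice, so two-boxing dominates either way.
def payoff_fixed_boxes(choice: str, box_b_contents: int) -> int:
    box_a = 1_000
    return box_b_contents if choice == "one-box" else box_a + box_b_contents

for contents in (0, 1_000_000):
    print(contents,
          payoff_fixed_boxes("one-box", contents),
          payoff_fixed_boxes("two-box", contents))
# Two-boxing is exactly $1000 better in both rows, but this version has
# quietly dropped the premise that the prediction tracks the choice.
```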
Two-boxing seems to be based on a mixture of:
disbelief in backwards causation
belief in free will
disbelief that the predictor could be all that good.
In particular they don’t believe in a Laplace’s Demon type of superpredictor that can foresee all physical events, and infer a future psychological history from them, including seemingly spontaneous changes of mind. They instead see the predictor as a Derren Brown style psychologist with basically human abilities.
If you consider a variation of the game where the predictor just accepts a promise from the player to one-box or two-box, then the best strategy is to say you are going to one-box, and then two-box. Similarly, if the predictor is rather superficial and only reads the player’s intention at the start of the game, the player can also get the extra money by changing their mind.
So maybe there is a way of simulating two-boxing with players that change strategy, and predictors that operate off limited information, like the first N runs.
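A sketch of that kind of setup, assuming the predictor only ever looks at the first N rounds and the player one-boxes during that window and then defects:

```python
import random

N = 10          # rounds the predictor is allowed to observe
ROUNDS = 1_000  # total rounds played

def player(round_index: int) -> str:
    # Look cooperative while being watched, then switch.
    return "one-box" if round_index < N else "two-box"

history, total = [], 0
for t in range(ROUNDS):
    observed = history[:N]  # the predictor's limited information
    if observed:
        prediction = max(set(observed), key=observed.count)  # most common observed choice
    else:
        prediction = random.choice(["one-box", "two-box"])   # no data yet: guess
    choice = player(t)
    box_b = 1_000_000 if prediction == "one-box" else 0
    total += box_b if choice == "one-box" else 1_000 + box_b
    history.append(choice)

print(total)  # every two-boxing round after round N collects $1,001,000
```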
Isn’t the main problem the issue of simulating the Predictor? I also thought about this idea. The simplest variant is that the alleged Predictor estimates P(the user will choose to one-box) from the user’s previous choices as P(next choice is one-box) = (one-boxing trials + 1)/(trials + 2). This setup has the player set a probability p, then watch as the Predictor converges to p, thus punishing or not punishing the Player.
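A sketch of that simple variant (the estimator is the one above; the rule for actually placing the million is my own assumption, namely place it when the estimate exceeds 0.5):

```python
import random

def simulate(p_one_box: float, rounds: int = 10_000) -> float:
    """Player one-boxes with fixed probability p; the Predictor uses Laplace's rule."""
    one_box_trials, total = 0, 0
    for t in range(rounds):
        # P(next choice is one-box) = (one-boxing trials + 1) / (trials + 2)
        estimate = (one_box_trials + 1) / (t + 2)
        million_placed = estimate > 0.5            # assumed decision rule
        if random.random() < p_one_box:            # player one-boxes
            one_box_trials += 1
            total += 1_000_000 if million_placed else 0
        else:                                      # player two-boxes
            total += 1_000 + (1_000_000 if million_placed else 0)
    return total / rounds

for p in (0.0, 0.4, 0.6, 1.0):
    print(p, simulate(p))
# The estimate converges to p, so a mostly-one-boxing player keeps the million
# coming while a mostly-two-boxing player gets punished.
```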
A more complex idea is to have the player choose to 2-box with probability p if a random number between 0 and 1 is at most r, and with probability q if the random number is greater than r; the Predictor would then learn the player’s random number, estimate the parameters, and choose not to place the million with probability p_est or q_est, depending on whether the random number is bigger than r_est. To deal with precision issues, one can set p to 1 and q to 0, so that the Predictor only has to figure out the value of r.
However, this setup would have the player’s choice to 1-box or 2-box cause a subsequent iteration to receive or not to receive the million. I suspect that this is a way to derive FDT from mere superrationality.
Variants with imperfect predictors still generally favour one-boxing unless the predictor is absolutely terrible at predicting, simply because the loss you suffer from being predicted to take both is 1000x greater than the gain from actually taking both.