Isn’t the main problem the issue of simulating the Predictor? I also thought about this idea. The simplest variant is that the alleged Predictor estimates P(the user will choose to one-box) from the user’s previous choices via Laplace’s rule: P(next choice is a one-box) = (one-boxing trials + 1)/(trials + 2). This setup lets the player set a probability p and then watch the Predictor converge to p, thus punishing or not punishing the player.
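A minimal sketch of this Laplace-rule Predictor (the function and parameter names are my own, not from any particular formalization):

```python
import random

def simulate_predictor(p_one_box, trials=10000, seed=0):
    """Simulate a Predictor using Laplace's rule against a player
    who one-boxes independently with probability p_one_box.
    Returns the Predictor's final estimate of P(one-box)."""
    rng = random.Random(seed)
    one_box_count = 0
    for t in range(trials):
        # Predictor's estimate before seeing this trial's choice:
        # (one-boxing trials + 1) / (trials so far + 2)
        estimate = (one_box_count + 1) / (t + 2)
        if rng.random() < p_one_box:
            one_box_count += 1
    return (one_box_count + 1) / (trials + 2)
```

Over many trials the estimate converges to the player's true p, so the Predictor effectively punishes (or rewards) the player in proportion to their own chosen mixing probability.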
A more complex idea is to have the player draw a random number between 0 and 1 and choose to two-box with probability p if the number is at most r, and with probability q if it is greater than r; the Predictor would then learn the player’s random number, estimate the parameters, and withhold the million with probability p_est or q_est, depending on whether the number exceeds r_est. To sidestep precision issues, one can set p = 1 and q = 0, so that the Predictor only has to figure out the value of r.
However, in this setup the player’s choice to one-box or two-box determines whether a subsequent iteration receives the million. I suspect this is a way to derive FDT from mere superrationality.
Variants with imperfect predictors still generally favour one-boxing unless the predictor is absolutely terrible at predicting, simply because the loss you suffer from being predicted to take both boxes is 1000× greater than the gain from actually taking both.