Example scenario extended to allow notes
Suppose that when the agent enters the room, the note is either Absent, says that the Left box was predicted (and there’s no bomb in it), or says that the Right box was predicted (and therefore there’s a bomb in Left). There is probability pN of the note being present in both prediction cases, and probability pT of it being truthful in both cases where it is present. (A more complete analysis would allow these probabilities to differ, and would consider that they may be adversarially selected to extract the most damage from participants or to minimise the recorded error rates of the predictor.)
For simplicity, the probability of an incorrect prediction is the same value B for every combination of functional inputs and outputs, and according to the scenario B is very small: less than 10^-24. Again, in a more complete analysis one should consider different rates of error under different decision scenarios, and possible optimization pressures.
The simplest form of decision function then maps the type of note one sees to which box to take, although a more complete analysis would permit mixed strategies. There are eight such functions (two box choices for each of the three note types: Absent, Left, Right) instead of the two in the no-note scenario.
In particular, one possible function is F(A) = L, F(L) = R, F(R) = L, which does the opposite of whatever a note says. In this case, the predictor leaves a truthful note with probability pN pT, and whenever it does, the prediction is wrong, because the agent does the opposite of it. By the scenario constraint B < 10^-24, we know that the product pN pT < 10^-24.
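As a quick sanity check of that bound, here is a small sketch of the calculation in Python. It is my own illustration rather than part of the scenario, and the example values of pN and pT are deliberately far too large, just to make the arithmetic visible.

```python
# A small sanity check of the pN*pT bound (my own sketch, with illustrative
# numbers): for the "diagonalizing" function F(A)=L, F(L)=R, F(R)=L, compute
# the probability that the predictor's prediction ends up wrong, for each
# prediction it could make.

def agent_choice(note):
    # note is None (Absent), "L" (says Left was predicted), or "R" (says Right)
    return {None: "L", "L": "R", "R": "L"}[note]

def error_probability(prediction, pN, pT):
    """P(agent's actual choice != prediction), given this decision function."""
    p_err = 0.0
    # No note shown (probability 1 - pN).
    if agent_choice(None) != prediction:
        p_err += 1 - pN
    # Note shown and truthful: it reports the actual prediction.
    if agent_choice(prediction) != prediction:
        p_err += pN * pT
    # Note shown but lying: it reports the opposite prediction.
    lie = "R" if prediction == "L" else "L"
    if agent_choice(lie) != prediction:
        p_err += pN * (1 - pT)
    return p_err

pN, pT = 0.5, 0.9  # example values, far too large to satisfy the scenario
print(error_probability("L", pN, pT))  # pN*pT          = 0.45
print(error_probability("R", pN, pT))  # (1-pN) + pN*pT = 0.95
```

Whatever the predictor predicts against this function, it is wrong with probability at least pN pT, so the scenario’s bound B < 10^-24 forces pN pT < 10^-24.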
So it is inconsistent with the scenario that the predictor always leaves a truthful note. Either the predictor is forced into error with probability very much greater than 10^-24, which violates the scenario premises, or it almost always does not leave a truthful note. It is also possible that the predictor could arrange for the agent to be unable to decide in this way, but that’s very much outside the bounds of decision theory.
It is still possible that pT is close to 1, but less than 1 - U(-$100)/U(death). In this case pN can be at most about 10^-24, and your optimal decision function is F(A) = L, F(L) = R, F(R) = R. That is, take the left box unless you see a note saying Right, in which case you should pick Right. If you see no note, then with probability at least 0.999999999999999999999999 you will save $100 and not burn to death. In the extremely unlikely case that you do see a note, you pick the right box even if it says Left, because you can’t be sufficiently confident that it’s true.
If pT >= 1 - U(-$100) / U(death), then the optimum is F(A) = L, F(L) = L, F(R) = R, because you are now nearly certain that the note is truthful, and the risk of picking the left box on its say-so is worth taking.
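Here is a minimal sketch of that comparison, again my own illustration rather than anything fixed by the scenario: the utility values are placeholders, and the note branches ignore the predictor’s own tiny error rate B, as in the reasoning above.

```python
# Minimal sketch of the expected-utility comparison above (placeholder
# utilities, my own illustration). For each observation -- no note ("A"), a
# note saying Left ("L"), or a note saying Right ("R") -- compare taking the
# Left box against taking the Right box and report the better choice.

B = 1e-24            # predictor's error probability (scenario bound)
U_DEATH = -1e12      # utility of burning to death (assumed placeholder)
U_COST = -100.0      # utility of paying $100 for the right box
U_FREE = 0.0         # utility of taking a bomb-free left box

def p_bomb_in_left(observation, choice, pT):
    """Probability that Left contains a bomb, given what you see and what
    your decision function outputs for that observation."""
    if observation == "A":
        # No note: the predictor predicted your actual choice with
        # probability 1-B, and the bomb is in Left iff it predicted Right.
        return B if choice == "L" else 1 - B
    if observation == "L":
        # Note claims Left was predicted (no bomb); it lies with
        # probability 1-pT. (Ignores B, as in the text.)
        return 1 - pT
    # observation == "R": note claims Right was predicted (bomb);
    # it is truthful with probability pT.
    return pT

def best_choice(observation, pT):
    def eu(choice):
        if choice == "R":
            return U_COST  # the right box always costs $100 and never explodes
        p = p_bomb_in_left(observation, choice, pT)
        return p * U_DEATH + (1 - p) * U_FREE
    return max("LR", key=eu)

threshold = 1 - U_COST / U_DEATH   # = 1 - U(-$100)/U(death) from the text
for pT in (threshold - 1e-12, threshold + 1e-12):
    print({obs: best_choice(obs, pT) for obs in "ALR"})
# Just below the threshold: {'A': 'L', 'L': 'R', 'R': 'R'}
# Just above the threshold: {'A': 'L', 'L': 'L', 'R': 'R'}
```

Because the choice for each observation only affects outcomes in that branch, picking the better box per observation is equivalent to searching over all eight decision functions.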
You’ll also need to update the content of the note and the predictor’s decision process to take into account that the agent may see a note. In particular, the predictor needs to decide whether to show a note in the simulation, and may need to run multiple simulations.
That would have been a different set of example scenarios. For this one, I chose a predictor with fixed probabilities. I am not assuming any particular means by which the predictor arrives at the prediction (such as simulation), and don’t care about anything to do with self-locating uncertainty (such as being a conscious entity created by such a simulation).
Feel free to follow up with FDT analysis of a different scenario, if you prefer it.
Though I will note that when the agent is uncertain of the actual constraints of the scenario, the conclusion is almost always simpler: take the right box every time, because even a tiny amount of uncertainty makes the left box too unsafe to be worth taking. The more moving parts you introduce into a scenario with this basic premise, the more likely it is that “always take the right box” is the correct decision, which is boring.
Let me try again:
Does the note say that I was predicted to choose the right box regardless of what notes I am shown, and therefore the left box contains a bomb? Then the predictor is malfunctioning and I should pick the right box.
Does the note say that I was predicted to choose the right box when told that the left box contains a bomb, and therefore the left box contains a bomb? Then I should pick the left box, to shape what I am predicted to do when given that note.
In my scenario, there is a probability pN that the predictor leaves a note, regardless of what prediction the predictor makes. The note (when present) is always of the form “I predicted that you will pick the <right|left> box, and therefore <did|did not> put a bomb in the left box.” That is, the note is about what the predictor thinks you will do, which more closely matches your second paragraph. Your first paragraph concerns a prediction about what you counterfactually would have done in some other situations and is not relevant in this scenario.
However, your decision process should consider the probability 1-pT that the note is lying about the predictor’s actual prediction (and therefore bomb-placing).
If the note predicts the outcome of your decision after seeing the note, then you are free to diagonalize the note (do the opposite of what it predicts). Doing so would contradict the premise that the predictor is good at making predictions (if the prediction must be this detailed and the note must remain available even when diagonalized), because the prediction is going to be mostly wrong by construction, whatever it is. Transparent Newcomb, for example, is designed to avoid this issue while gesturing at a similar phenomenon.
This kind of frame-breaking thought experiment is not useful for illustrating the framings it breaks (in this case, FDT). It can be useful for illustrating or motivating some different (maybe novel) framing that does manage to make sense of the new thought experiment, but that’s only productive when it actually happens, and it’s easy to break framings without motivating any additional insight (as opposed to finding a central, within-framing error in an existing theory). So this is somewhat useful to recognize, to avoid too much unproductive confusion when the framings that usually make sense get broken.
In my scenario there is probability pT that the note is truthful (and the value of pT is assumed known to the agent making the decision). It is possible that pT = 1, but only for pN < 10^-24 so as to preserve the maximum 10^-24 probability of the predictor being incorrect.