For the simulation-output variant of ASP, let’s say the agent’s possible actions/outputs consist of all possible simulations Si (up to some specified length), concatenated with “one box” or “two boxes”. To prove that any given action has utility greater than zero, the agent must prove that the associated simulation of the predictor is correct. Where does your algorithm have an opportunity to commit to one-boxing before completing the simulation, if it’s not yet aware that any of its available actions has nonzero utility? (Or would that commitment require a further modification to the algorithm?)

For the simulation-as-key variant of ASP, what principle would instruct a (modified) UDT algorithm to redact some of the inferences it has already derived?

Simulation-output: It would require a modification to the algorithm. I don’t find this particularly alarming, though, since the algorithm was intended as a minimally complex solution that behaves correctly for good reasons, not as a final, fully general version. To commit early, the agent would have to first (or at least soon enough for the predictor to simulate it) look for ways to partition its output into pieces and consider choosing each piece separately. There would have to be some heuristic for deciding which partitionings of the output to consider and how much computational power to devote to each, and the partitioning that actually gets used would be whichever one the agent expects to yield the highest utility. Come to think of it, this might be trickier than I first thought, because you would run into self-trust issues if you need to prove that you will output the correct simulation of the predictor. This could be fixed by delegating the task of fully simulating the predictor to an easier-to-model subroutine, though that would require further modification to the algorithm.
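As a rough illustration of the partitioning idea, here is a toy sketch in Python. It is not the actual algorithm: the payoffs, the step costs, the predictor's budget, and every name in it (`STEP_COST`, `PREDICTOR_BUDGET`, `utility`, `best_plan`) are assumptions made up for the example. It also sidesteps the self-trust issue entirely by treating the simulation as a fixed-cost piece handed to a subroutine. The only point is that an agent which scores orderings of its output pieces can discover that fixing the box choice first is worth more than simulating first.

```python
from itertools import permutations

# Hypothetical Newcomb-style payoffs (not taken from the discussion above).
BIG, SMALL = 1_000_000, 1_000

# Hypothetical cost, in agent steps, of settling each piece of the output.
# "simulation" stands for the delegated predictor-simulation subroutine.
STEP_COST = {"choice": 1, "simulation": 100}

# Assumed limit on how many agent steps the weaker predictor can follow.
PREDICTOR_BUDGET = 10


def steps_until_choice(order):
    """Count the agent steps spent up to the point where the box choice is fixed."""
    steps = 0
    for piece in order:
        steps += STEP_COST[piece]
        if piece == "choice":
            return steps
    raise ValueError("ordering never fixes the choice")


def utility(order, choice):
    """Toy payoff: the big box is filled only if the predictor sees one-boxing in time."""
    committed_in_time = steps_until_choice(order) <= PREDICTOR_BUDGET
    box_filled = committed_in_time and choice == "one box"
    return (BIG if box_filled else 0) + (SMALL if choice == "two boxes" else 0)


def best_plan():
    """Score every ordering of the output pieces with every choice; keep the best."""
    candidates = [
        (order, choice, utility(order, choice))
        for order in permutations(STEP_COST)
        for choice in ("one box", "two boxes")
    ]
    return max(candidates, key=lambda c: c[2])
```

In this toy setup, `best_plan()` picks the ordering that settles the choice before running the simulation, together with "one box". If the simulation comes first, the commitment arrives after the predictor's budget is exhausted, so one-boxing earns nothing and two-boxing earns only the small payoff.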

Simulation-as-key: I don’t have a good answer to that.
