I don’t think Nomega has to simulate you interacting with Omega in order to know how you would react should you encounter it, in the same way that you can predict the output of many computer programs without simulating them.
By the time you get mugged, you could be 100% sure that you are in the Omega world rather than the Nomega world, but the principle is that your decision in the Omega world affects the Nomega world, and so before knowing which world it is in, UDT commits to making the decision that maximizes EV across both worlds.
This logic operates in the same way for the coin coming up tails—when you see the tails, you know you're in the tails world, but your decision in the tails world affects the heads world, so you have to consider it. Likewise, your decision in the Omega world affects the Nomega world (independent of any sort of simulation argument).
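To put numbers on the heads/tails tradeoff, here's a quick sketch using the usual counterfactual-mugging payoffs (pay $100 on tails, receive $10,000 on heads; these payoffs are my own illustration, not taken from the setup above):

```python
# Counterfactual mugging EV sketch. Payoffs are illustrative assumptions:
# on heads, Omega pays you 10_000 iff you *would* pay 100 on tails;
# on tails, Omega asks you to pay 100. The coin is fair.

def ev(pays_on_tails: bool) -> float:
    heads_payoff = 10_000 if pays_on_tails else 0
    tails_payoff = -100 if pays_on_tails else 0
    return 0.5 * heads_payoff + 0.5 * tails_payoff

paying_policy = ev(True)    # committed payer
refusing_policy = ev(False) # refuser
```

Evaluated before the coin flip, the committed-payer policy comes out ahead (4950 vs 0), which is why UDT pays up even after seeing tails.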
Thus, in a situation where UDT has seen Omega, it has influence over the Omega world and the Nomega/Omega world, but no influence over the normal world or the Nomega world. Since the Omega world has so much more weight than the Nomega/Omega world, UDT will effectively act as if it's in the Omega world.
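To spell out the weighting claim, here's a tiny sketch with made-up world weights (the exact numbers and payoffs are my assumptions; the comment only claims the Omega world dominates):

```python
# Hypothetical weights: given that you've seen Omega, the Nomega/Omega
# world is assumed to get only a sliver of the probability mass.
def weighted_ev(payoff_omega: float, payoff_nomega: float,
                w_omega: float = 0.999, w_nomega: float = 0.001) -> float:
    """EV of a policy across the two worlds UDT still has influence over."""
    return w_omega * payoff_omega + w_nomega * payoff_nomega

# A policy that wins big in the Omega world beats one that wins big in
# the Nomega/Omega world, because the weights are so lopsided.
wins_in_omega = weighted_ev(1_000_000, 0)
wins_in_nomega = weighted_ev(0, 1_000_000)
```

With weights this lopsided, whichever policy does better in the Omega world does better overall, which is the sense in which UDT "acts as if" it's in the Omega world.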
This argument would also suggest that by the time you see tails, you know you live in the tails world and thus should not pay up.
Neat! This was a fun one to think about. I think the program should two-box though. Let c be the program that's the same as b, except it makes the opposite decision upon seeing (x, x). For simplicity, let's assume the only two programs in P are b and c (and I'll assume they two-box when not playing against themselves). Then, if b one-boxes against itself it makes (1/2) * 1m + (1/2) * 1k, whereas if it two-boxes against itself it makes (1/2) * 1k + (1/2) * (1m + 1k). Thus, it should two-box against itself. (I think the intuition that says to one-box is cheating a little—it assumes that by one-boxing you increase the number of one-boxers in P, which I agree would be awesome, but isn't something that b has the ability to change.)
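For concreteness, here are the two EVs from the paragraph above computed directly (same 50/50 draw over P = {b, c} and same payoffs as in the comment):

```python
# EV comparison for program b facing a uniform draw from P = {b, c},
# with the comment's payoffs: 1m = 1,000,000 and 1k = 1,000.
M, K = 1_000_000, 1_000

ev_one_box = 0.5 * M + 0.5 * K        # b one-boxes upon seeing (x, x)
ev_two_box = 0.5 * K + 0.5 * (M + K)  # b two-boxes upon seeing (x, x)
```

Under the 50/50 assumption, two-boxing against itself nets b an extra 0.5 * 1k = $500 in expectation, which is the gap driving the conclusion.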