General question: suppose we encounter an alien. We have no idea what its motivations, values, goals, or abilities are. On the other hand, it may have observed any amount of human comm traffic from wireless EM signals since the invention of radio, and from actual spy-probes before the human invention of high tech that would detect them.
It signals us in Morse code from its remote starship, offering mutually beneficial trade.
What prior should we have about the alien’s intention? Should we use a naive uniform prior that would tell us it’s as likely to mean us good as harm, and so never reply because we don’t know how it will try to influence our actions via communications? Should it tell us different agents who don’t explicitly value one another will conflict to the extent their values differ, and so since value-space is vast and a randomly selected alien is unlikely to share many values with us, we should prepare for war? Should it tell us we can make some assumptions (which?) about naturally evolved agents or their Friendly-to-themselves creations? How safe are we if we try to “just read” English text written by an unknown, possible superintelligence which may have observed all our broadcast traffic since the age of radio? What does our non-detection of this alien civ until they chose to initiate contact tell us? Etc.
A 50% chance of meaning us good vs harm isn’t a prior I find terribly compelling.
There’s a lot to say here, but my short answer is that this is both an incredibly dangerous and incredibly valuable situation, in which both the potential opportunity costs and the potential actual costs are literally astronomical, and in which there are very few things I can legitimately be confident of.
The best I can do in such a situation is to accept that my best guess is overwhelmingly likely to be wrong, but that it’s slightly less likely to be wrong than my second-best guess, so I should operate on the basis of my best guess despite expecting it to be wrong. Where “best guess” here is the thing I consider most likely to be true, not the thing with the highest expected value.
I should also note that my priors about aliens in general—that is, what I consider likely about a randomly selected alien intelligence—are less relevant to this scenario than what I consider likely about this particular intelligence, given that it has observed us for long enough to learn our language, revealed itself to us, communicated with us in Morse code, offered mutually beneficial trade, etc.
The most tempting belief for me is that the alien’s intentions are essentially similar to ours. I can even construct a plausible-sounding argument for that as my best guess… we’re the only other species I know capable of communicating the desire for mutually beneficial trade in an artificial signalling system, so our behavior constitutes strong evidence for their behavior. OTOH, it’s pretty clear to me that the reason I’m tempted to believe that is that I can do something with that belief; it gives me a lot of traction for thinking about what to do next. (In a nutshell, I would conclude from that assumption that it means to exploit us for its long-term benefit, and whether that’s good or bad for us depends entirely on what our most valuable-to-it resources are and how it can most easily obtain them and whether we benefit from that process.) Since that has almost nothing to do with the likelihood of it being true, I should distrust my desire to believe that.
Ultimately, I think what I do is reply that I value mutually beneficial trade with them, but that I don’t actually trust them and must therefore treat them as a potential threat until I have gathered more information about them, while at the same time refraining from doing anything that would significantly reduce our chances of engaging in mutually beneficial trade in the future, and what do they think about all that?
He can certainly give them counterfactual ‘realities’. It would seem that he should be assumed to at least provide counterfactual realities wherein information provided by the simulation’s representation of Omega indicates that he is perfectly trustworthy.
Even if he’s not, after he’s given a $1m simulated reward, does he then have to keep up a simulated environment for the sim to actually spend the money?
No. But if for whatever reason the simulated environment persists, it should be one that is consistent with Omega keeping his word. Or, if part of the specification of the problem or the declarations made by Omega directly pertain to claims about what he will do regarding simulation, then he will implement that policy.
Omega (who experience has shown is always truthful)
Omega doesn’t need to simulate the agent actually getting the reward. After the agent has made its choice, the simulation can just end.
If we are assuming that Omega is trustworthy, then Omega needs to be assumed to be trustworthy in the simulation too. If they didn’t allow the simulated version of the agent to enjoy the fruits of their choice, then they would not be trustworthy.
Actually, I’m not sure this matters. If the simulated agent knows he’s not getting a reward, he’d still want to choose so that the nonsimulated version of himself gets the best reward.
So the problem is that the best answer is unavailable to the simulated agent: in the simulation you should one box and in the ‘real’ problem you’d like to two box, but you have no way of knowing whether you’re in the simulation or the real problem.
Agents that Omega didn’t simulate don’t have the problem of worrying whether they’re making the decision in a simulation or not, so two boxing is the correct answer for them.
The two agents face very different decisions: one has to make the decision twice, with the first choice affecting the payoff of the second, while the other has to make it only once. So I think that in reality the problem perhaps does collapse down to an ‘unfair’ one, because the TDT agent is presented with an essentially different problem from the non-TDT agent.
Then the simulated TDT agent will one-box in Problem 1 so that the real TDT agent can two-box and get $1,001,000. The simulated TDT agent will pick a box randomly with a uniform distribution in Problem 2, so that the real TDT agent can select box 1 like CDT would.
(If the agent is not receiving any reward, it will act in a way that maximises the reward agents sufficiently similar to it would receive. In this situation of ‘you get no reward’, CDT would be completely indifferent and could not be relied upon to set up a good situation for future actual CDT agents.)
Of course, this doesn’t work if the simulated TDT agent is not aware that it won’t receive a reward. This strays pretty close to “Omega is all-powerful and out to make sure you lose”-type problems.
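To make the Problem 1 part of that argument concrete, here is a rough sketch that enumerates the policies available to the agent. It assumes the usual Newcomb payoffs mentioned in this thread ($1,000 in Box A, $1,000,000 in Box B if the simulation one-boxed) and that the simulated copy receives nothing. The function names and the `sim_knows_it_is_simulated` flag are made up for illustration; this is not a full TDT implementation, just the payoff bookkeeping.

```python
from itertools import product

BOX_A = 1_000            # always present in Box A
BOX_B_FULL = 1_000_000   # in Box B only if the simulation one-boxed
CHOICES = ("one-box", "two-box")


def real_payoff(sim_choice: str, real_choice: str) -> int:
    """Money received by the real agent, given what its simulation chose."""
    box_b = BOX_B_FULL if sim_choice == "one-box" else 0
    return box_b + (BOX_A if real_choice == "two-box" else 0)


def best_policy(sim_knows_it_is_simulated: bool) -> tuple:
    """Return the (sim_choice, real_choice) pair that maximises the real payoff.

    If the simulation can tell it is a simulation, the two choices may differ.
    If it cannot, both copies run the same computation on the same inputs and
    must make the same choice.
    """
    if sim_knows_it_is_simulated:
        policies = list(product(CHOICES, repeat=2))
    else:
        policies = [(c, c) for c in CHOICES]
    return max(policies, key=lambda p: real_payoff(*p))


if __name__ == "__main__":
    for knows in (True, False):
        policy = best_policy(knows)
        print(knows, policy, real_payoff(*policy))
```

Under these assumptions, an agent that knows when it is being simulated can split its behaviour (the simulation one-boxes, the real copy two-boxes) and collect $1,001,000, while an agent that cannot tell the difference is limited to one-boxing in both roles for $1,000,000.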
Of course, this doesn’t work if the simulated TDT agent is not aware that it won’t receive a reward.
The simulated TDT agent is not aware that it won’t receive a reward, and therefore it does not work.
This strays pretty close to “Omega is all-powerful and out to make sure you lose”-type problems.
Yeah, it doesn’t seem right to me that the decision theory being tested is used in the setup of the problem. But I don’t think that the ability to simulate without rewarding the simulation is what pushes it over the threshold of “unfair”.
I don’t think that the ability to simulate without rewarding the simulation is what pushes it over the threshold of “unfair”.
It only seems that way because you’re thinking from the non-simulated agent’s point of view. How do you think you’d feel if you were a simulated agent, and after you made your decision Omega said ‘Ok, cheers for solving that complicated puzzle, I’m shutting this reality down now because you were just a simulation I needed to set a problem in another reality’. That sounds pretty unfair to me. Wouldn’t you be saying ‘give me my money you cheating scum’?
And as has already been pointed out, they’re very different problems. If Omega actually is trustworthy, integrating across all the simulations gives infinite utility for all the (simulated) TDT agents and a total $1,001,000 utility for the (supposedly non-simulated) CDT agent.
It only seems that way because you’re thinking from the non-simulated agent’s point of view. How do you think you’d feel if you were a simulated agent, and after you made your decision Omega said ‘Ok, cheers for solving that complicated puzzle, I’m shutting this reality down now because you were just a simulation I needed to set a problem in another reality’. That sounds pretty unfair to me. Wouldn’t you be saying ‘give me my money you cheating scum’?
We were discussing if it is a “fair” test of the decision theory, not if it provides a “fair” experience to any people/agents that are instantiated within the scenario.
And as has already been pointed out, they’re very different problems. If Omega actually is trustworthy, integrating across all the simulations gives infinite utility for all the (simulated) TDT agents and a total $1,001,000 utility for the (supposedly non-simulated) CDT agent.
I am aware that they are different problems. That is why the version of the problem in which simulated agents get utility that the real agent cares about does nothing to address the criticism of TDT that it loses in the version where simulated agents get no utility. Postulating the former in response to the latter was a failure to use the Least Convenient Possible World.
The complaints about Omega being untrustworthy are weak. Just reformulate the problem so Omega says to all agents, simulated or otherwise, “You are participating in a game that involves simulated agents and you may or may not be one of the simulated agents yourself. The agents involved in the game are the following: <describes agents’ roles in third person>”.
The complaints about Omega being untrustworthy are weak. Just reformulate the problem so Omega says to all agents, simulated or otherwise, “You are participating in a game that involves simulated agents and you may or may not be one of the simulated agents yourself. The agents involved in the game are the following: <describes agents’ roles in third person>”.
Good point.
That clears up the summing utility across possible worlds possibility, but it still doesn’t address the fact that the TDT agent is being asked to (potentially) make two decisions while the non-TDT agent is being asked to make only one. That seems to me to make the scenario unfair (it’s what I was trying to get at in the ‘very different problems’ statement).
The simulated TDT agent is not aware that it won’t receive a reward, and therefore it does not work.
This raises an interesting problem, actually. Omega could pose the following question:
Here are two boxes, A and B; you may choose either box, or take both. You are in one of two states of nature, with equal probability: one possibility is that you’re in a simulation, in which case you will receive no reward, no matter what you choose. The other possibility is that a simulation of this problem was presented to an agent running TDT. I won’t tell you what the agent decided, but I will tell you that if the agent two-boxed then I put nothing in Box B, whereas if the agent one-boxed then I put $1 million in Box B. Regardless of how the simulated agent decided, I put $1000 in Box A. Now please make your choice.
The solution for a TDT agent seems to be choosing box B, but there may be similar games where it makes sense to run a mixed strategy. I don’t think that it makes much sense to rule out the possibility of running mixed strategies across simulations, because in most models of credible precommitment the other players do not have this kind of foresight (although Omega possibly does).
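As a rough check on why choosing box B looks right here, the sketch below works through the expected payoffs for the question as posed. It assumes the agent's choice is the same whether it is the simulation or the real player (so the simulated TDT agent's choice matches its own), and it only compares "B" against "both", since the problem statement does not say what goes in Box B if the simulation takes only A. All names are illustrative.

```python
BOX_A = 1_000            # always in Box A
BOX_B_FULL = 1_000_000   # in Box B only if the simulated TDT agent took B alone
P_REAL = 0.5             # stated probability of being the real (rewarded) player


def real_payoff(choice: str, sim_choice: str) -> int:
    """Payoff in the real game, given the simulated TDT agent's earlier choice."""
    if choice not in ("B", "both") or sim_choice not in ("B", "both"):
        raise ValueError("only 'B' and 'both' are modelled here")
    box_b = BOX_B_FULL if sim_choice == "B" else 0
    return box_b if choice == "B" else BOX_A + box_b


def expected_payoff_linked(policy: str) -> float:
    """Expected payoff when the simulation and the real player share one policy.

    The simulated branch pays nothing, so only the real branch contributes.
    """
    return P_REAL * real_payoff(policy, policy)


if __name__ == "__main__":
    for policy in ("B", "both"):
        print(policy, expected_payoff_linked(policy))
    # A real player whose choice is NOT linked to the simulation's
    # (for example, CDT facing a simulation that took B) does better:
    print("unlinked two-boxer:", real_payoff("both", "B"))
```

With these numbers the linked policy "B" has an expected payoff of $500,000 against $500 for "both", while a real player whose choice is not tied to the simulation's can still walk away with $1,001,000.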
And yes, it is still the case that a CDT agent can outperform TDT, as long as the TDT agent knows that if she is in a simulation, her choice will influence a real game played by a TDT, with some probability. Nevertheless, as the probability of “leaking” to CDT increases, it does become more profitable (AIUI) for TDT to two-box with low probability.
The simulated TDT agent is not aware that it won’t receive a reward, and therefore it does not work. … I don’t think that the ability to simulate without rewarding the simulation is what pushes it over the threshold of “unfair”.
I do agree. I think my previous post was still exploring the “can TDT break with a simulation of itself?” question, which is interesting but orthogonal.
Omega doesn’t need to simulate the agent actually getting the reward. After the agent has made its choice, the simulation can just end.
Omega is supposed to be always truthful, so either he rewards the sims as well, or you know something the sims don’t and hence it’s not obvious you’ll do the same as them.
I thought Omega was allowed to lie to sims.
Even if he’s not, after he’s given a $1m simulated reward, does he then have to keep up a simulated environment for the sim to actually spend the money?
If he can lie to sims, then you can’t know he’s not lying to you unless you know you’re not a sim. If you do, it’s not obvious you’d choose the same way as if you didn’t.
For instance, if you think Omega is lying and completely ignore everything he says, you obviously two-box.
Why not zero-box in this case? I mean, what reason would I have to expect any money at all?
Well, as long as you believe Omega enough to think no box contains sudden death or otherwise negative utility, you’d open them to see what was inside. But yes, you might not believe Omega at all.