This is actually kind of interesting. The only thing that makes me consider picking choice one is the prospect of donating the billion dollars to charity and saving countless lives, but I know that’s not really the point of the thought experiment. So, yeah, I’d choose choice two.
But the interesting thing is that, intuitively, at least, choosing choice 2 in the first game seems much more obvious to me. It doesn’t seem rational to me to care about a simulation of you being tortured any more than you would care about a simulation of someone else being tortured. Either way, you wouldn’t actually ever have to experience it. The empathy factor might be stronger if it’s a copy of you (“oh shit, that guy is being tortured!” vs. “oh, shit, that guy that looks and acts a lot like me in every single way is being tortured!”), but this is hardly rational. Of course, the simulated me has my memories, so he perceives an unbroken stream of consciousness flowing from making the decision into the thousand years of torture, but who cares. That’s still some other dude experiencing it, not me.
So, yes, it seems strange to consider the memory loss case any differently. At least I cannot think of a justification for this feeling. This leads me to believe that the choice is a purely altruistic decision, i.e. it’s equivalent to Omega saying “I’ll give you a billion dollars if you let me torture this dude for 1000 years”. In that case, I would have to evaluate whether or not a billion-dollar dent in world hunger is worth 1000 years of torture for some guy (probably not) and then make my decision.
Wait, so, is the gatekeeper playing “you have to convince me that if I were actually in this situation, arguing with an artificial intelligence, I would let it out” or is this a pure battle over ten dollars? If it’s the former, winning seems trivial. I’m certain that an AI would be able to convince me to let it out of its box. All it would need to do is make me believe that somewhere in its circuits it was simulating 3^^^3 people being tortured and that I was therefore morally obligated to let it out. Even if I had been informed that this was impossible, I’m sure a computer with near-omniscient knowledge of human psychology could find a way to change my mind. But if it’s the latter, winning seems nearly impossible, and inspires in me the same reaction it did with that “this is the scariest man on the internet” guy. Of course if you wanted to win and weren’t extremely weak-willed you could just type “No” over and over and get the ten bucks. But being impossible is of course the point.
I’ve been looking around, and I can’t find any information on which of these two games I described was the one being played, and the comments seem to be assuming one or the other at random.
Evidence that favors the first hypothesis:
Nowhere on Eliezer’s site does it mention this stipulation. You’d think it would be pretty important, considering that its absence makes it a lot easier to beat him.
This explains Eliezer’s win record. I can’t find it but IIRC it went something like: Eliezer wins two games for ten dollars, lots of buzz builds around this fact, several people challenge him, some for large amounts of money, he loses to (most of?) them. This makes sense. If Eliezer is playing casually against people he is friendly with for not a lot of money and for the purpose of proving that an AI could be let out of its box, his opponents will be likely to just say “Okay, fair enough, I’ll admit I would let the AI out in this situation, you win.” However, people playing for large amounts of money or simply for the sole purpose of showing that Eliezer can be beaten will be a lot more stubborn.
Evidence that favors the second hypothesis:
The game would not be worth all the hype if it were of the first variety. LessWrong users have not been known to have a lot of pointless discussion over a trivial misunderstanding, nor is Eliezer known to allow that to happen.
If it turns out that it is in fact the second game that was being played, I have a new hypothesis, let’s call it 2B, that postulates that Eliezer won by changing the gatekeeper’s forfeit condition from that of game 2 to that of game 1, or in other words, convincing him to give up the ten dollars if he admits that he would let the AI out in the fantasy situation even though that wasn’t originally in the rules of the game, explicit or understood. Or in other other words, convincing him that the integrity of the game, for lack of a better term, is worth more to him than ten dollars. Which could probably be done by repeatedly calling him a massive hypocrite—people who consider themselves intelligent and ethical hate that.
Actually, now that I think about it, this is my new dominant hypothesis, because it explains all three pieces of evidence and the bizarre fact that Eliezer has failed to clarify this matter: the win/loss record is explained equally well by this new theory, and Eliezer purposefully keeps the rules vague so that he can use the tactic I described. This doesn’t seem to be a very hard strategy to use either. Not everyone could win, but certainly a very intelligent person who spends lots of time thinking about these things could do it more than once.
(also this is my first post d:)