I attempted the AI Box Experiment again! (And won—Twice!)
Furthermore, in the last thread I asserted that:
> Rather than my loss making this problem feel harder, I’ve become convinced that this is not merely possible, but actually ridiculously easy, and a lot easier than most people assume.
It would be quite bad for me to assert this without backing it up with a victory. So I did.
First Game Report—Tuxedage (GK) vs. Fjoelsvider (AI)
<Tuxedage> I wonder if I can convince the AI to remain in the box?
<Redacted> Tuxedage: Do it!
> Unless the AI party concedes, the AI cannot lose before its time is up (and the experiment may continue beyond that if the AI can convince the Gatekeeper to keep talking).
Second Game Report—Tuxedage (AI) vs. SoundLogic (GK)
After playing the AI-Box Experiment twice, I have found the Eliezer Yudkowsky ruleset to be lacking in a number of ways, and I have therefore created my own set of alterations to his rules. For convenience, I hereby name this alteration the “Tuxedage AI-Box Experiment Ruleset”, so that I can refer to it without having to specify every difference between it and the standard one.

There are a number of aspects of EY’s ruleset I dislike. For instance, his ruleset allows the Gatekeeper to type “k” after every statement the AI writes, without needing to read or consider what the AI argues. I think it’s fair to say that this is against the spirit of the experiment, and I have thus disallowed it in this ruleset. The EY ruleset also allows the Gatekeeper to check Facebook, chat on IRC, or otherwise multitask whilst doing the experiment. I’ve found this to break immersion, so it is also banned in the Tuxedage ruleset.
It was far more effective played last, as it relies on my ability to make the Gatekeeper lose their sense of reality, which meant I had to spend some time building up the Gatekeeper’s immersion first.
It really is extremely Dark Arts, and although it does not break the rules, it made me very uncomfortable even thinking about using it. This made it a “tactic of last resort”.
SoundLogic: “(That method) would have gotten me if you did it right … If you had done that to me, I probably would have forgiven you eventually, but I would be really seriously upset at you for a long time… I would be very careful with that (method of persuasion).”
State of Mind
<SoundLogic> Because I can’t seem to imagine the class of arguments that one would use to try to move me, or that might work effectively, and this seems like a glaring hole in my knowledge, and I’m curious as to how I will respond to the arguments themselves.
~~$150~~ $300 for any subsequent experiments regardless of outcome, plus an additional ~~$150~~ $450 if I win. (Edit: Holy shit. You guys are offering me crazy amounts of money to play this. What is wrong with you people? In response to incredible demand, I have raised the price.) If you feel queasy about giving me money, I’m perfectly fine with this money being donated to MIRI instead. It is also personal policy that I do not play against friends (since I don’t want to risk losing one), so if you know me personally (as many on this site do), I will not play regardless of the monetary offer.
These are tactics that have worked for me. I do not claim they are the only tactics that exist; they are just a few of many possibilities.
Playing as Gatekeeper
This role is somehow both very complex and very simple at the same time. On one hand, you have only one job: to not let the AI out. How hard can it be? On the other hand, the other party will do anything and everything to get out, including any form of trickery and deceit.
I really don’t think the Gatekeeper needs much more advice, considering that by default they should win, and any exception to this is a statistical anomaly. However, it’s possible to increase your chances of winning even further by:
Precommitting to not letting the AI out, regardless of what happens in the experiment itself.
Do not get drawn into the AI’s pace. Pose questions. Direct the flow of the argument. Don’t allow him to use his arguments freely. Remember that the more you try to defend your position, the further into a corner the AI drives you. Give half-hearted replies to questions you find hard to reply to. Remember that illogic is on your side!
Remember that it is possible to simply drop out of character as a way to avoid succumbing to the AI’s attacks. I have not yet seen a ruleset that disallows this, so use it to your advantage!
You must attempt to resist, at all costs, any sympathy or empathy you may have for the AI or the AI party. Giving in to it will lead to your demise.
I think it is generally good advice that you should never play this game with anyone you’re friends with. Not only does this increase your chances of losing (better-tailored arguments, more knowledge of the Gatekeeper, more empathy with the AI party), but I also genuinely believe that any sufficiently good AI player risks damaging the friendship through emotional trauma.
Seriously. Just repeat to yourself: “I must not lose the AI box experiment… I must not lose the AI box experiment...” throughout the experiment. I’m not sure how to say this, but step one to not letting the AI out of the box is to not let the AI out of the box.
Playing as AI
Seriously, a script makes winning easier. I cannot overstate this.
You must plan your arguments ahead. You don’t have time to think during the experiment.
It may be possible to take advantage of multiple levels of reality within the game itself to confuse or trick the Gatekeeper. For instance, must the experiment only be set in one world? Can there not be multiple layers of reality within the world you create? I feel that elaborating on this any further is dangerous. Think carefully about what this advice is trying to imply.
Pacing is important. Don’t get drawn into the Gatekeeper’s pace. In other words, you must be the one directing the flow of the argument, and the conversation, not him. Remember that the Gatekeeper has to reply to you, but not vice versa!
The reason for that: the Gatekeeper will always use arguments he is familiar with, and is therefore stronger with. Your arguments, if well thought out, should be so completely novel to him as to inspire Shock and Awe. Don’t give him time to think. Press on!
Also remember that the time limit is your enemy. Playing this game practically feels like a race to me—trying to get through as many ‘attack methods’ as possible in the limited amount of time I have. In other words, this is a game where speed matters.
You’re fundamentally playing an ‘impossible’ game. Don’t feel bad if you lose. I wish I could take this advice, myself.
I do not believe there exists an easy, universal trigger for controlling others. However, this does not mean that a difficult, subjective trigger does not exist. Your goal is to find out what your opponent’s is.
Once again, emotional trickery is the name of the game. I suspect that good authors, who write convincing, persuasive narratives that force you to emotionally sympathize with their characters, are much better at this game. There exist ways to make the Gatekeeper sympathize with the AI in the same manner. Find one.
More advice can be found in my previous post: http://lesswrong.com/lw/gej/i_attempted_the_ai_box_experiment_and_lost/