Thanks! I really appreciate it. I tried hard to find a recorded case of a non-EY victory, but couldn't. That post was obscure enough to evade my Google-Fu; I'll update my post with this information.
Though I have to admit it's disappointing that the AI player himself didn't write about his thoughts on the experiment. I was hoping for a more detailed post. Also, damn. That guy deleted his account. Still, thanks. At least I know I'm not the only AI that has won, now.
Who’s to say I’m not the AI player from that experiment?
That experiment was played according to the standard EY ruleset, though I think your ruleset is an improvement. Like you, the AI player from that experiment was quite confident he would win before playing; even though he did in fact win, that level of confidence was still excessive.
I think both you and Eliezer played a far better game than the AI player from that experiment. The AI player from that experiment did (independently) play in accordance with much of your advice, including:
Always research the gatekeeper beforehand. Knowing his personality traits is a huge advantage.
The first step during the experiment must always be to build rapport with the gatekeeper.
You can’t use logic alone to win.
Breaking immersion and going meta is not against the rules. In the right situation, you can use it to win. Just don’t do it at the wrong time.
On the same note, look for signs that a particular argument is making the gatekeeper crack. Once you spot it, push it to your advantage.
I agree with:
I do not believe there exists an easy, universal trigger for controlling others. However, this does not mean that there does not exist a difficult, subjective trigger. Finding out what your opponent's trigger is, is your goal.
I am <1% confident that humanity will successfully box every transhuman AI it creates, given that it creates at least one. Even if AIs #1, #2, and #3 get properly boxed (and I agree with the Gatekeeper from the experiment I referenced that that's a very big if), it won't matter once AI #4 gets released a year later, because its programmers assumed all of Eliezer's (well-justified) claims about AI were wrong, and thought that having one of them watch the terminal at a time would be safety enough.
At least I know I’m not the only AI that has won, now.
Good thing to know, right? :D
My own (admittedly, rather obvious) musings: The only reason more people haven't played as the AI and won is that almost all people capable of winning as the AI are either unaware of the experiment, or are aware of it but just don't have a strong enough incentive to play as the AI (note that you've asked for a greater incentive now that you've won just once as AI, and Eliezer similarly has stopped playing). I am ~96% confident that at least .01% of Earth's population is capable of winning as the AI, and I increase that to >99% confidence if all of Earth's population were forced to stop and actually think about the problem for 5 minutes. Additionally, if the comment I linked to is truly the only record of a non-Eliezer AI win before you made this post, I am ~50% (adjusted downward from 70% upon further reflection) confident that an unrecorded AI win had occurred prior to your making this post.
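For scale, that .01% figure already implies a large pool of potential winning AI players. A rough back-of-the-envelope sketch (the ~7 billion world population is my assumption, not a figure from this thread):

```python
# Rough check of what ".01% of Earth's population" means in absolute terms.
# world_population is an assumed ballpark, not a figure from the discussion.
world_population = 7_000_000_000
fraction_capable = 0.0001  # .01% expressed as a fraction

capable_players = int(world_population * fraction_capable)
print(capable_players)  # 700000
```

So even the conservative lower bound suggests hundreds of thousands of people capable of winning as the AI, which makes the scarcity of recorded wins look like an incentive problem rather than a skill problem.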
Anyways, congratulations on your victory! I do hope to see you win again as the AI, so I commit to donating $50 to MIRI if you do win again as the AI and post about it on Less Wrong similarly to how you made this post.
Who’s to say I’m not the AI player from that experiment?
Are you? I’d be highly curious to converse with that player.
I think you’re highly overestimating your psychological abilities relative to the rest of Earth’s population. The only reason more people haven’t played as the AI and won is that almost all people capable of winning as the AI are either unaware of the experiment, or are aware of it but just don’t have a strong enough incentive to play as the AI (note that you’ve asked for a greater incentive now that you’ve won just once as AI, and Eliezer similarly has stopped playing). I am ~96% confident that at least .01% of Earth’s population is capable of winning as the AI, and I increase that to >99% confident if all of Earth’s population was forced to stop and actually think about the problem for 5 minutes.
I have neither stated nor believed that I'm the only person capable of winning, nor do I think this is some exceptionally rare trait. I agree that a significant number of people would be capable of winning once in a while, given sufficient experience in games, effort, and forethought. If I gave any impression of arrogance, or of somehow claiming to be unique or special in some way, I apologize for that impression. Sorry. It was never my goal to give it.
However, top .01% isn’t too shabby. Congratulations on your victory. I do hope to see you win again as the AI, so I commit to donating $50 to MIRI if you do win again as the AI and post about it on Less Wrong similarly to how you made this post.
I have neither stated nor believed that I'm the only person capable of winning, nor do I think this is some exceptionally rare trait. I agree that a significant number of people would be capable of winning once in a while, given sufficient experience in games, effort, and forethought. If I gave any impression of arrogance, or of somehow claiming to be unique or special in some way, I apologize for that impression. Sorry. It was never my goal to give it.
This was my fault, not yours. I did not take any of those negative impressions away from your writing, but I was just too lazy / exhausted last night to rewrite my comment again. I’ve now edited it.
Are you? I’d be highly curious to converse with that player.
I’ll PM you regarding this as soon as I can get around to it.
Anybody who still isn’t taking this experiment seriously should start listening for that tiny note of discord. A good start would be reading:
Coherent Extrapolated Volition
Cognitive Biases Potentially Affecting Judgment of Global Risks
Artificial Intelligence as a Positive and Negative Factor in Global Risk
Thank you. I’ll see if I can win again.