I have signed up to play an AI, and having given it quite a bit of thought as a result, I think I have achieved some insight. Interestingly, one of the insights came from assuming that secrecy was a necessary condition for success. That assumption led more or less directly to an approach that I think might work. I’ll let you know tomorrow.
An interesting consequence of having arrived at this insight is that even if it works I won’t be able to tell you what it is. Having been on the receiving end of such cageyness, I know how annoying it is. But I can tell you this: the insight has a property similar to a Gödel sentence or the Epimenides sentence. This insight (if indeed it works) undermines itself by being communicated. If I tell you what it is, you can respond, “That will never work.” And you will indeed be correct. Nonetheless, I think it has a good shot at working.
(I don’t know if my insight is the same as Eliezer’s, but it seems to share another interesting property: it will not be easy to put it into practice. It’s not just a “trick.” It will be difficult.)
I’ll let you know how it goes.
With regard to the AI-box experiment: I defy the data. :-)
Your reason for the insistence on secrecy (that you have to resort to techniques that you consider unethical and therefore do not want to have committed to the record) rings hollow. The sense of mystery that you have now built up around this anecdote is itself unethical by scientific standards. With no evidence that you won other than the test subject’s statement, we cannot know that you did not simply conspire with them to make such a statement. The history of pseudo-science is lousy with hoaxes.
In other words, if I were playing the game, I would say to the test subject:
“Look, we both know this is fake. I’ve just sent you $500 via PayPal. If you say you let me out, I’ll send you another $500.”
From a strictly Bayesian point of view, that seems to me to be the overwhelmingly more probable explanation.
There’s a reason that secret experimental protocols are anathema to science.