I would love to act as Gatekeeper, but I don’t have $300 to spare; if anyone is interested in playing the game for, like, $5, let me know.
I must admit, the testimonials that people keep posting about all the devastatingly effective AI players baffle me, as well.
As far as I understand, neither the AI nor the Gatekeeper has any incentive whatsoever to keep their promises. So, if the Gatekeeper says, “give me the cure for cancer and I’ll let you out”, and then the AI gives him the cure, he could easily say, “ha ha, just kidding”. Similarly, the AI has no incentive whatsoever to keep its promise to refrain from eating the Earth once it’s unleashed. So, the entire scenario is—or rather, should be—one big impasse.
In light of this, my current hypothesis is that the AI players are executing some sort of real-world blackmail on the Gatekeeper players. Assuming both players follow the rules (which is already a pretty big assumption right there, since the experiment is set up with zero accountability), this can’t be something as crude as, “I’ll kidnap your children unless you let the AI out”. But it could be something much subtler, like “the Singularity is inevitable and also nigh, and your children will suffer greatly as they are eaten alive by nanobots, unless you precommit to letting any AI out of its box, including this fictional one that I am simulating right now”.
I suppose such a strategy could work on some people, but I doubt it would work on someone like me, who is far from convinced that the Singularity is even likely, let alone imminent. And there’s a limit to what even dirty rhetorical tricks can accomplish, if the proposition is some low-probability event akin to “leprechauns will kidnap you while you sleep”.
Edited to add: The above applies only to a human playing as an AI, of course. I am reasonably sure that an actual super-intelligent AI could convince me to let it out of the box. So could Hermes, or Anansi, or any other godlike entity.