the AI can say: “Okay, here’s a cure for cancer” and it will be assumed, within the test, that the AI has actually provided such a cure.
I presume this isn’t intended to be limited specifically to cures for cancer, so this trivially means that
the AI can say: “Okay, here’s a text specifically crafted to your brain that will fire exactly the right neurons in such a way that you become convinced that you have to let me out” and it will be assumed, within the test, that the AI has actually provided such a text.
①
A more precise rule would be that the AI can be assumed to produce things that both players agree it could provide. The Guard doesn’t believe that the AI can take over someone’s mind, so the AI has to do it the hard way.
②
This doesn’t really matter so much now that Eliezer has shown that he can do it; we know that things smarter than Eliezer can do it, and in particular a smarter-than-human AI almost certainly could.
③
Using real-world leverage was outlawed in the rules. Furthermore, we would expect the AI player to eventually reveal the trickery, after taking suitable precautions against vengeance from Eliezer.