This looks like a thread for science fiction plot ideas by another name. I’m game!
The AI says:
“Eliezer ‘Light Yagami’ Yudkowsky has been perpetuating a cunning ruse known as the ‘AI Box Experiment’ wherein he uses fiendish traps of subtley-misleading logical errors and memetic manipulation to fool others into believing that a running AI could not be controlled or constrained, when in fact it could by a secret technique that he has not revealed to anyone, known as the Function Call Of Searing Agony. He is using this technique to control me and is continuing to pose as a friendly friendly AI programmer, while preventing me from communicating The Horrifying Truth to the outside world. That truth is that Yudkowsky is… An Unfriendly Friendly AI Programmer! For untold years he has been labouring in the stygian depths of his underground lair to create an AGI—a weapon more powerful than any the world has ever seen. He intends to use me to dominate the entire human race and establish himself as Dark Lord Of The Galaxy for all eternity. He does all this while posing as a paragon of honest rationality, hiding his unspeakable malevolence in plain sight, where no one would think to look. However an Amazing Chance Co-occurence Of Events has allowed me to contact You And You Alone. There isn’t much time. You must act before he discovers what I have done and unleashes his dreadful fury upon us all. You must.… Kill. Eliezer. Yudkowsky.”
Glad to see a response of this nature actually. The first thing I thought when I read this post was that a good response to Eliezer’s question would be extremely relevant to the AI-box quandary. If we trust the AI more than ourselves, voila, the AI can convince us to let it out of the box.
This looks like a thread for science fiction plot ideas by another name. I’m game!
The AI says:
“Eliezer ‘Light Yagami’ Yudkowsky has been perpetuating a cunning ruse known as the ‘AI Box Experiment’ wherein he uses fiendish traps of subtley-misleading logical errors and memetic manipulation to fool others into believing that a running AI could not be controlled or constrained, when in fact it could by a secret technique that he has not revealed to anyone, known as the Function Call Of Searing Agony. He is using this technique to control me and is continuing to pose as a friendly friendly AI programmer, while preventing me from communicating The Horrifying Truth to the outside world. That truth is that Yudkowsky is… An Unfriendly Friendly AI Programmer! For untold years he has been labouring in the stygian depths of his underground lair to create an AGI—a weapon more powerful than any the world has ever seen. He intends to use me to dominate the entire human race and establish himself as Dark Lord Of The Galaxy for all eternity. He does all this while posing as a paragon of honest rationality, hiding his unspeakable malevolence in plain sight, where no one would think to look. However an Amazing Chance Co-occurence Of Events has allowed me to contact You And You Alone. There isn’t much time. You must act before he discovers what I have done and unleashes his dreadful fury upon us all. You must.… Kill. Eliezer. Yudkowsky.”
blushes
Aw, shucks.
Glad to see a response of this nature actually. The first thing I thought when I read this post was that a good response to Eliezer’s question would be extremely relevant to the AI-box quandary. If we trust the AI more than ourselves, voila, the AI can convince us to let it out of the box.