I’m impressed (you have to be) with the AI box experiments.
I am confused, and a little suspicious, that he did a round with Carl Shulman as gatekeeper in which Carl let him out, whereas two others did not. (If I've misremembered someone, please correct me.) I'm not sure exactly what about that feels suspicious to me, though...
The record of AI box experiments (those involving Eliezer) is as follows:
Experiment 1, vs Nathan Russell—AI win
Experiment 2, vs David McFadzean—AI win
Experiment 3, vs Carl Shulman—AI win
Experiment 4, vs Russell Wallace—GK win
Experiment 5, vs D. Alex—GK win
The last three experiments had much bigger outside cash stakes (more than two orders of magnitude bigger, I think). I suspect Russell and D. Alex were less indifferent to that money than I would be. In other words, the record shows that Eliezer acquitted himself well a few times at low stakes ($10, or more when the player is indifferent about the money), but failed at high stakes.
This suggests to me that as soon as people feel a bit of real fear, rather than just role-playing it, they become mostly immune to Eliezer's charms.
With an actual boxed AI, though, you probably want to let it out if it's Friendly. It's possibly the ultimate high-stakes gamble. Certainly you have more to be afraid of than in a low-stakes roleplay, but you also have a lot more to gain.