Sounds like you simply assumed that saying you could disgust the gatekeeper would make them believe they would be disgusted. But the kind of reaction to disgust that could make a gatekeeper let the AI out needs to be instantiated to have impact.
Most people won’t get sad just imagining that something sad could happen. (Also, duh, calling out the bluff.)
In practice, if you had spent the time to find disgusting content and share, it would have been somewhat equivalent to torturing the gatekeeper, which in the extreme case might work on a significant fraction of the population, but it’s also kind of obvious that we could prevent that.
Sounds like you simply assumed that saying you could disgust the gatekeeper would make them believe they would be disgusted.
But the kind of reaction to disgust that could make a gatekeeper let the AI out needs to be instantiated to have impact.
Most people won’t get sad just imagining that something sad could happen. (Also, duh, calling out the bluff.)
In practice, if you had spent the time to find disgusting content and share, it would have been somewhat equivalent to torturing the gatekeeper, which in the extreme case might work on a significant fraction of the population, but it’s also kind of obvious that we could prevent that.