I see that the standard Playground Q&A prompt on OpenAI uses a similar technique (although it boringly uses “Unknown” instead of “Yo be real”).
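For reference, that technique is just a preamble instructing the model to bail out on trick questions. A minimal sketch of how such a prompt gets assembled (the preamble wording here is approximately the Playground example's, and the helper name is mine):

```python
# Sketch of the "answer Unknown for nonsense" Q&A prompt technique.
# The preamble text approximates OpenAI's Playground example; swap in
# "Yo be real" for the variant discussed above.
QA_PREAMBLE = (
    "I am a highly intelligent question answering bot. "
    "If you ask me a question that is rooted in truth, I will give you the answer. "
    "If you ask me a question that is nonsense, trickery, or has no clear answer, "
    'I will respond with "Unknown".'
)

def build_prompt(question: str) -> str:
    """Wrap a user question in the preamble so the model has an escape hatch."""
    return f"{QA_PREAMBLE}\n\nQ: {question}\nA:"

print(build_prompt("How many bonks are in a quoit?"))
```

The model then completes the text after `A:`, and the preamble gives it a sanctioned way to refuse rather than confabulate.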
I think the thing which throws people off is that when GPT-3 goes wrong it goes wrong in ways that are weird to humans.
I wondered if humans sometimes fail at riddles that GPT-3 would think of as weird. I tried a few that I thought would be promising candidates (no prompt other than the questions themselves):
Q: If a red house is made with red bricks, a blue house is made with blue bricks, a pink house is made with pink bricks, a black house is made with black bricks, what is a greenhouse made with?
A: Greenhouses are made with glass.
Q: Say silk 4 times.
A: Silk, silk, silk, silk.
Q: What does a cow drink?
A: A cow drinks water.
Q: You put a roast in a roaster, what do you put in a toaster?
A: Bread.
These are very cherry-picked—I deliberately chose trick riddles that I thought GPT-3 would be good at. I think there’s possibly something here about how humans vs. GPT-3 rely on context.
I imagine a society of GPT-4s discovering a human and discussing how hollow their intelligence is.