On the other hand, there does seem to be something funny about how GPT-3 presents this shiny surface where you can send it any query and it gives you an answer, while under the hood a bunch of freelancers are busily checking the responses and rewriting them to make the computer look smart.
It’s kinda like someone showing off a fancy car engine while the vehicle is actually being powered by hidden hamster wheels. The organization of the process is itself impressive, but it’s not quite what is advertised.
To be fair, OpenAI does state that “InstructGPT is then further fine-tuned on a dataset labeled by human labelers.” But this still seems misleading to me. It’s not just that the algorithm is fine-tuned on the dataset. It seems that these freelancers are being hired specifically to rewrite the output.
I don’t really think it’s misleading. It would be different if, on the first time that you submitted the inputs, they were relayed to a human who pretended to be an AI. But what’s actually happening is that the AI does generate the responses, and then if humans notice that it produces bad responses, they train the AI to produce better responses in the future.
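To make that concrete, here is the rough shape of the loop I have in mind, as a toy Python sketch. All of the method names (model.generate, labelers.rewrite, model.fine_tune) are hypothetical stand-ins, not OpenAI’s actual pipeline:

```python
# Toy sketch of the loop described above. Every method name here is a
# hypothetical stand-in, not OpenAI's actual API or training pipeline.

def feedback_round(model, prompts, labelers):
    """The model answers first; humans rewrite the bad answers afterwards;
    the model is then fine-tuned on the corrected pairs so that *future*
    answers improve."""
    corrections = []
    for prompt in prompts:
        response = model.generate(prompt)         # the AI produces the answer
        if labelers.looks_bad(prompt, response):  # a human flags it afterwards
            corrections.append((prompt, labelers.rewrite(prompt, response)))
    if corrections:
        model.fine_tune(corrections)              # training happens offline, later
    return model
```

The point being: at the moment you query it, the answer you get really is the model’s own; the human corrections only show up in later versions of the weights.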
Suppose you were talking to a young child, who produced an incorrect answer to some question. Afterwards they’d talk to their parent who explained the right answer. The next day when you asked them the same question, they’d answer correctly.
Would you say that there’s trickery involved, in that you are not really talking to the child? You could say that, kinda, since in a sense it’s true that the correct answer actually came from the parent. But then almost everything that a young child knows gets soaked up from the people around them, so if you applied this argument then you might as well argue that the child doesn’t know anything in the first place and that you’re always talking to an adult or an older child through them. This seems pretty directly analogous to a language model, which has also soaked up all of its knowledge from humans.
It is definitely misleading, in the same sense that the performance of a model on the training data is misleading. The interesting question w.r.t. GPT-3 is “how well does it perform in novel settings?”. And we can’t really know that, because apparently even publicly available interfaces are inside the training loop.
Now, there’s nothing wrong with training an AI like that! But the results then need to be interpreted with more care.
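As a sketch of what I mean (the grading function and prompt sets are just assumed to exist), the honest comparison is between prompts that may have flowed back into fine-tuning and prompts the model has never been corrected on:

```python
# Sketch only: "grade" and the prompt sets are assumed inputs. The point is
# which of the two numbers actually speaks to performance in novel settings.

def average_score(model, prompts, grade):
    """Mean quality of the model's answers over a set of prompts."""
    return sum(grade(p, model.generate(p)) for p in prompts) / len(prompts)

def contamination_report(model, public_prompts, fresh_prompts, grade):
    # Prompts sent through the public interface may have been flagged and
    # rewritten by labelers, so scores on them behave like training-set scores.
    return {
        "possibly_inside_training_loop": average_score(model, public_prompts, grade),
        # Only prompts that never touched the interface tell you about novel settings.
        "genuinely_novel": average_score(model, fresh_prompts, grade),
    }
```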
P.S.: sometimes children do parrot their parents to an alarming degree, e.g., about political positions they couldn’t possibly have the context to truly understand.
The interesting question w.r.t. GPT-3 is “how well does it perform in novel settings?”. And we can’t really know that, because apparently even publicly available interfaces are inside the training loop.
OpenAI still lets you use older versions of GPT-3, if you want to experiment with ones that haven’t had additional training.
P.S.: sometimes children do parrot their parents to an alarming degree, e.g., about political positions they couldn’t possibly have the context to truly understand.
It’s much better for children to parrot the political positions of their parents than to select randomly from the total space of political opinions. The vast majority of possible-political-opinion-space is unaligned.
If they’re randomly picking from a list of possible political positions, I’d agree. However, I suspect that is not the realistic alternative to parroting their parents’ political positions.
Maybe ideally it’d be rational reflection, to the best of their ability, on values and whatnot. However, if we had a switch to turn off parroting-parents-political-positions we’d be in a weird space...children wouldn’t even know about most political positions to choose from.
Right, but we wouldn’t then use this as proof that our children are precocious politicians!
In this discussion, we need to keep separate the goals of making GPT-3 as useful a tool as possible, and of investigating what GPT-3 tells us about AI timelines.
Suppose you were talking to a young child, who produced an incorrect answer to some question. Afterwards they’d talk to their parent who explained the right answer. The next day when you asked them the same question, they’d answer correctly.
I think it depends on the generalization. If you ask the exact same question, you cannot test whether the child was able to acquire the prior knowledge needed to understand why the answer is correct. Also, reformulating the question might not suffice, because we know that an algorithm like GPT-3 is already very good at that task. I think you would need to create a different question that requires similar prior knowledge to be answered.
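Something like three tiers of probes, in other words (the questions below are just made-up illustrations, and the model interface is assumed):

```python
# Made-up illustration of the three tiers; only the last one really tests
# whether the underlying knowledge was acquired.
PROBES = {
    "exact":      "Why does ice float on water?",                     # memorization
    "paraphrase": "What makes ice stay on top of liquid water?",      # easy for GPT-3
    "transfer":   "Why do lakes freeze from the surface downwards?",  # same prior knowledge, new question
}

def probe(model):
    return {tier: model.generate(question) for tier, question in PROBES.items()}
```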
It doesn’t follow that a subset of well-known political opinions is aligned, even with itself.