On the other hand, there does seem to be something funny about how GPT-3 presents this shiny surface where you can send it any query and it gives you an answer, while under the hood a bunch of freelancers are busily checking the responses and rewriting them to make the computer look smart.
It’s kinda like someone showing off a fancy car engine while the vehicle is actually being powered by hidden hamster wheels. The organization of the process is itself impressive, but it’s not quite what is advertised.
To be fair, OpenAI does state that “InstructGPT is then further fine-tuned on a dataset labeled by human labelers.” But this still seems misleading to me. It’s not just that the algorithm is fine-tuned on the dataset. It seems that these freelancers are being hired specifically to rewrite the output.
I don’t really think it’s misleading. It would be different if, on the first time that you submitted the inputs, they were relayed to a human who pretended to be an AI. But what’s actually happening is that the AI does generate the responses, and then if humans notice that it produces bad responses, they train the AI to produce better responses in the future.
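To make that concrete, here is the rough shape of the loop I have in mind, as a toy Python sketch. All of the method names (model.generate, labelers.rewrite, model.fine_tune) are hypothetical stand-ins, not OpenAI’s actual pipeline:

```python
# Toy sketch of the loop described above. Every method name here is a
# hypothetical stand-in, not OpenAI's actual API or training pipeline.

def feedback_round(model, prompts, labelers):
    """The model answers first; humans rewrite the bad answers afterwards;
    the model is then fine-tuned on the corrected pairs so that *future*
    answers improve."""
    corrections = []
    for prompt in prompts:
        response = model.generate(prompt)         # the AI produces the answer
        if labelers.looks_bad(prompt, response):  # a human flags it afterwards
            corrections.append((prompt, labelers.rewrite(prompt, response)))
    if corrections:
        model.fine_tune(corrections)              # training happens offline, later
    return model
```

The point being: at the moment you query it, the answer you get really is the model’s own; the human corrections only show up in later versions of the weights.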
Suppose you were talking to a young child, who produced an incorrect answer to some question. Afterwards they’d talk to their parent who explained the right answer. The next day when you asked them the same question, they’d answer correctly.
Would you say that there’s trickery involved, in that you are not really talking to the child? You could say that, kinda, since in a sense it’s true that the correct answer actually came from the parent. But then almost everything that a young child knows gets soaked up from the people around them, so if you applied this argument then you might as well argue that the child doesn’t know anything in the first place and that you’re always talking to an adult or an older child through them. This seems pretty directly analogous to a language model, which has also soaked up all of its knowledge from humans.
It is definitely misleading, in the same sense that the performance of a model on the training data is misleading. The interesting question w.r.t. GPT-3 is “how well does it perform in novel settings?”. And we can’t really know that, because apparently even publicly available interfaces are inside the training loop.
Now, there’s nothing wrong with training an AI like that! But the results then need to be interpreted with more care.
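As a sketch of what I mean (the grading function and prompt sets are just assumed to exist), the honest comparison is between prompts that may have flowed back into fine-tuning and prompts the model has never been corrected on:

```python
# Sketch only: "grade" and the prompt sets are assumed inputs. The point is
# which of the two numbers actually speaks to performance in novel settings.

def average_score(model, prompts, grade):
    """Mean quality of the model's answers over a set of prompts."""
    return sum(grade(p, model.generate(p)) for p in prompts) / len(prompts)

def contamination_report(model, public_prompts, fresh_prompts, grade):
    # Prompts sent through the public interface may have been flagged and
    # rewritten by labelers, so scores on them behave like training-set scores.
    return {
        "possibly_inside_training_loop": average_score(model, public_prompts, grade),
        # Only prompts that never touched the interface tell you about novel settings.
        "genuinely_novel": average_score(model, fresh_prompts, grade),
    }
```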
P.S.: sometimes children do parrot their parents to an alarming degree, e.g., about political positions they couldn’t possibly have the context to truly understand.
The interesting question w.r.t. GPT-3 is “how well does it perform in novel settings?”. And we can’t really know that, because apparently even publicly available interfaces are inside the training loop.
OpenAI still lets you use older versions of GPT-3, if you want to experiment with ones that haven’t had additional training.
P.S.: sometimes children do parrot their parents to an alarming degree, e.g., about political positions they couldn’t possibly have the context to truly understand.
It’s much better for children to parrot the political positions of their parents than to select randomly from the total space of political opinions. The vast majority of possible-political-opinion-space is unaligned.
If they’re randomly picking from a list of possible political positions, I’d agree. However, I suspect that is not the realistic alternative to parroting their parents’ political positions.
Maybe ideally it’d be rational reflection, to the best of their ability, on values and whatnot. However, if we had a switch to turn off parroting-parents-political-positions we’d be in a weird space...children wouldn’t even know about most political positions to choose from.
Right, but we wouldn’t then use this as proof that our children are precocious politicians!
In this discussion, we need to keep separate the goals of making GPT-3 as useful a tool as possible, and of investigating what GPT-3 tells us about AI timelines.
Suppose you were talking to a young child, who produced an incorrect answer to some question. Afterwards they’d talk to their parent who explained the right answer. The next day when you asked them the same question, they’d answer correctly.
I think it depends on the generalization. If you ask the exact same question, you cannot test whether the child was able to acquire the prior knowledge needed to understand why the answer is correct. Also, reformulating the question might not suffice, because we know that an algorithm like GPT-3 is already very good at that task. I think you would need to create a different question that requires similar prior knowledge to be answered.
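Something like three tiers of probes, in other words (the questions below are just made-up illustrations, and the model interface is assumed):

```python
# Made-up illustration of the three tiers; only the last one really tests
# whether the underlying knowledge was acquired.
PROBES = {
    "exact":      "Why does ice float on water?",                     # memorization
    "paraphrase": "What makes ice stay on top of liquid water?",      # easy for GPT-3
    "transfer":   "Why do lakes freeze from the surface downwards?",  # same prior knowledge, new question
}

def probe(model):
    return {tier: model.generate(question) for tier, question in PROBES.items()}
```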
It doesn’t follow that a subset of well-known political opinions is aligned, even with itself.