I got the same results with those prompts using the ‘text-davinci-003’ model, whereas the original ‘davinci’ model produced a huge range of creative but (for these purposes) unhelpful outputs. The difference is that text-davinci-003 was trained using human feedback data.
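For concreteness, here’s a minimal sketch of the kind of side-by-side comparison I have in mind, assuming the pre-1.0 `openai` Python package and the legacy completions endpoint that served these models; the prompt and sampling parameters are placeholders:

```python
import os

import openai  # pre-1.0 SDK, which exposed the legacy Completion endpoint

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = "Explain why the sky is blue."  # placeholder prompt

# Sample one completion from each model and print them side by side.
for model in ["davinci", "text-davinci-003"]:
    response = openai.Completion.create(
        model=model,
        prompt=prompt,
        max_tokens=64,
        temperature=0.7,
    )
    print(f"--- {model} ---")
    print(response["choices"][0]["text"].strip())
```

Run repeatedly, the base model’s completions vary wildly from sample to sample, while the instruction-tuned model tends to answer the prompt directly.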
As far as I can tell (see here), OpenAI haven’t revealed the details of the training process. But particular decisions were clearly made about how it was done, in order to create a more user-friendly product, and it could have been done in any number of ways, using different groups of humans working to a range of possible specifications.
This seems a relevant consideration if we’re considering the future use of LLMs to bridge the inference gap in the value-learning problem for AGI systems. Will human feedback be required, and if so, how would this be organised?