I agree the difference between base and 002 is bigger than the difference between 002 and 003. The base model needs to be carefully coaxed into a scenario where plausible continuations of the prompt align with your intended output, and even then it’s very inclined to repeat stuff and degenerates quickly. By contrast, you can just tell 002 what to do, and it will usually at least try to do what you say.
Seems like you’re implying that davinci is the base model for 002 and 003. That’s not the case; davinci has one base model (GPT-3) and then 002 and 003 share a different base model (GPT-3.5).
Fair. I think the crucial question for Ajeya & Matthew’s discussion of “Why the hype now?” is exactly how much worse the non-RLHF models that had been available since at least last March (davinci, code-davinci-002, text-davinci-002) actually were than the RLHF models made available just recently (text-davinci-003 and ChatGPT’s underlying model). I stand by the opinion that, besides the new chat stuff, most of the improvement happened within the old cohort rather than between cohorts, so I attribute the recent hype to the convenient and free chat interface.