sanxiyn comments on OpenAI lied about SFT vs. RLHF

sanxiyn 10 Feb 2025 4:40 UTC
4 points
6
I think if you weren’t carefully reading OpenAI’s documentation it was pretty easy to believe that text-davinci-002 was InstructGPT (and hence trained with RLHF).
Not only was it easy, many people in fact did believe it (including myself). Can you point to a single case of people NOT making this reading mistake? That is, after the January 2022 instruction-following announcement, but before the October 2022 model index for researchers. Jan Leike’s tweet you linked to postdates October 2022 and does not count. The allegation is that OpenAI lied (or at the very least was extremely misleading) for ten months of 2022. I am more ambivalent about the period after October 2022.