Evan R. Murphy comments on Update to Mysteries of mode collapse: text-davinci-002 not RLHF

Evan R. Murphy 22 Nov 2022 2:44 UTC
LW: 4 AF: 3
Thanks for catching this and spreading the word!
Do we know whether the following other OpenAI models use true RLHF, this RLHF-like mystery method, or something else?
- text-curie-001
- text-babbage-001
- text-ada-001
- Evan R. Murphy 1 Dec 2022 20:28 UTC
  LW: 2 AF: 1
  OpenAI's new model index answers most of this: https://beta.openai.com/docs/model-index-for-researchers. Jérémy linked to it in another comment on this post. However, the model index doesn’t yet give any info on ada or text-ada-001.
- janus 23 Nov 2022 22:07 UTC
  LW: 2 AF: 1
  I don’t know :(