That’s a good point. The study says that it collected data “in late 2021”. Instruction-following GPT-3 became OpenAI’s default model in January 2022, though the same article also mentions that the models “have been in beta on the API for more than a year”. I don’t know whether Replika had used those beta models or not.
That said, even though instruct-GPTs were technically trained with RLHF, the nature of that RLHF was quite different (they weren’t even chat models, so they weren’t trained for anything like continuing an ongoing conversation).