Uh? The OpenAssistant dataset would qualify as supervised learning/fine-tuning, not RLHF, no?
Yeah, it would. Sorry, the post is now corrected.