Caleb Biddulph comments on Why not train reasoning models with RLHF?

Caleb Biddulph 30 Jan 2025 21:12 UTC
1 point
Yeah, but there are probably other interesting takeaways.