Martin Vlach comments on Open Thread—Summer 2025

Martin Vlach 29 Jun 2025 22:06 UTC
1 point
What’s your view on sceptical claims about RL on transformer LMs, such as https://arxiv.org/abs/2504.13837v2, or the claim that CoT instruction yields better results than <thinking> training?