anaguma comments on
Training Qwen-1.5B with a CoT legibility penalty
anaguma
9 Oct 2025 22:44 UTC · 5 points
I would predict that a 1.5B model is 1-3 OOMs too small to develop illegible but useful CoTs.