I agree grammar can definitely go away under optimization pressure toward shorter reasoning (though I'd guess the cause isn’t just a length penalty, but something more explicitly destructive of grammar). When I said “by default” I was implicitly assuming no length penalties or similar pressures, but I should have spelled that out.
But presumably the existence of any reasoning models with good grammar refutes the idea that you can predict what a reasoning model’s thoughts will look like based purely on the optimization power applied to those thoughts during reasoning training.