I don’t get what experiment you are thinking about (most CoT end with the final answer, such that the summarized CoT often ends with the original final answer).
Hm, yeah, I didn’t really think that through. How about giving a model a fraction of either its own precomputed chain of thought, or the summarized version, and plotting curves of accuracy and additional tokens used vs. the % of CoT given to it? (To avoid systematic error from summaries moving information around, doing this with a chunked version and comparing at each chunk boundary seems like a good idea.)
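To make the proposal concrete, here's a minimal sketch of the loop I have in mind. `call_model` is a hypothetical interface for resuming generation from a prefilled CoT prefix (not a real API), and the sentence-based chunking is just one simple choice:

```python
def chunk_cot(cot: str, n_chunks: int) -> list[str]:
    """Split a chain of thought into roughly equal chunks by sentence."""
    sentences = [s for s in cot.split(". ") if s]
    size = max(1, len(sentences) // n_chunks)
    return [". ".join(sentences[i:i + size])
            for i in range(0, len(sentences), size)]

def run_curve(question: str, cot: str, n_chunks: int, call_model):
    """For each prefix of k chunks, resume generation from that prefix and
    record (fraction of CoT given, answer, extra tokens generated).

    call_model(question, prefix) -> (answer, extra_tokens) is a stub for
    whatever inference setup is actually used."""
    chunks = chunk_cot(cot, n_chunks)
    results = []
    for k in range(len(chunks) + 1):
        prefix = " ".join(chunks[:k])
        answer, extra_tokens = call_model(question, prefix)
        results.append((k / len(chunks), answer, extra_tokens))
    return results
```

Running this once with the original CoT and once with the summarized version (chunk-aligned) would give the two accuracy/extra-token curves to compare.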
Anyhow, thanks for the reply. I have now seen the last figure.