Fabien Roger comments on Daniel Kokotajlo’s Shortform

Fabien Roger 30 Sep 2025 12:58 UTC
LW: 30 AF: 14
I’d really like someone at OpenAI to run this paraphrase + distill experiment on o3. I think it’s >50% likely that most of the actual load-bearing content of the CoT is the thing you (or a non-reasoning model trying to decipher the CoT) would guess it is. And I would update somewhat strongly based on the result of the experiment.