Really interesting post, thanks. A couple of thoughts:
Moreover, in multi-agent environments, the time it takes for an agent to process information could be playing a catalytic role in shaping its persona.
This seems like an especially important point—multi-agent environments suddenly make clock time (in the form of latency and throughput) dramatically more relevant for models. I’ve also seen claims that at least some recent frontier models are being deliberately trained to have a better understanding of time and duration so that they can function in agentic coding environments.
By engaging in chain-of-thought, the LLM creates a temporary history for its current response. This allows it to “remember” its own logic from three sentences ago
This seems like it works only to the extent that the CoT is faithful, and as you've argued, it isn't always. Models (even those not scored on their CoT) certainly have an incentive to produce whatever CoT output makes them most likely to give correct answers, but that output may or may not be faithful, although I imagine it often is.