Here is a concrete prediction: Claude 3.5 Sonnet, GPT-4o, and Gemini 2 Pro will be able to understand the topics considered in 99%+ of the Chains-of-Thought of all regular Transformers trained in 2025 and 2026 that were not deliberately trained or prompted to be harder to understand.
This is something testable.
There might be empirical laws there, if we can check how much of the performance gains come from improved reading of CoT and how much from improved writing of CoT.
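One way the "99%+" prediction could be checked is sketched below. It assumes we already have, for a random sample of CoTs, a yes/no label saying whether the reader model identified the topic correctly. How those labels are produced (human grading, a rubric, etc.) is not specified here, and the function names, the 95% confidence level, and the use of a Wilson score interval are all illustrative choices rather than part of the prediction itself.

```python
import math


def wilson_lower_bound(successes: int, total: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a two-sided 95% interval (an assumed default).
    """
    if total <= 0:
        raise ValueError("need at least one labeled CoT")
    p_hat = successes / total
    denom = 1 + z**2 / total
    centre = p_hat + z**2 / (2 * total)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total + z**2 / (4 * total**2))
    return (centre - margin) / denom


def prediction_holds(understood: list[bool], threshold: float = 0.99) -> bool:
    """True if the sample supports 'at least `threshold` of CoTs are understood'.

    `understood[i]` is a hypothetical label: did the reader model correctly
    identify the topic of the i-th sampled CoT?
    """
    return wilson_lower_bound(sum(understood), len(understood)) >= threshold
```

Under these assumptions, 100 understood CoTs out of 100 is not enough to support a 99% claim (the lower bound is about 0.96), while 1000 out of 1000 is. The prediction therefore needs a sample of at least several hundred CoTs to be confirmed at this confidence level.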