Have you looked at samples of CoT of o1, o3, deepseek, etc. solving hard math problems?
Certainly (experimenting with r1’s CoTs right now, in fact). I agree that they’re not doing the brute-force stuff I mentioned; that was just me outlining a scenario in which a system “technically” clears the bar you’d outlined, yet I end up unmoved (I don’t want to end up goalpost-moving).
Though neither are they being “strategic” in the way I expect they’d need to be in order to productively use a billion-token CoT.
Anyhow, this is nice, because I do expect that probably something like this milestone will be reached before AGI
Yeah, I’m also glad to finally have something concrete-ish to watch out for. Thanks for prompting me!