I’ve updated away from the CoT hopes because of the recurrent architecture paper. I’d now say it’s probably 45-65% at best, and I expect that number to keep dropping predictably. The reason I’m not yet updating all the way is that compute constraints, combined with people maybe choosing not to use these architectures, mean I can’t just skip to the end:
https://arxiv.org/abs/2502.05171