It’s better than 4o across four of my benchmarks: Confabulations, Creative Writing, Thematic Generalization, and Extended NYT Connections. However, since it’s an expensive and huge model, I think we’d be talking about AI progress slowing down at this point if it weren’t for reasoning models.
Possibly, but 1) There are reasoning models, 2) Value per token may still rise faster than cost per token for non-reasoning models, which could be enough to sustain progress, and 3) It’s possible that a more expensive non-reasoning model makes reasoning more efficient and/or effective by increasing the quality and complexity of each reasoning step.
At this point I pretty much never use 4o for anything. It’s o1, o1-pro, or o3-mini-high. Looking forward to testing 4.5 though.