Yeah there haven’t been any improvements that significantly changed how capable a model is on a hard task I need solved for like, at least a week, maybe more /j
I’m out of the loop, can you point to an example please?
/j was because I haven’t really kept track of how long it’s been. Gemini 2.5 Pro was the last one I was somewhat impressed by. Now, to be clear, it’s still flaky and still an LLM, still an incremental improvement, but it’s noticeably stronger on certain kinds of math and programming tasks. It’s still mostly relevant when you want speed and some slop is OK.