Anecdotal, but GPT-5 (mini, I guess? free plan with no thinking) is the first model to succeed at a poetry-based prompt I’ve tested on a lot of models.
I don’t want to share the prompt publicly, but it involves a fairly complex rhyme scheme and meter. Every other model misunderstood it entirely, but GPT-5 got it straight away.
Interestingly, when thinking mode kicked in after a few prompts, it performed a lot worse.