Thanks, yeah, fair point. I just tried them and would score them as follows:
Opus 4.6: 1
Opus 4.7: 2
GPT5.3 thinking mode: 3
So not a massive shift in the overall shape of the conclusions (I still got the best result, given what I was after, from Sonnet + Aiden), but GPT5.3 seemed genuinely insightful in a way I will need to explore further. The main reason they didn’t score higher was that they generally stuck with the goals as framed and suggested how to achieve them, rather than suggesting alternative goals.