I tried ChatGPT(-5.2-Thinking) on the original D&D.Sci challenge (which is tough, but not tricky) and it got an almost perfect answer, one point shy of optimal.
I also tried ChatGPT on the second D&D.Sci challenge (which is tricky, but not tough), and it completely failed (albeit in a sensible and conservative manner). Repeated prompts of “You’re missing something, please continue with the challenge” didn’t help.