Which GPT? The paper mentioned that GPT-5{,-mini,-nano} have only a ~5% success rate. I tried it with o3 and got 2/3.
Not sure, I’m just using the OpenAI website interface; it doesn’t list the exact version.