Hi Karl, that’s a great suggestion, and one that I’d already tested. Apologies: I allude to it in my methods but don’t expand on it in the results.
I get the LLMs to rank the questions by perceived difficulty, or by their confidence in answering.
The LLMs do selectively choose the most difficult questions to skip.
I also compared each LLM’s ranking against those of all the other LLMs, to further check that the LLMs weren’t assessing difficulty at random. With minor variations, there is a consensus on which questions are the most and least difficult.
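One way a cross-model consensus check like this could be sketched is with pairwise Spearman rank correlation. This is an illustrative assumption, not necessarily the comparison used here; the model names, question IDs, and ranks below are hypothetical, and the formula assumes each ranking has no ties.

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman correlation between two tie-free rankings of the same questions.

    Each argument maps a question ID to its rank (1 = easiest, n = hardest).
    """
    if rank_a.keys() != rank_b.keys():
        raise ValueError("both rankings must cover the same questions")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two questions")
    # Sum of squared rank differences across all questions.
    d_squared = sum((rank_a[q] - rank_b[q]) ** 2 for q in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


def pairwise_agreement(rankings):
    """Return the Spearman correlation for every pair of models."""
    return {
        (a, b): spearman_rho(rankings[a], rankings[b])
        for a, b in combinations(sorted(rankings), 2)
    }


# Hypothetical difficulty rankings, made up purely for illustration.
rankings = {
    "model_a": {"q1": 1, "q2": 2, "q3": 3, "q4": 4},
    "model_b": {"q1": 1, "q2": 3, "q3": 2, "q4": 4},
    "model_c": {"q1": 2, "q2": 1, "q3": 3, "q4": 4},
}

agreement = pairwise_agreement(rankings)
for pair, rho in agreement.items():
    print(pair, round(rho, 2))
```

Values near 1 would indicate that two models order the questions similarly, values near 0 would be consistent with unrelated orderings, and values near -1 would indicate opposite orderings.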