Lawrence Phillips comments on A Guide For LLM-Assisted Web Research

Lawrence Phillips 27 Jun 2025 11:08 UTC
3 points
Good question. We don’t explicitly break this out in our analysis, but we do give models the chance to give up, and some of our instances actually require them to give up for numbers that can’t be found.

Anyway, from eyeballing results and traces, I get the sense that 70-80% of failures on the find number task are incorrect assertions rather than refusals to answer.
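
For anyone who wants to turn that eyeballed estimate into an explicit breakdown, here is a minimal sketch of how failures could be split into incorrect assertions and refusals. It is not the analysis code behind the guide. The record fields `answer` and `truth`, the use of `None` to mean "gave up" or "cannot be found", and the function name `failure_breakdown` are all hypothetical, and exact equality stands in for whatever correctness check the real evaluation uses.

```python
from collections import Counter


def failure_breakdown(results):
    """Split failed instances into refusals and incorrect assertions.

    Each record is a dict with two hypothetical keys:
      'answer': the model's number, or None if it gave up
      'truth':  the correct number, or None if it cannot be found
    Returns the share of failures in each category.
    """
    counts = Counter()
    for record in results:
        answer, truth = record["answer"], record["truth"]
        if answer == truth:
            # Correct, including correctly giving up on an unfindable number.
            continue
        if answer is None:
            # The number was findable but the model gave up.
            counts["refusal"] += 1
        else:
            # The model asserted a number that was wrong, or asserted one
            # when it should have given up.
            counts["incorrect_assertion"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}
```

Under these assumptions, a model that asserts a number for an instance where it should have given up counts as an incorrect assertion, which seems to match the distinction drawn above.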