Good question. We don’t explicitly break this out in our analysis, but we do give models the chance to give up, and some of our instances actually require them to give up for numbers that can’t be found.
Anyway, from eyeballing results and traces, I get the sense that 70-80% of failures on the find number task are incorrect assertions rather than refusals to answer.
Good question. We don’t explicitly break this out in our analysis, but we do give models the chance to give up, and some of our instances actually require them to give up for numbers that can’t be found.
Anyway, from eyeballing results and traces, I get the sense that 70-80% of failures on the find number task are incorrect assertions rather than refusals to answer.