I am probably missing something, but when talking about “accuracy”, how did you measure true and false negatives (thinking and not thinking about evals when not in an eval)?