Great to see work on unverbalized eval awareness! It’s straightforward and insightful.
I have two questions about the experimental setup:
How did you choose the classification thresholds for the logit monitor and the LLM judge?
In Figure 2 (right), what is the VEA pass@100?