My friend's (M.K., he's on GitHub) honorable aim is to establish a term in the AI evals field: cognitive asymmetry, the generation-verification complexity gap for model-as-judge evals.
What we want are tasks with a clear intelligence-to-solve vs. intelligence-to-verify-a-solution gap, i.e. only X00-B LMs have a shot at solving them, but an X-B model is already strong at verifying. I hope it fits nicely into the incremental, iterative alignment scaling playbook.
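To make the idea concrete, here is a minimal sketch of what such an eval harness could look like. The names (`run_eval`, `toy_generate`, `toy_verify`) and the factoring stand-in task are my own illustration, not an established benchmark or M.K.'s actual setup: the generator stands in for the large solver model, the verifier for the much smaller judge model.

```python
# Sketch of a generate-then-verify eval loop for the generation-verification gap.
from typing import Callable, Iterable

def run_eval(
    generate: Callable[[str], str],      # "solver" role: the hard direction
    verify: Callable[[str, str], bool],  # "judge" role: the easy direction
    tasks: Iterable[str],
) -> float:
    """Return the fraction of generated solutions the judge accepts."""
    results = [verify(task, generate(task)) for task in tasks]
    return sum(results) / len(results)

# Toy stand-ins: factoring a semiprime is generation-hard, while checking the
# product is verification-cheap -- the same asymmetry the term is about.
def toy_generate(task: str) -> str:
    n = int(task)
    d = next(d for d in range(2, n) if n % d == 0)  # brute-force "solver"
    return f"{d}*{n // d}"

def toy_verify(task: str, answer: str) -> bool:
    a, b = (int(x) for x in answer.split("*"))
    return a > 1 and b > 1 and a * b == int(task)

if __name__ == "__main__":
    print(run_eval(toy_generate, toy_verify, ["91", "221", "8633"]))  # -> 1.0
```

In the real setting, `generate` would wrap the X00-B model and `verify` the small X-B judge; the eval is interesting precisely when the judge's accept/reject decisions stay reliable even though it could never produce the solutions itself.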