See Paul Christiano’s Thoughts on sharing information about language model capabilities (back when METR was ARC Evals).
See Paul Christiano’s Thoughts on sharing information about language model capabilities (back when METR was ARC Evals).