Beth Barnes comments on Help ARC evaluate capabilities of current language models (still need people)