Thomas Broadley comments on Reproducing ARC Evals’ recent report on language model agents

Thomas Broadley 22 Jan 2024 15:59 UTC
3 points
I neglected to update my comment here. The agent I built for this replication is now publicly available as part of the METR task workbench: https://drive.google.com/drive/folders/1-m1y0_Akunqq5AWcFoEH2_-BeKwsodPf