p.b. comments on METR: Measuring AI Ability to Complete Long Tasks
p.b.
5 Apr 2025 20:27 UTC
4
points
0
Do you plan to evaluate new models in the same way and regularly update the graph?
p.b.
17 Apr 2025 7:16 UTC
6
points
0
Do you plan to evaluate new models in the same way and regularly update the graph?