Samuel Albanie comments on METR: Measuring AI Ability to Complete Long Tasks
Samuel Albanie
16 Apr 2025 19:27 UTC
14
points
0
Resolved to YES, in light of METR’s o3 evals.
Zach Stein-Perlman
16 Apr 2025 19:38 UTC
4
points
3
wow