METR have released Time Horizons 1.1

Link post

I just found out that METR released an updated version of their time horizons work, with extra tasks and different evaluation infrastructure. It came out on 29th Jan, and I think it has been overshadowed by the Moltbook stuff.

Main points:

  • Similar overall trend since 2021

  • Over the period since 2023, the doubling time of the 50% time horizon went from 165 days with 1.0 to 131 days with 1.1

  • The top model, Claude Opus 4.5, has gone from a 4h49 time horizon to 5h20
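
To get a feel for what the change in doubling time means, here is a rough sketch that converts the two doubling times quoted above into a per-year growth factor, and works out the relative change in the top model's time horizon. It assumes a clean exponential trend with a constant doubling time, which is my simplification and not something METR claims; the helper names are made up for illustration.

```python
def annual_growth_factor(doubling_days: float, days_per_year: float = 365.0) -> float:
    """Growth factor over one year, assuming a constant doubling time."""
    return 2.0 ** (days_per_year / doubling_days)


def to_minutes(hours: int, minutes: int) -> int:
    """Convert an 'XhYY' style duration into minutes."""
    return hours * 60 + minutes


# Doubling times quoted in the post (period since 2023).
factor_v1_0 = annual_growth_factor(165)  # roughly 4.6x per year
factor_v1_1 = annual_growth_factor(131)  # roughly 6.9x per year

# Time horizon of the top model under each version.
horizon_v1_0 = to_minutes(4, 49)  # 289 minutes
horizon_v1_1 = to_minutes(5, 20)  # 320 minutes
horizon_ratio = horizon_v1_1 / horizon_v1_0  # roughly 1.11, i.e. about 11% longer

print(f"Time Horizons 1.0: ~{factor_v1_0:.1f}x per year")
print(f"Time Horizons 1.1: ~{factor_v1_1:.1f}x per year")
print(f"Top model horizon changed by a factor of {horizon_ratio:.2f}")
```

So under that assumption, the shorter doubling time corresponds to the 50% time horizon growing by roughly 6.9x per year instead of roughly 4.6x, while the top model's own estimate only moved by about 11%.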
