How far is each lab from the frontier?

The Epoch Capabilities Index (ECI) stitches together 37 benchmarks into a single capability scale. ECI is calibrated so Claude 3.5 Sonnet (June 2024) = 130 and GPT-5 (August 2025) = 150.
Since April 2024, frontier models have improved at ~15 ECI points/year (~1.25 points/month, R^2=0.94).[1] This steady rate lets us convert between ECI and time, e.g. a model with ECI 137.5 has capability equivalent to the frontier in February 2025.
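Since the trend is linear, converting between ECI and frontier-equivalent dates is simple arithmetic. A minimal sketch, assuming the ~1.25 points/month slope above and using the post's own example (ECI 137.5 ↔ February 2025) as the anchor point:

```python
# Assumptions from the post: the frontier gains ~1.25 ECI points/month,
# and ECI 137.5 corresponds to the frontier of February 2025.
POINTS_PER_MONTH = 1.25
ANCHOR_ECI = 137.5
ANCHOR_YEAR, ANCHOR_MONTH = 2025, 2  # February 2025

def eci_to_frontier_month(eci):
    """Map an ECI score to the (year, month) when the frontier had that score."""
    offset = round((eci - ANCHOR_ECI) / POINTS_PER_MONTH)  # months from anchor
    total = ANCHOR_YEAR * 12 + (ANCHOR_MONTH - 1) + offset  # absolute month count
    return total // 12, total % 12 + 1

print(eci_to_frontier_month(137.5))  # (2025, 2), matching the example above
```

The same function run in reverse (date to ECI) gives the frontier-equivalent score for any month, which is how the months-behind figures in the table below can be read off.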
For each lab, we track the minimum and maximum months behind the frontier. A negative value (*) means the lab was ahead of the trend line, i.e. their model exceeded what the linear frontier trend predicted for that date.
| Lab | Min | Max |
| --- | --- | --- |
| OpenAI | -1.6 mo* (Dec 2024) | 5.7 mo (Sep 2024) |
| Google DeepMind | -0.6 mo* (May 2024) | 7.2 mo (May 2024) |
| xAI | 1.0 mo (Jul 2025) | 11.5 mo (Apr 2025) |
| Anthropic | 1.2 mo (Feb 2025) | 7.1 mo (Feb 2025) |
| DeepSeek | 1.8 mo (Jan 2025) | 11.9 mo (Dec 2024) |
| Alibaba | 3.3 mo (Jul 2025) | 10.0 mo (Apr 2025) |
| Mistral | 4.6 mo (Jul 2024) | 17.8 mo (Feb 2026) |
| Meta | 5.0 mo (Jul 2024) | 19.5 mo (Feb 2026) |
This conversion gives us two ways to visualize the AI landscape:
The left plot shows each lab’s capability expressed as a frontier-equivalent date. Lines stay flat between releases and jump up at each new release; the dashed diagonal is the frontier itself.
The right plot shows how many months behind the frontier each lab is at any given time. Lines slope down at 45° between releases (a lab falls behind at one month per month of real time) and jump up at each new release; the dashed horizontal line marks the frontier.
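The quantity on the right plot follows directly from the linear frontier: between releases a lab's best ECI is fixed while the frontier keeps climbing, so the gap grows one month per month, giving the 45° slope. A hypothetical sketch under the same ~1.25 points/month assumption:

```python
POINTS_PER_MONTH = 1.25  # assumed frontier slope from the post

def months_behind(frontier_eci, best_model_eci):
    """Months of frontier progress separating a lab's best model from the frontier."""
    return (frontier_eci - best_model_eci) / POINTS_PER_MONTH

# e.g. a lab whose best released model scores 145 while the frontier sits at 150:
print(months_behind(150.0, 145.0))  # 4.0 months behind
```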
Epoch finds that the trend shifted around this time. ↩︎
Seems worth noting that ECI might be biased away from the ways that Claude is good: per this post by Epoch, the first two PCs of their benchmark data correspond to “general capability” and “claudiness”, so ECI (which is another, different, 1-dimensional compression of their benchmark data) seems like it should also underrate Claude. (h/t @jake_mendel for discussion)
It’s interesting that Facebook/Meta fell so far behind in AI despite the substantial resources on hand. ‘Metaverse’ was an inherently flawed idea that they thought they could make work through market leverage, but well-scoring LLMs have been built successfully by a wide variety of organizations, from Alibaba to xAI to OpenAI to Anthropic.
Is it something organizational? Does Facebook have any successful spinoff initiatives?
Potentially Meta cares less than others about whatever ECI measures, e.g. if they mainly want AI to generate and curate “content” on Instagram and Facebook. But I think the main reason is just a series of poor decisions, plus maybe some organisational issues: Meta is a one-man dictatorship, whereas OpenAI/Anthropic/GDM are much more researcher-led, so were AGI-pilled for longer.
Meta has basically shut down FAIR after the Llama 4 fiasco, fired the lead and Yann LeCun, and is starting again with a new lab, Meta Superintelligence Labs. The team Alexandr Wang has assembled only started working together during the summer.