METR uses the 10x researcher speedup, as described by Ryan Greenblatt (linked below), as an important threshold of concern. The choice of 10x as the constant seems quite important here because METR reports, compared to most other independent model assessments, seem very likely to influence lab and government policy decisions.
Is there any work explaining why 10x, and not 15x or 5x?
Greenblatt laying out 3x, 5x, and 10x acceleration levels qualitatively: https://www.alignmentforum.org/posts/LjgcRbptarrRfJWtR/a-breakdown-of-ai-capability-levels-focused-on-ai-r-and-d
METR report relying on 10x: https://evaluations.metr.org/gpt-5-1-codex-max-report/#extrapolating-on-trend-improvements-in-next-6-months