By the way, if we take the exponential (rather than sigmoidal) constant hazard rate model[1], there’s an easy mental trick for extrapolating to different success rates and different time horizons from a given measurement: the Rule of 72[2].
The rule of 72 says that, for smallish percentage rates r (i.e. ±r% per period)[3], the half-life in periods is roughly t = 72/r; or conversely, to halve in time t you need r = 72/t per period.
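As a quick numerical sanity check (a sketch of my own, not from the original; the function names are mine): under steady decay of r% per period, the exact half-life is ln 2 / −ln(1 − r/100), which the rule approximates as 72/r.

```python
import math

def exact_half_life(r_percent: float) -> float:
    # Periods for a quantity decaying by r% per period to halve:
    # solve (1 - r/100)^t = 1/2 for t.
    return math.log(2) / -math.log(1 - r_percent / 100)

def rule_of_72(r_percent: float) -> float:
    # The mental-arithmetic approximation.
    return 72 / r_percent

for r in (5, 10, 20):
    print(f"r={r}%: exact {exact_half_life(r):.2f} periods, "
          f"rule of 72 says {rule_of_72(r):.2f}")
```

For decay the rule slightly overestimates (the 'rule of 69.3' variant is the small-r limit), but it stays well within Fermi tolerance.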
Ord writes:

Here are some useful comparisons for how the predicted time horizons over which an agent could get very high success rates compare to the measured time horizon for a 50% success rate:

T80 ≈ 1/3 T50
T90 ≈ 1/7 T50
T99 ≈ 1/70 T50
T99.9 ≈ 1/700 T50

[and each additional ‘nine’ of reliability beyond this divides the time horizon by 10]
Written this way, it looks a bit cryptic. But the rule of 72 makes this easy to estimate.
When we say T80 or T90, we’re actually saying, ‘what is the 20% failure horizon?’ or ‘what is the 10% failure horizon?’ respectively.
Well, if the half-life (the 50% failure horizon) is T50, the rule says the 10% failure horizon is (10/72) T50 ≈ 1/7 T50[4]. Similarly T99 ≈ (1/72) T50, etc. (whence ‘each additional ‘nine’… divides the time horizon by 10’). The rule goes skew-whiff for higher percentages, which is why T80 differs a little from what the rule would suggest (but not by much).
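Under the constant hazard rate model these ratios also have a closed form, so the quoted fractions can be checked directly: the success probability over a horizon T is s(T) = e^(−λT), so T_s = −ln(s)/λ, and dividing by T50 = ln(2)/λ gives T_s/T50 = log2(1/s), independent of λ. A quick sketch (my code, not from Ord or the paper):

```python
import math

def horizon_ratio(s: float) -> float:
    # T_s / T50 under a constant hazard rate:
    # s(T) = exp(-lam*T)  =>  T_s = -ln(s)/lam,
    # and dividing by T50 = ln(2)/lam gives log2(1/s).
    return math.log2(1 / s)

for s, quoted in [(0.80, 1/3), (0.90, 1/7), (0.99, 1/70), (0.999, 1/700)]:
    print(f"T{100*s:g}: exact {horizon_ratio(s):.5f} vs quoted {quoted:.5f}")
```

The exact values (0.322, 0.152, 0.0145, 0.00144) land close to the quoted 1/3, 1/7, 1/70, 1/700, with T80 the roughest of the four, as noted above.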
Interestingly, when I was talking with Beth and Megan about this time horizon stuff in early 2024, we discussed the constant hazard rate model as a simple default (though we were all skeptical that no ‘error recovery’ at all was plausible as a model, except for very fatal errors!). So I’m mildly surprised that it didn’t make it into the paper in the end.
[1] The one Ord talks about.
[2] Variously ‘rule of 70’, ‘rule of 69.3’ and other rules; usually it doesn’t make much difference for Fermi-ish estimations.
[3] ‘Smallish’ because it relies on approximating a logarithm as linear, which works for smallish rates.
[4] The rate r we’re interested in is 10. The half-life t is T50/T10, writing T10 for the 10% failure horizon (T90 in the notation above), i.e. how many 10% failure horizons make up the 50% failure horizon. So the rule t = 72/r says T50/T10 = 72/10.