For some models (especially older ones), the Artificial Analysis Intelligence Index score is labeled as “Estimate (independent evaluation forthcoming)”. It is unclear how these scores are determined, and they may not be reliable. The Artificial Analysis API does not clearly label such estimates, and I did not manually remove them for the secondary analysis. Ideally, the capability levels that include these models (probably the lower levels, for the most part) would be weighted less, but I don’t do this because I am uncertain which models have Estimates and which have independently tested scores.
IMO this is a potentially significant issue that this post should have spent more time addressing, since it means that the earlier sections of the trend lines are coming from a source we know nothing about.
The ECI goes back a bit farther than what is shown on the hub, and we use the older data for this sort of analysis, checking whether a break in the trend line is a good fit: https://epoch.ai/data-insights/ai-capabilities-progress-has-sped-up
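For readers who want a feel for what "checking whether a break in the trend line is a good fit" can look like, here is a minimal sketch. It is not the method used in the linked analysis. It assumes synthetic data, ordinary least squares on each segment, and a BIC comparison, and the names (`linear_rss`, `best_break`, `bic`) are made up for illustration.

```python
import numpy as np

def linear_rss(x, y):
    """Residual sum of squares of an ordinary least-squares line through (x, y)."""
    coeffs = np.polyfit(x, y, 1)
    resid = y - np.polyval(coeffs, x)
    return float(resid @ resid)

def best_break(x, y, min_points=3):
    """Try every candidate split, fit a separate line on each side,
    and return (first x of the second segment, total RSS) for the best split."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = None
    for i in range(min_points, len(x) - min_points + 1):
        rss = linear_rss(x[:i], y[:i]) + linear_rss(x[i:], y[i:])
        if best is None or rss < best[1]:
            best = (float(x[i]), rss)
    return best

def bic(rss, n, k):
    """Gaussian BIC up to an additive constant; lower is better."""
    return n * np.log(rss / n) + k * np.log(n)

# Synthetic (assumed) data: slope 1 before x = 6, slope 3 after, plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 41)
y = np.where(x < 6, x, 6 + 3 * (x - 6)) + rng.normal(0, 0.2, size=x.size)

one_rss = linear_rss(x, y)
break_x, two_rss = best_break(x, y)

# Parameter counts: 2 for a single line; 2 lines plus a break location = 5.
bic_one = bic(one_rss, len(x), 2)
bic_two = bic(two_rss, len(x), 5)
print(f"break near x = {break_x:.2f}; BIC one line = {bic_one:.1f}, two lines = {bic_two:.1f}")
```

A lower BIC for the two-segment fit suggests the break earns its extra parameters. With real data the earliest points would also matter a lot here, which is why the reliability of the older scores is relevant to this kind of check.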