Just to be sure: as in the METR results, ‘horizon’ here means ‘the time needed for humans with appropriate expertise to complete the task’, correct? I assume so, but it would be useful to make that explicit, especially since many people who skimmed the METR results initially got the impression that it meant ‘the time needed for the model to complete the task’.