Humans already have an infinite horizon in this sense (under some infeasible idealized methodology). Horizon metrics quantify progress towards overcoming an obstruction that exists in current LLMs, so that we might see the gradual impact of even things like brute scaling.
Humans are not yet the singularity, and there might be more obstructions in LLMs that prevent them from doing better than humans. Perhaps at some point there will be a test-time scaling method that doesn’t seem to have a bound on time horizons given enough test-time compute, but even then it’s possible that feasible hardware wouldn’t make it either faster or more economical than humans at very long horizon lengths.