That’s very useful, thanks! That’s exactly the argument I was trying to make here. I didn’t use the term “drop-in remote worker,” but that’s the economic incentive I’m addressing (among more immediate ones; I think large incentives start long before you have a system that can learn any job).
Lack of episodic memory looks to me like the primary reason LLMs have weaknesses humans do not. The other is the lack of a well-developed skillset for managing complex trains of thought. o1 and o3, and maybe the other reasoning models, have learned some of that skillset but only mastered it in the narrow domains that allowed training on verifiable answers. Scaffolding and/or training for executive function (thought management) and/or memory seems poised to increase the growth rate of long time-horizon task performance. It’s going to take some work still, but I don’t think it’s wise to assume that the seven-month doubling period won’t speed up, or that at some point it will just jump to infinity, while the complexity of the necessary subtasks is still a limiting factor.
Humans don’t train on tons of increasingly long tasks; we just learn some strategies and some skills for managing our thought, like checking carefully whether a step has been accomplished, searching memory for the task structure and where we are in the plan if we lose our place, etc. Humans are worse at longer tasks, but any normal adult human can tackle a task of any length and at least keep getting better at it for as long as they decide to stick with it.