Another is that humans are not infinitely intelligent; their position on the scale just says that they can make indefinite progress on a problem given infinite time, which they don’t have.
It’s not clear to me that a human, using their brain and a go board for reasoning, could beat AlphaZero even if you gave them infinite time.
For most problems, there are diminishing returns to additional human reasoning steps. For many reasoning tasks, humans are also influenced by a lot of biases. In superforecasting, for example, I don’t know of any way to remove the biases inherent in my forecasts just by adding more reasoning cycles.
It might be possible for a model to rate the reasoning quality of deep research outputs and then train on the top 10% of deep research queries.
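To make that concrete, here is a minimal sketch of the filter-then-train idea, assuming some grader that scores each transcript; `judge_quality`, the 10% cutoff, and the JSONL output format are illustrative stand-ins, not any lab’s actual pipeline.

```python
# Hypothetical sketch: score deep-research transcripts with a judge,
# keep the best-scoring 10%, and dump them as fine-tuning data.
import json
import random


def judge_quality(query: str, answer: str) -> float:
    """Placeholder judge: a real setup would ask a grader model or rubric for a score."""
    return random.random()


def top_decile(records, score_key="score"):
    """Keep the best-scoring 10% of records (at least one)."""
    ranked = sorted(records, key=lambda r: r[score_key], reverse=True)
    keep = max(1, len(ranked) // 10)
    return ranked[:keep]


if __name__ == "__main__":
    # Pretend we collected 1,000 deep-research transcripts.
    transcripts = [
        {"query": f"research question {i}", "answer": f"long report {i}"}
        for i in range(1000)
    ]
    for t in transcripts:
        t["score"] = judge_quality(t["query"], t["answer"])

    best = top_decile(transcripts)

    # Write the selected examples as prompt/completion pairs for fine-tuning.
    with open("deep_research_top10.jsonl", "w") as f:
        for t in best:
            f.write(json.dumps({"prompt": t["query"], "completion": t["answer"]}) + "\n")
```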
If you build language model-based agents, there are plenty of tasks those agents can do that come with real-world feedback. The amount of resources invested is going up a lot, and currently the big labs haven’t deployed agents.
It’s not clear to me that a human, using their brain and a go board for reasoning, could beat AlphaZero even if you gave them infinite time.
I agree, but I dispute that this example is relevant. I don’t think there is any step between “start walking on two legs” and “build a spaceship” that requires as much strictly-type-A reasoning as beating AlphaZero at go or chess. This particular capability class doesn’t seem very relevant to me.
Also, to the extent that it is relevant, a smart human with infinite time could outperform AlphaGo by programming a better chess/go computer. That may sound silly, but I actually think it’s a perfectly reasonable reply: using narrow AI to assist with brute-force cognitive tasks is something humans are allowed to do. And it’s something LLMs are also allowed to do; if they reach superhuman performance on general reasoning, and part of how they do it is by writing Python scripts for modular subproblems, we wouldn’t say that doesn’t count.