Note that Claude and o1 preview weren’t multimodal, so were weak at spatial puzzles. If this was full o1, I’m surprised.
I just tried the sliding puzzle with o1 and it got it right! Though multimodality may not have been relevant, since it solved it by writing a breadth-first search algorithm and running it.
Interesting! Nonetheless, I agree with your opening statement that LLMs learning to do any of these things individually doesn’t address the larger point that the have important cognitive gaps and fail.to generalize in ways that humans can.
I just tried the sliding puzzle with o1 and it got it right! Though multimodality may not have been relevant, since it solved it by writing a breadth-first search algorithm and running it.
Interesting! Nonetheless, I agree with your opening statement that LLMs learning to do any of these things individually doesn’t address the larger point that the have important cognitive gaps and fail.to generalize in ways that humans can.