To be clearer: the LLM can only do this by memorizing something about token 101830, not by counting the characters it sees. It can memorize that token 101830 is spelled s-t-r-a-w-b-e-r-r-y, print that out, and then count the characters in its own output. My point is just that it can’t count the characters in the input directly.
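Here is a toy sketch of what I mean. It is not a real tokenizer: the one-entry vocabulary is made up, and I am simply assuming 101830 maps to "strawberry" as in my comment. The point is that the input is an opaque ID, so the letter count has to go through a memorized spelling.

```python
# Toy illustration only: a hypothetical one-entry vocabulary.
# The ID 101830 is taken from the comment above and is an assumption here.
TOY_VOCAB = {101830: "strawberry"}


def model_input(text):
    """Return what the model receives: opaque token IDs, not characters."""
    inverse = {s: i for i, s in TOY_VOCAB.items()}
    return [inverse[text]]


def count_letter_via_spelling(token_ids, letter, memorized_spellings):
    """Count a letter by first recalling each token's memorized spelling."""
    spelled = "".join(memorized_spellings[t] for t in token_ids)
    return spelled.count(letter)


ids = model_input("strawberry")
print(ids)  # the input is a single integer; there are no characters to count
print(count_letter_via_spelling(ids, "r", TOY_VOCAB))
```

Nothing in `ids` says how many r's there are. The count only becomes available after the lookup into the memorized spelling.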
Sure. But my point is that the LLM doesn’t lack the knowledge needed to solve the task. It has that knowledge. It just doesn’t know how to reach it, or that it should try.