For current systems, the answer is that, as far as anyone knows, they don’t find things rewarding or want things. But they can still run a search that optimizes a training signal, and that gives you an agent.