It will not meaningfully generalize beyond domains with easy verification
Why can’t we give every domain automated verification? (I won’t claim it’s easy, but it’s easy enough to do with finite resources.) Agency, for instance, is verifiable in competitive games of arbitrary difficulty and scale: just check who won. DeepMind already did this to some degree a year ago with language models and virtual agents: https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/
Every other trait we care about is instrumental to agency to some degree, and the games can be customized to focus on particular aspects, just as a class in school focuses on a particular subject.
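To make "just check who won" concrete, here is a minimal sketch of outcome-based verification in Python. Everything in it is made up for illustration: the toy game (higher number wins the round), the two policies, and the names `play_match` and `win_rate` are my own assumptions, not anything from DeepMind's work. The point is only that the verifier never inspects how an agent plays, just the result.

```python
import random
from typing import Callable

# A policy is any function that picks a move given a random source.
# Hypothetical toy game: each side picks an integer; the higher one wins the round.
Policy = Callable[[random.Random], int]


def play_match(a: Policy, b: Policy, rng: random.Random, rounds: int = 5) -> int:
    """Return +1 if `a` wins the match, -1 if `b` wins, 0 for a draw."""
    score = 0
    for _ in range(rounds):
        x, y = a(rng), b(rng)
        score += (x > y) - (x < y)  # +1, -1, or 0 for this round
    return (score > 0) - (score < 0)


def win_rate(a: Policy, b: Policy, matches: int = 200, seed: int = 0) -> float:
    """Automated verifier: the fraction of matches that `a` wins against `b`."""
    rng = random.Random(seed)
    wins = sum(play_match(a, b, rng) == 1 for _ in range(matches))
    return wins / matches


def uniform_policy(rng: random.Random) -> int:
    return rng.randint(0, 9)  # plays any number from 0 to 9


def biased_policy(rng: random.Random) -> int:
    return rng.randint(5, 9)  # a "stronger" agent that avoids low numbers


if __name__ == "__main__":
    print("biased vs uniform:", win_rate(biased_policy, uniform_policy))
    print("uniform vs biased:", win_rate(uniform_policy, biased_policy))
```

Swapping in a harder game or a different opponent pool changes what is being tested without changing the verifier, which is the "customize the game like a class" idea in miniature.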