I think there’s more evidence that Ilya thought self-play was key. From the Musk vs. Altman Emails:
Self play as a key path to AGI:
Self play in multiagent environments is magical: if you place agents into an environment, then no matter how smart (or not smart) they are, the environment will provide them with the exact level of challenge, which can be faced only by outsmarting the competition. So for example, if you have a group of children, they will find each other’s company to be challenging; likewise for a collection of super intelligences of comparable intelligence. So the “solution” to self-play is to become more and more intelligent, without bound.
Self-play lets us get “something out of nothing.” The rules of a competitive game can be simple, but the best strategy for playing this game can be immensely complex. [motivating example: https://www.youtube.com/watch?v=u2T77mQmJYI].
Training agents in simulation to develop very good dexterity via competitive fighting, such as wrestling. Here is a video of ant-shaped robots that we trained to struggle: <redacted>