Think clearly about the current AI training approach trajectory
If you start by discussing what you expect to be the outcome of pretraining + light RLHF, then you're not talking about AGI, superintelligence, or even the current frontier of how AI models are trained. Powerful, general AI requires serious RL on a diverse range of realistic environments, and the era of this has just begun. Many startups are working on building increasingly complex, diverse, and realistic training environments.
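To make "RL on a diverse range of environments" concrete, here is a minimal, purely illustrative sketch: a single policy trained by sampling from a pool of distinct toy environments rather than one fixed task. All names here (`ToyEnv`, `train`) are hypothetical and stand in for the far richer environments and algorithms actually used; the point is only the training-loop shape.

```python
import random

class ToyEnv:
    """A one-step bandit-style environment: the action matching
    `target` pays reward 1.0, everything else pays 0.0."""
    def __init__(self, target):
        self.target = target

    def step(self, action):
        return 1.0 if action == self.target else 0.0

def train(envs, actions, episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy training of per-(env, action) value estimates
    across a *pool* of environments, sampled uniformly each episode."""
    rng = random.Random(seed)
    values = {(i, a): 0.0 for i in range(len(envs)) for a in actions}
    counts = {(i, a): 0 for i in range(len(envs)) for a in actions}
    for _ in range(episodes):
        i = rng.randrange(len(envs))          # sample a diverse environment
        if rng.random() < epsilon:            # explore
            a = rng.choice(actions)
        else:                                 # exploit current estimate
            a = max(actions, key=lambda a: values[(i, a)])
        r = envs[i].step(a)
        counts[(i, a)] += 1
        # incremental average of observed reward
        values[(i, a)] += (r - values[(i, a)]) / counts[(i, a)]
    return values

envs = [ToyEnv("left"), ToyEnv("right")]
values = train(envs, ["left", "right"])
```

After training, the greedy action in each environment matches that environment's target. Real "serious RL" replaces the bandit with long-horizon, realistic tasks, but the loop structure, sampling environments and optimizing against their rewards, is the same.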
It’s kind of funny that so much arguing on LessWrong has been about why a base model might start trying to take over the world, when that’s beside the point. Of course we will eventually start RL’ing models on hard, real-world goals.
Example post / comment to illustrate what I mean.