For one thing, my actual expectation is that LLMs will be a helpful research tool for the humans discovering the next AI paradigm, rather than the LLMs discovering the next AI paradigm themselves (see Foom & Doom §1.4.4).
For another thing, even if I’m wrong about that, note that we have “very powerful” humans “to do safety work on RL agents” right now, but it turns out that those humans are overwhelmingly uninterested in doing so. Instead, there’s maybe 1000× more money and effort going into figuring out how to make RL agents more powerful than into figuring out how to make them safe. (See We need a field of Reward Function Design.) I don’t see any reason to expect this situation to change if it’s LLMs doing the research instead of humans.
That said, if people have ideas about how to make a near-future world full of LLMs a wiser world than the world of today, then great, I endorse that goal and wish them luck :)