Alignment researchers are clearly not in charge of the path we take to AGI.
If that’s the case, we’re doomed no matter what we try. So we had better back up and change it.
Don’t springboard toward AGI by applying RL to LLMs; you will get early performance gains, but alignment will fail. We need to build something big that we can understand, and we probably need to build something small that we can understand first.