Because they are all planning to build agents that will have optimization pressures, and RL-type failures apply when you build RL systems, even if it’s on top of LLMs.
Because they are all planning to build agents that will have optimization pressures, and RL-type failures apply when you build RL systems, even if it’s on top of LLMs.