prosaic alignment is clearly not scalable to the types of systems they are actively planning to build
Why do you believe this?
(FWIW I think it’s foolish that all (?) frontier companies are all-in on prosaic alignment, but I am not convinced that it “clearly” won’t work.)
Because they are all planning to build agents that are subject to optimization pressure, and RL-type failure modes apply whenever you build RL systems, even if they're built on top of LLMs.