Bloody hell, this is a seriously pessimistic update: even if alignment turns out to be somewhat easy, the human race might still be doomed by an AI company that trains an agent.
To be honest, this is sad but sort of inevitable, given that companies want to maximize capabilities, whereas alignment researchers would be fine with non-agentic systems assisting the research.