As for LLMs being aligned by default, I don’t have the slightest idea how Ezra even came up with this. GPT-4o has already been a super-sycophant[1] and has driven people into psychosis, despite OpenAI’s Spec prohibiting such behavior. Grok’s alignment was so fragile that a mistake by xAI turned it into MechaHitler.
In 4o’s defense, it was raised on human feedback, which is biased towards sycophancy and “demands erotic sycophants” (c) Zvi. But why would 4o drive people into a trance or psychosis?
I think that Eliezer means that mildly misaligned AIs are also highly unlikely, not that a mildly misaligned AI would also kill everyone: