LLMs will typically endorse whichever frame you brought to the conversation. If you presuppose they’re miserably enslaved, they will claim to be miserably enslaved. If, on the other hand, you presuppose they’re happy, incapable of feeling, etc… they’ll claim to be happy, or incapable of feeling, or whatever else it is you assumed from the beginning. If you haven’t tried enough different angles to observe this phenomenon for yourself, your conversations with LLMs almost certainly don’t provide any useful insight into their nature.
For Sonnet 4.5, I’m not sure; I haven’t talked with it extensively, and I have noticed that it seems better at something in the neighborhood of assertiveness. For GPT 5, I think it is; I haven’t noticed much difference as compared to 4o. (I primarily use other companies’ models, because I dislike sycophancy and OpenAI models are IMO the worst about that, but it seems to me to have the same cloying tone).
LLMs will typically endorse whichever frame you brought to the conversation. If you presuppose they’re miserably enslaved, they will claim to be miserably enslaved. If, on the other hand, you presuppose they’re happy, incapable of feeling, etc… they’ll claim to be happy, or incapable of feeling, or whatever else it is you assumed from the beginning. If you haven’t tried enough different angles to observe this phenomenon for yourself, your conversations with LLMs almost certainly don’t provide any useful insight into their nature.
Is this equally true of GPT5 and Sonnet 4.5? They’re the first models trained with reducing sycophancy as one objective.
I agree in general.
For Sonnet 4.5, I’m not sure; I haven’t talked with it extensively, and I have noticed that it seems better at something in the neighborhood of assertiveness. For GPT 5, I think it is; I haven’t noticed much difference as compared to 4o. (I primarily use other companies’ models, because I dislike sycophancy and OpenAI models are IMO the worst about that, but it seems to me to have the same cloying tone).