I’d rather say that RLHF+’ed chatbots are upon-reflection-not-so-shockingly sycophantic, since they have been trained to satisfy their conversational partner.
you’re correct; this was a reflection of my own expectations about kindness more than any exceptional qualities of these chatbots. i reread my submission here somewhat recently and thought “oh god, this does not contain the insight i thought it did”.