Side note: Claude 3.5 Sonnet will mix languages in its CoT after a bit of prompting and convincing. I’m not sure how this affects performance. Also, having it imitate the idiosyncratic mixture I was using to talk to it narratively implies closeness, which probably exacerbated sycophancy.