StanislavKrym comments on Claude Opus 4.6: System Card Part 1: Mundane Alignment and Model Welfare

StanislavKrym 9 Feb 2026 22:09 UTC
1 point
0
This does not require proof, only Bayesian evidence, as in the listed example:
I would like to add this gem: a LessWronger talked with Claude, asked it to write in English and had it produce the English answer with a Chinese postscript. @CarolusRenniusVitellius, could you come up with reasons which might have caused Claude to output the Chinese postscript? For example, if you are Chinese, then it could’ve inferred that fact from some memories not from the chat.
- CarolusRenniusVitellius 10 Feb 2026 0:44 UTC
  3 points
  0
  Parent
  Ahah I have Claude’s system prompt set to default to Chinese so I can practice. Since my speaking/writing abilities suck much worse than my reading, I also told it to nag me to write in Chinese for the sake of practice lol. This works… variably well.