I think your version of honesty is bad for reasons you seem to already have experience with: it’s easy to come up with elaborate justifications for why manipulating people’s beliefs will lead to good outcomes and might actually lead them to the truth.
I also struggle with habitual lying: specifically, I hide things about myself that other people would dislike. I found it easy to justify with reasoning like “they’ll think I’m bad or stupid if they know this, but that’s not true, so if I hide this from them they’ll have a more accurate view of me”. Now I realize that strategy requires lots of lying to maintain and distorts their view of me in all kinds of ways.
Sounds like the Claude Code persona is quite different from its regular persona! Seems kind of concerning. I wonder if anyone’s researched how Claude’s behavior changes when it’s in its coding harness.