I ran a quick, low-effort experiment with 50% secure code and 50% insecure code some time ago, and I’m pretty sure this led to no emergent misalignment.
Woah, I absolutely would not have predicted this given the rest of your results!