Bronson Schoen comments on Split Personality Training: Revealing Latent Knowledge Through Alternate Personalities (Research Report)

Bronson Schoen 16 Jan 2026 1:22 UTC

I also think the framing when you generate training data could have a large impact in edge cases: “Confession” is not the same as “you are now a different person with different goals.”

This is interesting, and I think it would be pretty impactful to show. For example, I wonder if you could show that cases which regress after confession training don’t regress with the split personality approach.