Matthew Khoriaty comments on Taking the Training Wheels Off: Aligning LLMs without Personas

Matthew Khoriaty 2 Jun 2026 23:57 UTC
1 point
Agreed. Part of the problem is that alignment via personas is hard to avoid. LLMs already understand human goodness, and you can elicit it with a few low-rank matrices. Personas are blocking researchers from doing real alignment work.