Agreed. Part of the problem is that it is hard to avoid alignment via personas. The LLMs already understand human goodness and you can elicit it with a few low-rank matrices. Personas are blocking researchers from doing real alignment work.