Hey, thanks for your comment! Yeah, we definitely saw the same thing: when doing persona induction using in-context learning, the models realise that the user intends for them to pick up a specific persona, and they then try to infer that persona from the QA examples. So ICL persona induction is largely role-play.
That said, we don’t think this means it is unconnected to PSM. We would argue that it is connected, and is just a different type of persona induction compared to fine-tuning (SFT). The model still takes up the persona in the case of ICL, and (for many models) it takes up the characteristic behaviour of that persona, same as for SFT. The difference may just be whether or not the model is “aware” that it is playing a character.
We don’t have a crystal clear picture of when models are role-playing and when they have “realized” a persona (as defined by David Chalmers, e.g. see this thread), or of where one ends and the other begins, but we have some early experiments that show that SFT-induced personas are “realized” to a larger degree, while ICL-induced personas are usually role-play.
We plan to investigate this a lot more and study the similarities and differences between ICL-induced and SFT-induced personas!