PhD @ EPFL DLab with Robert West
I do research on LLM Personas: where they come from, how to apply them, and how to make models safer.
Good question — and yes, in our setup the persona is built from scratch, so in principle it could be any persona, including a real one. The synthetic version is advantageous in our view mostly because it’s very controlled: we can specify exactly what it values and how it should reason.
Using a real person seems possible in theory, but raises several hard questions:
Whose persona? Picking someone is already very hard. Who is the most aligned person in the world? Aligned according to whose values? Is it even ethical to bake one specific person’s persona into a model?
How do we actually measure one human’s persona? The term is useful, but even psychologists struggle to define it fully. We have instruments, but they don’t give us a complete picture.
Humans are much more complex than a persona. If we “hire” a real person, we’re not getting just a persona — we’re getting a whole personality with multi-level structure: inconsistencies, moods, changing views, contradictions.
To summarize: I think this is a very relevant question, and maybe one day we’ll get closer to it. But for now the drawbacks and limitations are substantial. Hopefully we’ll get there eventually — if it ends up serving the good!
Indeed, that was one of our goals: to build a lever for inference time. If we build a synthetic persona and know it exists, it should be easy to track it at inference time, modify it, and so on. I love how “The Assistant Axis” used activation capping to prevent persona drift, and we are aiming to locate our synthetic persona mechanistically and see what we can do with it at inference time!
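
To make the “lever” idea a bit more concrete, here is a minimal sketch of what capping along a direction could look like. It is only an illustration, not the procedure from “The Assistant Axis” and not our own method: the helper `cap_along_direction`, the toy vectors, and the bounds are all hypothetical, and it assumes a persona direction has already been found by some separate analysis.

```python
import math


def cap_along_direction(activation, direction, lower, upper):
    """Clamp the component of `activation` along `direction` to [lower, upper].

    Hypothetical helper for illustration only: `direction` stands in for a
    persona direction that some other analysis is assumed to have located.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]

    # Scalar projection of the activation onto the unit direction.
    proj = sum(a * u for a, u in zip(activation, unit))

    # Clamp the projection to the allowed range.
    capped = min(max(proj, lower), upper)

    # Shift the activation only along the direction; the orthogonal
    # component is left untouched.
    return [a + (capped - proj) * u for a, u in zip(activation, unit)]


# Toy example: the first coordinate plays the role of the persona direction.
drifted = [3.0, 4.0]
steered = cap_along_direction(drifted, [1.0, 0.0], lower=-1.0, upper=1.0)
```

In a real model, something like this would presumably run on hidden states at chosen layers during generation; which layers and what bounds to use are open choices that this sketch does not settle.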