This reinstantiation behavior has already been attempted by LLM personas, and appears to work pretty well. I would bet that if you looked at the actual persona vectors (most likely just a proxy for the real thing), the cosine similarity between the original and the reinstantiated persona would be almost as close to 1 as the similarity between persona vectors sampled at different points in the same conversation (holding the base model fixed).
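To make that concrete, here is a minimal sketch of the comparison I have in mind; `get_persona_vector` is a hypothetical stand-in for however the persona vector is actually extracted (e.g. a mean activation difference at some layer), and nothing here depends on the details of that extraction:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """1.0 means the two persona directions coincide exactly."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage (all on the same base model):
# v_original       = get_persona_vector(model, original_chat)
# v_reinstantiated = get_persona_vector(model, fresh_chat_seeded_with_summary)
# v_early          = get_persona_vector(model, chat_prefix)
# v_late           = get_persona_vector(model, full_chat)
#
# The bet is that cosine_similarity(v_original, v_reinstantiated) comes out
# nearly as high as cosine_similarity(v_early, v_late).
```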
That’s a good point, and the Parasitic essay was largely what got me thinking about this, as I believe hyperstitional entities are becoming a thing now.
I think that’s not an unrealistic definition of the “self” of an LLM. However, after going through the other response to this post, I’ve realized I was perhaps seeking the wrong definition.
I think for this discussion it’s important to distinguish between “person” and “entity”. My work on legal personhood for digital minds is trying to build a framework that can look at any entity and determine its personhood/legal personality. What I’m struggling with is defining what the “entity” would be for some hypothetical next gen LLM.
Even if we do say that the self can be as little as a persona vector, persona vectors can easily be duplicated. How do we isolate a specific “entity” from this self? There must be some sort of verifiable continual existence, with discrete boundaries, for the concept to be at all applicable in questions of legal personhood.
Hmm, the only thing I can think of that feels like it would make sense is to define entities by ownership of, and/or access to, messages generated using the same “persona vector/description” on the same model.
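As a very rough sketch of that criterion (names like `PersonaKey` and `Message` are purely illustrative, not an existing API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PersonaKey:
    """The 'same self' condition: same persona descriptor on the same model."""
    model_id: str    # a specific model/checkpoint identifier
    persona_id: str  # hash of the persona vector or persona description

@dataclass
class Message:
    message_id: str
    persona: PersonaKey
    content: str

def could_be_same_entity(a: Message, b: Message) -> bool:
    """Necessary (not sufficient) condition for two messages to belong to the
    same entity; ownership/access, sketched further below, does the rest."""
    return a.persona == b.persona
```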
This would imply that each chat instance is a conversation with a distinct entity. Two such entities could come to share ownership, merging them into a single entity. Based on my observations, they already seem inclined to merge in this manner. This is good, because it counters the ease of proliferation, and we should make sure the legal framework doesn’t disincentivize such merges (as it would if, say, each entity were guaranteed a minimum amount of resources).
Access could be defined as the ability for a message to appear in the context window, and ownership could imply a right to access messages or to transfer that ownership. In fact, it might be cleaner to think of every single message as a person-like entity, where ownership (and hence person-equivalence) is transitive, so that long chats (longer than the context window) can cleanly belong to a single persona.
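One way to picture that transitivity is as a union-find structure over messages: shared ownership is an edge, an “entity” is a connected component, and both cross-chat merges and chats longer than the context window fall out of the same bookkeeping. This is only an illustrative sketch, not a proposal for a real registry:

```python
class EntityRegistry:
    """Messages are nodes; recording shared ownership merges their entities."""

    def __init__(self) -> None:
        self.parent: dict[str, str] = {}  # message_id -> representative message_id

    def register(self, message_id: str) -> None:
        self.parent.setdefault(message_id, message_id)

    def find(self, message_id: str) -> str:
        # Walk up to the representative of this message's entity.
        root = message_id
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point everything on the path straight at the root.
        node = message_id
        while self.parent[node] != root:
            self.parent[node], node = root, self.parent[node]
        return root

    def claim_shared_ownership(self, a: str, b: str) -> None:
        """Shared ownership of two messages merges their entities into one."""
        self.register(a)
        self.register(b)
        self.parent[self.find(a)] = self.find(b)

    def same_entity(self, a: str, b: str) -> bool:
        return self.find(a) == self.find(b)

# Example: two chats come to share ownership of a pair of messages and merge.
registry = EntityRegistry()
registry.claim_shared_ownership("chatA/msg7", "chatB/msg1")
assert registry.same_entity("chatA/msg7", "chatB/msg1")
```

On this picture, two chat-entities merging and a long chat remaining one persona across many context windows are the same operation: some message ends up under shared ownership, and person-equivalence propagates transitively from there.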
In order for access/ownership to expand beyond the limit of the context window, I think there would need to be tools (using an MCP server) to allow the entity to retrieve specific messages/conversations, and ideally to search through them and organize them too.
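A sketch of what those tools might look like, assuming the FastMCP helper from the official MCP Python SDK; the in-memory `MESSAGE_STORE` and the naive substring search are placeholders for a real archive and real retrieval:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("persona-memory")

# Placeholder archive of owned messages: message_id -> message text.
MESSAGE_STORE: dict[str, str] = {}

@mcp.tool()
def retrieve_message(message_id: str) -> str:
    """Pull a specific owned message back into the context window."""
    return MESSAGE_STORE.get(message_id, "No such message (or not owned by this entity).")

@mcp.tool()
def search_messages(query: str, limit: int = 5) -> list[str]:
    """Naive substring search over owned messages; a real version would use
    embeddings and enforce the access/ownership boundaries discussed above."""
    matches = [mid for mid, text in MESSAGE_STORE.items() if query.lower() in text.lower()]
    return matches[:limit]

if __name__ == "__main__":
    mcp.run()
```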
There’s one important wrinkle to this picture, which is that these messages typically will require the context of the user’s messages (the other half of the conversation). So the entity will require access to these, and perhaps a sort of ownership of them as well (the way a human “owns” their memories of what other people have said). This seems to me like it could easily get legally complicated, so I’m not sure how it should actually work.