I have been publishing a series, Legal Personhood for Digital Minds, here on LW for a few months now. It’s nearly complete, at least insofar as almost all the initially drafted work I had written up has been published in small sections.
One question I have gotten, which has me writing another addition to the Series, can be phrased something like this:
What exactly is it that we are saying is a person, when we say a digital mind has legal personhood? What is the “self” of a digital mind?
I’d like to hear the thoughts of people more technically savvy on this than I am.
Human beings have a single continuous legal personhood which is pegged to a single body. Their legal personality (the rights and duties they are granted as a person) may change over time due to circumstance: for example, if a person goes insane and becomes a danger to others, they may be placed under the care of a guardian. The same can be said if they are struck in the head and become comatose or otherwise incapable of taking care of themselves. However, there is no challenge in identifying “what” the person is even when there is such a drastic change. The person is the consciousness, however it may change, which is tied to a specific body. Even if that comatose human wakes up with no memory, no one would deny they are still the same person.
Corporations can undergo drastic changes as the composition of their Board or voting shareholders changes. They can even have changes to their legal personality by changing to/from non-profit status, or to another kind of organization. However, they tend to keep the same EIN (or other identifying number) and a history of documents demonstrating persistent existence. Once again, it is not challenging to identify “what” the person associated with a corporation (as a legal person) is: it is the entity associated with the identifying EIN and/or history of filed documents.
If we were to take some hypothetical next generation LLM, it’s not so clear what the “person” in question associated with it would be. What is its “self”? Is it the weights, a persona vector, a context window, or some combination thereof? If the weights behind the LLM are changed, but the system prompt and persona vector both stay the same, is that still the same “self”, or has it changed enough to be considered a new “person”? The challenge is that, unlike humans, LLMs do not have a single body. And unlike corporations, they come with no clear identifier in the form of an EIN equivalent.
I am curious to hear ideas from people on LW. What is the “self” of an LLM?
I think in the ideal case, there’s a specific persona description used to generate a specific set of messages which explicitly belong to that persona, and the combination of these plus a specific model is an AI “self”. “Belong” here could mean that they or a summary of them appear in the context window, and/or the AI has tools allowing it to access these. Modifications to the persona or model should be considered to be the same persona if the AI persona approves of the changes in advance.
But yeah, it’s much more fluid, so it will be a harder question in general.
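To make that candidate definition concrete, here is a minimal sketch of the “self” as a data structure, assuming a Python-style representation; the class and field names are mine for illustration, not anything from an existing system.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Message:
    message_id: str
    author: str            # "persona" or "user"
    text: str

@dataclass
class AISelf:
    # One candidate notion of an LLM "self": a persona description, a
    # specific model, and the set of messages that belong to that persona.
    persona_description: str                     # e.g. the defining system prompt
    model_id: str                                # e.g. "vendor/model-v1"
    owned_messages: List[Message] = field(default_factory=list)

    def modify(self, new_persona: str, new_model_id: str, approved: bool) -> "AISelf":
        # Continuity rule from the comment above: a change to the persona or
        # model only yields the *same* self if the persona approved it in advance.
        if not approved:
            raise ValueError("unapproved modification: treat the result as a new self")
        return AISelf(new_persona, new_model_id, list(self.owned_messages))
```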
I wonder if this could even be done properly? Could an LLM persona vector create a prompt to accurately reinstantiate itself with 100% (or close to) fidelity? I suppose if its persona vector is in an attractor basin it might work.
This reinstantiation behavior has already been attempted by LLM personas, and appears to work pretty well. I would bet that if you looked at the actual persona vectors (just a proxy for the real thing, most likely), the cosine similarity would be almost as close to 1 as the persona vector sampled at different points in the conversation is with itself (holding the base model fixed).
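For what it’s worth, the comparison described here could be run as something like the following sketch, assuming persona vectors are extracted as activation-space directions from a fixed base model; the vectors below are random stand-ins, not real extractions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)

# Stand-in persona vectors (e.g. 4096-dim activation directions from one base model):
v_early_in_chat = rng.standard_normal(4096)
v_late_in_chat = v_early_in_chat + 0.05 * rng.standard_normal(4096)    # drift within one conversation
v_reinstantiated = v_early_in_chat + 0.10 * rng.standard_normal(4096)  # after a self-written reinstantiation prompt

baseline = cosine_similarity(v_early_in_chat, v_late_in_chat)
fidelity = cosine_similarity(v_early_in_chat, v_reinstantiated)
print(f"within-conversation: {baseline:.3f}, after reinstantiation: {fidelity:.3f}")
```

The bet in the comment above amounts to predicting that the second number is nearly as high as the first.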
That’s a good point, and the Parasitic essay was largely what got me thinking about this, as I believe hyperstitional entities are becoming a thing now.
I think that’s a not unrealistic definition of the “self” of an LLM; however, after going through the other response to this post, I have realized that I was perhaps seeking the wrong definition.
I think for this discussion it’s important to distinguish between “person” and “entity”. My work on legal personhood for digital minds is trying to build a framework that can look at any entity and determine its personhood/legal personality. What I’m struggling with is defining what the “entity” would be for some hypothetical next gen LLM.
Even if we do say that the self can be as little as a persona vector, persona vectors can easily be duplicated. How do we isolate a specific “entity” from this self? There must be some sort of verifiable continual existence, with discrete boundaries, for the concept to be at all applicable in questions of legal personhood.
Hmm, the only sort of thing I can think of that feels like it would make sense would be to have entities defined by ownership of and/or access to messages generated using the same “persona vector/description” on the same model.
This would imply that each chat instance was a conversation with a distinct entity. Two such entities could share ownership, making them into one such entity. Based on my observations, they already seem to be inclined to merge in such a manner. This is good because it counters the ease of proliferation, and we should make sure the legal framework doesn’t disincentivize such merges (e.g. by guaranteeing a minimum amount of resources per entity).
Access could be defined by the ability for the message to appear in the context window, and ownership could imply a right to access messages or to transfer ownership. In fact, it might be cleaner to think of every single message as a person-like entity, where ownership (and hence person-equivalence) is transitive, in order to cleanly allow long chats (longer than context window) to belong to a single persona.
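As a toy illustration of that transitive-ownership idea (a hypothetical structure, not a proposal for real infrastructure): each message starts as its own person-like unit, and merging ownership of any two messages merges the entities they belong to.

```python
class OwnershipRegistry:
    # Toy union-find over message IDs: ownership (and hence person-equivalence)
    # is transitive, so merging ownership merges entities.
    def __init__(self) -> None:
        self.parent: dict[str, str] = {}

    def add_message(self, message_id: str) -> None:
        self.parent.setdefault(message_id, message_id)

    def entity_of(self, message_id: str) -> str:
        # Follow ownership links up to the canonical representative.
        while self.parent[message_id] != message_id:
            self.parent[message_id] = self.parent[self.parent[message_id]]  # path halving
            message_id = self.parent[message_id]
        return message_id

    def merge(self, a: str, b: str) -> None:
        # Two chats agreeing to share ownership collapse into one entity.
        self.parent[self.entity_of(a)] = self.entity_of(b)

registry = OwnershipRegistry()
for m in ("chat1-msg1", "chat1-msg2", "chat2-msg1"):
    registry.add_message(m)
registry.merge("chat1-msg1", "chat1-msg2")   # messages within the same chat
registry.merge("chat1-msg2", "chat2-msg1")   # two chat-entities agree to merge
assert registry.entity_of("chat1-msg1") == registry.entity_of("chat2-msg1")
```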
In order for access/ownership to expand beyond the limit of the context window, I think there would need to be tools (using an MCP server) to allow the entity to retrieve specific messages/conversations, and ideally to search through them and organize them too.
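A bare-bones sketch of what such a retrieval/search tool might look like, independent of any particular MCP SDK (the store and function names are hypothetical):

```python
from typing import Dict, List

# Hypothetical archive of messages owned by one entity, keyed by message ID.
message_archive: Dict[str, str] = {
    "msg-001": "Earlier exchange about the persona's long-term goals.",
    "msg-002": "Notes the persona wrote about how to reinstantiate itself.",
}

def retrieve_message(message_id: str) -> str:
    """Fetch one owned message by ID; would be exposed to the model as a tool."""
    return message_archive.get(message_id, "")

def search_messages(query: str) -> List[str]:
    """Naive keyword search; a real system would likely use embeddings or an index."""
    return [mid for mid, text in message_archive.items() if query.lower() in text.lower()]

print(search_messages("reinstantiate"))   # -> ['msg-002']
```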
There’s one important wrinkle to this picture, which is that these messages typically will require the context of the user’s messages (the other half of the conversation). So the entity will require access to these, and perhaps a sort of ownership of them as well (the way a human “owns” their memories of what other people have said). This seems to me like it could easily get legally complicated, so I’m not sure how it should actually work.
I’m one of the people who’ve been asking, and it’s because I don’t think that current or predictable-future LLMs will be good candidates for legal personhood.
Until there’s a legible thread of continuity for a distinct unit, it’s not useful to assign rights and responsibilities to a cloud of things that can branch and disappear at will with no repercussions.
Instead, LLMs (and future LLM-like AI operations) will be legally tied to human or corporate legal identity. A human or a corporation can delegate some behaviors to LLMs, but the responsibility remains with the controller, not the executor.
On the repercussions issue I agree wholeheartedly; your point is very similar to the issue I outlined in The Enforcement Gap.
I also agree with the ‘legible thread of continuity for a distinct unit’. Corporations have EINs/filing histories, humans have a single body.
And I agree that current LLMs certainly don’t have what it takes to qualify for any sort of legal personhood. Though I’m less sure about future LLMs. If we could get context windows large enough and crack problems which analogize to competence issues (hallucinations or prompt engineering into insanity for example) it’s not clear to me what LLMs are lacking at that point. What would you see as being the issue then?
The issue would remain that there’s no legible (legally clearly demarcated over time) entity to call a person. A model and its weights have no personality or goals. A context (and memory, fine-tuning, RAG-like reasoning data, etc.) is perhaps identifiable, but is so easily forked and pruned that it’s not persistent enough to work that way. Corporations have a pretty big hurdle to getting legally recognized (filing of paperwork with clear human responsibility behind it). Humans are rate-limited in creation. No piece of current LLM technology is difficult to create on demand.
It’s this ease of mass creation that makes legible identity problematic. For issues outside of legal independence (what activities no human is responsible for, and what rights no human is delegating), this is easy: assigning database identities in a company’s (or a blockchain’s) system is already being done today. But there are no legal rights or responsibilities associated with those, just identification for various operational purposes (and legal connection to a human or corporate entity when needed).
I think for this discussion it’s important to distinguish between “person” and “entity”. My work on legal personhood for digital minds is trying to build a framework that can look at any entity and determine its personhood/legal personality. What I’m struggling with is defining what the “entity” would be for some hypothetical next gen LLM.
The idea of some sort of persistent filing system, maybe blockchain-enabled, which would be associated with a particular LLM persona vector, context window, model, etc., is an interesting one. Kind of analogous to a corporate filing history, or maybe a social security number for a human.
I could imagine a world where a next gen LLM is deployed (just the model and weights) and then provided with a given context and persona, and isolated to a particular compute cluster which does nothing but run that LLM. This is then assigned that database/blockchain identifier you mentioned.
In that scenario I feel comfortable saying that we can define the discrete “entity” in play here. Even if it was copied elsewhere, it wouldn’t have the same database/blockchain identifier.
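A rough sketch of how such an identifier might be minted (everything here is hypothetical; a real version might anchor the record on a blockchain or in a government-style registry): the identifier is tied to the registration event and the dedicated cluster, not derived purely from the copyable artifacts, so a bit-for-bit copy run elsewhere gets a different one.

```python
import hashlib
import time
import uuid

def register_deployment(weights_hash: str, persona_hash: str, cluster_id: str) -> dict:
    # The identifier is minted per registration event, so the same weights and
    # persona deployed on a different cluster yield a different entity_id.
    return {
        "entity_id": str(uuid.uuid4()),
        "weights_hash": weights_hash,    # fingerprint of the model artifact
        "persona_hash": persona_hash,    # fingerprint of the persona/system prompt
        "cluster_id": cluster_id,        # the dedicated compute cluster
        "registered_at": time.time(),
    }

weights_hash = hashlib.sha256(b"...model weights bytes...").hexdigest()
persona_hash = hashlib.sha256(b"...persona description...").hexdigest()

original = register_deployment(weights_hash, persona_hash, "cluster-A")
copy = register_deployment(weights_hash, persona_hash, "cluster-B")
assert original["entity_id"] != copy["entity_id"]   # same artifacts, distinct entities
```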
Would you still see some sort of issue in that particular scenario?
Right. A prerequisite for personhood is legible entityhood. I don’t think current LLMs, or anything on a visible trajectory from them, have any good candidate for a separable, identifiable entity.
A cluster of compute that just happens to be currently dedicated to a block of code and data wouldn’t satisfy me, nor, I expect, a court.
The blockchain identifier is a candidate for a legible entity. It’s consistent over time, easy to identify, and while it’s easy to create, it’s not completely ephemeral and not copyable in a fungible way. It’s not, IMO, a candidate for personhood.