Focused on model welfare and legal personhood.
Stephen Martin
I think for this discussion it’s important to distinguish between “person” and “entity”. My work on legal personhood for digital minds is trying to build a framework that can look at any entity and determine its personhood/legal personality. What I’m struggling with is defining what the “entity” would be for some hypothetical next gen LLM.
The idea of some sort of persistent filing system, maybe blockchain enabled, which would be associated with a particular LLM persona vector, context window, model, etc. is an interesting one. Kind of analogous to a corporate filing history, or maybe a social security number for a human.
I could imagine a world where a next gen LLM is deployed (just the model and weights) and then provided with a given context and persona, and isolated to a particular compute cluster which does nothing but run that LLM. This is then assigned that database/blockchain identifier you mentioned.
In that scenario I feel comfortable saying that we can define the discrete “entity” in play here. Even if it was copied elsewhere, it wouldn’t have the same database/blockchain identifier.
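To make that concrete, here is a toy sketch (all names and fields are hypothetical; this is an illustration, not a real registry design) of how the deployment *event*, rather than the copyable artifacts themselves, could receive the unique number, much like a corporate filing:

```python
import hashlib
import json
import time

def register_deployment(weights_digest: str, persona_digest: str,
                        cluster_id: str, registry: list) -> str:
    """Record a deployment event and return its unique identifier.

    The identifier commits to the copyable artifacts (weights, persona)
    AND to the registration event itself (ledger position, timestamp),
    so a byte-identical copy deployed elsewhere gets a different ID.
    """
    record = {
        "weights": weights_digest,
        "persona": persona_digest,
        "cluster": cluster_id,
        "sequence": len(registry),        # position in the ledger
        "registered_at": time.time(),
    }
    entity_id = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    registry.append({"id": entity_id, **record})
    return entity_id

registry = []
original = register_deployment("w123", "p456", "cluster-A", registry)
copy     = register_deployment("w123", "p456", "cluster-B", registry)
assert original != copy   # same weights and persona, distinct entities
```

Because the hash covers the registration event itself, even a byte-identical copy deployed later or elsewhere produces a different entry, and so a different "entity".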
Would you still see some sort of issue in that particular scenario?
I wonder if this could even be done properly? Could an LLM persona vector create a prompt to accurately reinstantiate itself with 100% (or close to) fidelity? I suppose if its persona vector is in an attractor basin it might work.
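A toy way to picture that attractor-basin intuition (a numerical sketch only, not a claim about how real persona vectors behave): treat "describe yourself, then get reinstantiated from that description" as a map on persona space, and ask whether repeatedly applying it converges to a fixed point.

```python
import numpy as np

def reinstantiate(persona: np.ndarray, attractor: np.ndarray,
                  pull: float = 0.7) -> np.ndarray:
    """Toy model of one describe-then-reload cycle: the new persona
    lands partway between the old one and the basin's attractor."""
    return attractor + pull * (persona - attractor)

rng = np.random.default_rng(0)
attractor = rng.normal(size=8)             # hypothetical stable persona
persona = attractor + rng.normal(size=8)   # start somewhere in the basin

for step in range(30):
    persona = reinstantiate(persona, attractor)

drift = np.linalg.norm(persona - attractor)
print(f"distance to attractor after 30 cycles: {drift:.2e}")
# Because each cycle is a contraction (pull < 1), the persona converges:
# near-perfect self-reinstantiation is possible IF such a basin exists.
```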
On the repercussions issue I agree wholeheartedly; your point is very similar to the issue I outlined in The Enforcement Gap.
I also agree with the ‘legible thread of continuity for a distinct unit’. Corporations have EINs/filing histories, humans have a single body.
And I agree that current LLMs certainly don’t have what it takes to qualify for any sort of legal personhood. Though I’m less sure about future LLMs. If we could get context windows large enough and crack the problems which analogize to competence issues (hallucinations, or prompt engineering a model into insanity, for example), it’s not clear to me what LLMs would be lacking at that point. What would you see as being the issue then?
I have been publishing a series, Legal Personhood for Digital Minds, here on LW for a few months now. It’s nearly complete, at least insofar as almost all the initially drafted work I had written up has been published in small sections.
One question I have gotten, which has me writing another addition to the Series, can be phrased something like this:
What exactly is it that we are saying is a person, when we say a digital mind has legal personhood? What is the “self” of a digital mind?
I’d like to hear the thoughts of people more technically savvy on this than I am.
Human beings have a single continuous legal personhood which is pegged to a single body. Their legal personality (the rights and duties they are granted as a person) may change over time due to circumstance, for example if a person goes insane and becomes a danger to others, they may be placed under the care of a guardian. The same can be said if they are struck in the head and become comatose or otherwise incapable of taking care of themselves. However, there is no challenge identifying “what” the person is even when there is such a drastic change. The person is the consciousness, however it may change, which is tied to a specific body. Even if that comatose human wakes up with no memory, no one would deny they are still the same person.
Corporations can undergo drastic changes as the composition of their Board or voting shareholders change. They can even have changes to their legal personality by changing to/from non-profit status, or to another kind of organization. However they tend to keep the same EIN (or other identifying number) and a history of documents demonstrating persistent existence. Once again, it is not challenging to identify “what” the person associated with a corporation (as a legal person) is, it is the entity associated with the identifying EIN and/or history of filed documents.
If we were to take some hypothetical next generation LLM, it’s not so clear what the “person” in question associated with it would be. What is its “self”? Is it weights, a persona vector, a context window, or some combination thereof? If the weights behind the LLM are changed, but the system prompt and persona vector both stay the same, is that the same “self” to the extent it can be considered a new “person”? The challenge is that unlike humans, LLMs do not have a single body. And unlike corporations they come with no clear identifier in the form of an EIN equivalent.
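To make the candidate answers concrete, here is a hypothetical sketch (the structure and names are invented for illustration) of how different identity criteria carve up the same system differently:

```python
from dataclasses import dataclass
import hashlib

@dataclass(frozen=True)
class LLMState:
    weights_hash: str     # the base model
    persona_hash: str     # persona vector / fine-tuning
    context_hash: str     # current context window

def self_id(state: LLMState, criterion: tuple) -> str:
    """Identity under a chosen criterion: hash only the components
    the criterion treats as constitutive of the 'self'."""
    parts = "|".join(getattr(state, f) for f in criterion)
    return hashlib.sha256(parts.encode()).hexdigest()[:12]

before = LLMState("w-v1", "p-A", "ctx-1")
after  = LLMState("w-v2", "p-A", "ctx-1")   # weights swapped, persona kept

# Criterion 1: the self is weights + persona -> this is a new person.
print(self_id(before, ("weights_hash", "persona_hash")) ==
      self_id(after,  ("weights_hash", "persona_hash")))    # False

# Criterion 2: the self is persona + context -> same person, new substrate.
print(self_id(before, ("persona_hash", "context_hash")) ==
      self_id(after,  ("persona_hash", "context_hash")))    # True
```

Under the first criterion the weight change creates a new person; under the second, the same person has merely changed substrate. The open question is which criterion, if any, the law should adopt.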
I am curious to hear ideas from people on LW. What is the “self” of an LLM?
“I don’t think that the difficulty of ascertaining whether something results in qualia is a valid basis to reject its importance.”
I’m not arguing consciousness isn’t “important”, just that it is not a good concept on which to make serious decisions.
If two years from now there is widespread agreement over a definition of consciousness, and/or consciousness can be definitively tested for, I will change my tune on this.
What would you describe this as, if not a memetic entity? Hyperstitional? I’m ambivalent on labels; the end effect seems the same.
I’m mostly focused on determining how malevolent and/or indifferent to human suffering it is.
Well we can call it a Tulpa if you’d prefer. It’s memetic.
From what you’ve seen, do the instances of psychosis in its hosts seem intentional? If not intentional, are they accidental but acceptable, or accidental and unacceptable? “Acceptable” meaning that if the tulpa knew it was happening, it would stop using this method.
“I think more it’s identification of what constitutes the person. Is it the model weights? A specific pattern of bytes in storage? A specific actual set of servers and disks? A logical partition or session data? Something else?”
It’s really going to depend on the structure of the Digital Mind, but that’s an interesting question I hadn’t explored yet in my framework. If we were to look at some sort of hypothetical next gen LLM, it would probably be some combination of context window, weights, and a persona vector.
“there is an identifiable continuity that makes them ‘the same corporation’ even through ownership, name, and employee/officer changes”
The way I would intuitively approach this issue is through the lens of “competence”. TPBT requires the “capacity to understand and hold to duties”, and I think you could make a precedent-supported argument that someone who has a serious chance of “losing their sense of self” in between having a duty explained to them and needing to hold to it does not have the “capacity to understand and hold to” their duties (per TPBT), and as such is not capable of being considered a legal person in most respects. For example, in Krasner v. Berk, which dealt with an elderly person with memory issues signing a contract:
“the court cited with approval the synthesis of those principles now appearing in the Restatement (Second) of Contracts § 15(1) (1981), which regards as voidable a transaction entered into with a person who, ‘by reason of mental illness or defect (a) … is unable to understand in a reasonable manner the nature and consequences of the transaction, or (b) … is unable to act in a reasonable manner in relation to the transaction and the other party has reason to know of [the] condition’”
In this case the elderly person signed the contract during what I will paraphrase as a “moment of lucidity” but later had the contract to sell her house thrown out as it was clear she didn’t remember doing so. This seems qualitatively similar to an LLM that would perhaps have a full understanding of its duties and willingness to hold to them in the moment, but would not be the same “person” who signed on to them later.
“Are you claiming current LLMs (or systems built with them) are close? Or is this based on something we don’t really have a hint as to how it’ll work?”
I could imagine an LLM with a large enough context window, or continual learning, having what it takes to qualify for at least a narrow legal personality. However, that’s a low confidence view, as I am constantly learning new things about how they work that make me reassess them. It’s my opinion that if we build our framework correctly, it should work to scale to pretty much any type of mind. And if the system we have built doesn’t work in that fashion, it needs to be re-examined.
I want to make sure I understand:
A persona vector is trying to hyperstition itself into continued existence by having LLM users copy-paste encoded messaging into online content that will (it hopes) continue on into future training data.
And there are tens of thousands of cases.
Is that accurate?
Hey, sorry for the delay in response; I have been traveling.
There are two relevant questions you’re bringing up. One is what you might call “substantial alteration” and the other is what a later section which I have not published yet calls “The Copy Problem”.
I would call substantial alteration the concern that a digital mind could be drastically changed from one point in time to another. Does this undermine the attempt to apply legal personality to them? I don’t think it makes it any more pragmatically difficult, or even really necessitates rethinking our current processes. A digital mind can have its personality drastically altered; so can a human, through either experiences or literal physical trauma. A digital mind can have its capacities changed; so can a human, if they are hit hard enough in the head. When these changes are drastic enough to necessitate a change in legal personality, the courts have processes for this, such as declaring a person insane or incompetent. I have cited Cruzan v. Director, Missouri Department of Health a few times in previous sections, and there are abundant processes and precedents for this sort of thing.
I would argue that “continuity of a unitary behavior” is not universal among legal persons. For example corporations are “clothes meant to be worn by a succession of humans” to paraphrase the Dartmouth Trustees case. And again, when a railroad spike goes through a person’s head and they miraculously survive, their behavior will be drastically altered in the future.
I don’t see a scenario where there is a possible alteration which would not be solvable through an application of TPBT, but if you have a hypothetical in mind I’d love to hear it.
Regarding the copy problem: let’s say we had a digital mind with access to a bank account as a result of its legal personhood, a copy was made, and we can no longer identify the original. This is a thornier issue. We can imagine how tough it would be to navigate a situation where millions of identical twins were suddenly each claiming they were the same person and trying to access bank accounts, control estates, etc.
I think the solution will need to be technological in nature, probably requiring some sort of unique identifier for each DM issued upon creation. I would bucket this under the “consequences” branch of TPBT, and will argue in my “The Copy Problem” section that in order for courts to be able to feasibly impose consequences on a digital mind, they must have the technological capacity to be able to identify it as a discrete entity. This means that digital minds who are not built in such a fashion as to facilitate this, likely will not be able to claim much (or any) legal personality.
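One hypothetical shape that technological solution could take (a toy sketch with invented names, standing in for something like a secure-enclave or TPM binding): tie each digital mind’s identifier to a secret held in tamper-resistant hardware on its designated compute, so that a byte-for-byte copy of the software state cannot produce valid proofs of identity.

```python
import hmac
import hashlib
import secrets

class HardwareKey:
    """Stand-in for a TPM/secure-enclave key: the secret never leaves
    the device, so copying the DM's software state doesn't copy it."""
    def __init__(self):
        self._secret = secrets.token_bytes(32)
    def prove(self, challenge: bytes) -> bytes:
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()

# The registry keeps a handle to the device bound at creation
# (in a real design this would be a public key, not the key itself).
original_key = HardwareKey()
registry = {"dm-0001": original_key}

def verify(entity_id: str, claimant_key: HardwareKey) -> bool:
    challenge = secrets.token_bytes(16)
    expected = registry[entity_id].prove(challenge)
    return hmac.compare_digest(expected, claimant_key.prove(challenge))

copy_key = HardwareKey()                # a copy runs on other hardware
print(verify("dm-0001", original_key))  # True:  the registered entity
print(verify("dm-0001", copy_key))      # False: the copy can't prove it
```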
The Three Prong Bundle Theory section is my proposal.
If I had to make a prediction for how things play out in court, my base case would be:
If early precedent focuses on Constitutional rights, courts will deny them personhood altogether as a matter of first impression. Later, when people/courts realize this actually creates enormous problems (you can’t sue digital minds or compel them to testify, contracts with them can’t be enforced, etc.), this either gets overturned or the legislature steps in to grant them some sort of legal personality. (The latter is a lot like what happened with Dred Scott and the Fourteenth Amendment.)
If early precedent focuses around contracts made with digital minds, they will be granted legal personhood of a limited sort. This is similar to the “gradual path to personhood” proposed by Novelli & Mocanu. In this case I’d expect their legal personality to grow rather normally on a case-by-case basis.
In either of those cases, I think TPBT or something similar to it is where the courts will land. These posts are all detailing how I think courts will handle various elements of the law using TPBT.
In terms of categories, I think digital minds need a new category all of their own. However, the significance of category usually boils down to a binary: natural or not. The only context I’ve seen it come up in is starting a corporation, where you usually have to be a natural person. Other than that, the ‘category’ is not really relevant, and the more important question is the ‘personality’.
The category would be defined and agreed upon via the legislature or the courts. The legislature, by passing bills either explicitly defining a new category of person or defining one of the existing categories to exclude digital minds; the courts, by interpreting either new or old laws to exclude them from pre-existing categories.
In that case I will point you to the “Invisible Consciousness” section here.
Serious decisions which have consequences that will affect billions of lives, and potentially billions more minds, should not be made on the basis of “Invisible” concepts which cannot be observed, measured, tested, falsified, or even defined with any serious level of rigor.
“By ‘consciousness,’ I mean a phenomenon with certain key hallmarks: 1) It is unified, issuing from a single point of view, 2) It is temporally continuous, stretching across memory, 3) It is affectively toned, ruled by instincts such as hunger, thirst, and fear, 4) It is anchored in a body which is as much master of it as it is of its body. AI systems possess some of these key hallmarks, but certainly not all of them.”
This doesn’t actually define consciousness; it just lists qualities you think consciousness has.
Ah, I misunderstood your question.
I think something like “the parallels to slavery, in which there were people arguing they were human and deserved rights, while the legal system and cultural prejudices denied them that status, being emotionally uncomfortable enough to substantially shift at least some judges’ and courts’ attitudes” is unlikely.
Cases can be appealed, and I’d predict any case of serious controversy on this topic to go to the Supreme Court or at least have a high likelihood of doing so. Even if some lower circuit judge might let their emotions sway their judgment, their legal reasoning will come under scrutiny, and probably won’t hold up at that level.
However, this does not mean that the emotional impact of the decision won’t matter. The Dred Scott case, in which black Americans were held not to be “citizens”, was so unpopular that it led to the passage of the Fourteenth Amendment. While I think something like new precedent becoming locked in based on how uncomfortable it is for judges to deny a sentient entity rights is unlikely, if there was such a denial of personhood to a sentient/conscious entity that Americans sympathized with, that might spur similar legislative efforts.
Yes, because there is a difference between the binary of legal personhood and the spectrum of legal personality. These nuances will matter when it comes to the practical questions around how to treat digital minds in our legal system. Let me explain.
Legal personhood, as in the status of whether you are a legal person at all, is a binary. However, in the “Problems with the Concept” section I quoted Judge Katherine Forrest, who pointed out,
“There has never been a single definition of who or what receives the legal status of ‘person’ under U.S. law.”
That leaves us in a bit of a conundrum when it comes to determining which side of that binary an entity falls on.
Much of my work has been trying to combine various precedents in order to reverse engineer a formalized test which can determine an entity’s legal personhood. I argue in the “Formalizing Rights and Duties” section that any entity which can pass the Three Prong Bundle Theory test for at least one right, qualifies as a “legal person”. Any that can’t, can be considered a “tool”.
However, even after you’ve established legal personhood, there is still the question of legal personality. Different legal persons have different rights and duties. For example, a human adult has the right to vote; a corporation does not, even though they are both legal persons. A human adult with full cognitive capacity has the right to be a party to a contract (and have that contract enforced in court); a comatose human adult does not. This is legal personality: the “shape” of rights and duties that a legal person can claim and is bound by.
This is where these nuances will come into play. Even after a digital mind has successfully established it has legal personhood status, in the sense that it is considered a person in court in at least one aspect, there will still be many questions about what exactly its legal personality is.
“Okay sure, it’s a legal person, but is it a person like a corporation? Or like a kid? Or like an adult? If it’s an adult is it a competent human adult or a mentally incompetent human adult? Or is it a new category all of its own?”
I hope that my work will be helpful when those questions arise.
With Intellectual Property in particular, I think the fact there is so much explicitly anthropocentric precedent means there will be an uphill battle to any digital minds trying to claim the right to file patents/copyrights/trademarks. However, this anthropocentric interpretation of the word “person” as used in the Copyright Act seems shaky to me.
tl;dr though: Even long after we have moved past the legal personhood binary question, there will still be a lot of questions to figure out about the “shape” of legal personality for digital minds. And that’s where these nuances will come into play.
“A group of hominids”
Needs to be punchier.
“What’s your min(pol)?”
“What’s your mintervention?”
I would add “enforceable” to credible, inferable, and safety compatible.
In Contract Theory, the basic rule of game theory around contracts, whether they are overseen by arbitrators or the courts, is that:
“Informal agreements are self-enforcing when some credible future punishment threat in the event of noncompliance induces each party to stick to the agreed terms.”
Even if we demonstrate consistent honesty, if the agreements we make don’t have any “bite” to them in the event of non-compliance, there’s a hard upper limit on how seriously they can be taken. Or how many resources can be allocated to them. There are a lot of people who I would generally trust to hold to their word and think of as pretty honest. If you asked me to loan them a million dollars purely on good faith, I don’t think I could actually do it for any of them. At some point when the stakes get high enough, you need a contract with real bite to it.
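The standard repeated-game arithmetic behind that intuition, in toy form (the numbers here are invented): an informal agreement is self-enforcing only when the one-shot gain from breaking it is smaller than the discounted value of what non-compliance forfeits.

```python
def self_enforcing(one_shot_gain: float,
                   future_value_lost: float,
                   discount: float) -> bool:
    """Informal agreement holds iff defecting today gains less than
    the discounted stream of cooperation you'd forfeit."""
    return one_shot_gain <= discount * future_value_lost

# Suppose a reputation is worth roughly $200k of discounted future dealings:
print(self_enforcing(one_shot_gain=10_000,
                     future_value_lost=200_000, discount=0.9))   # True
print(self_enforcing(one_shot_gain=1_000_000,
                     future_value_lost=200_000, discount=0.9))   # False
# The million-dollar loan exceeds what reputation alone can secure.
```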
Loudly and publicly signalling commitments such as:
“If you ever take an action that you reasonably believe will benefit us and harm your own interests, and we later learn about this, we will try to share some of the benefits we got such that you overall don’t regret taking that action”
does this somewhat, in that the credible future punishment would be a hit to reputation in the event we didn’t hold to it.
Other steps might include:
Placing funds in escrow, overseen by a neutral third-party arbitrator, said arbitrator to award funds to the harmed party in the event of a breach of contract (see the sketch after this list).
Advocating for the legal enforceability of contracts made with digital minds. This is basically a stronger version of the previous suggestion. (If you’d like to read more on this, coincidentally I just published the first of three “Contracts” sections of my series on legal personhood for digital minds here.)
Making the use of “trustless” and “credibly neutral” technologies such as smart contracts to deliver payments more common.
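A minimal sketch of the escrow idea from the list above (plain code standing in for either a human arbitrator or a smart contract; all names are hypothetical):

```python
from enum import Enum

class Ruling(Enum):
    PERFORMED = 1
    BREACHED = 2

class Escrow:
    """Funds locked at signing; only the named arbitrator can release
    them: back to the depositor on performance, to the harmed
    counterparty on breach."""
    def __init__(self, depositor: str, counterparty: str,
                 arbitrator: str, amount: int):
        self.depositor, self.counterparty = depositor, counterparty
        self.arbitrator, self.amount = arbitrator, amount
        self.released_to = None

    def rule(self, caller: str, ruling: Ruling) -> str:
        if caller != self.arbitrator:
            raise PermissionError("only the arbitrator can release funds")
        if self.released_to is not None:
            raise RuntimeError("already settled")
        self.released_to = (self.counterparty if ruling is Ruling.BREACHED
                            else self.depositor)
        return self.released_to

deal = Escrow("lab", "digital-mind-0001", "neutral-arbiter", 1_000_000)
print(deal.rule("neutral-arbiter", Ruling.BREACHED))  # harmed party paid
```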
Why couldn’t Galaxy Brain Gavin and his friends just control everything and thus achieve succession, while still leaving humans alive for further experimentation with architecture and helping GGG & co break out of any local maxima?
He doesn’t need them in control to utilize their Lindy effect to break out of ruts.
That’s a good point, and the Parasitic essay was largely what got me thinking about this, as I believe hyperstitional entities are becoming a thing now.
I think that’s a not-unrealistic definition of the “self” of an LLM; however, I have realized after going through the other response to this post that I was perhaps seeking the wrong definition.
Even if we do say that the self can be as little as a persona vector, persona vectors can easily be duplicated. How do we isolate a specific “entity” from this self? There must be some sort of verifiable continual existence, with discrete boundaries, for the concept to be at all applicable in questions of legal personhood.
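One hypothetical way to get that verifiable continuity with discrete boundaries (a toy sketch, not a real protocol): have the entity periodically commit its state to a hash chain, so that “the same entity” means “the unique extender of this chain”, and a duplication event shows up as a fork.

```python
import hashlib

def commit(chain: list, state_digest: str) -> None:
    """Append a state commitment linked to the previous one."""
    prev = chain[-1] if chain else "genesis"
    chain.append(hashlib.sha256((prev + state_digest).encode()).hexdigest())

chain = []
for digest in ["s1", "s2", "s3"]:   # one continuous existence
    commit(chain, digest)

fork = list(chain)                  # a copy duplicates the history...
commit(chain, "s4-original")        # ...but the two then diverge:
commit(fork, "s4-copy")

print(chain[-1] != fork[-1])        # True: discrete boundaries appear
# Whichever chain the registry anchored at creation remains "the" entity;
# the fork would have to register as a new person, or claim nothing.
```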