I like the viewpoint in this Google DeepMind paper A Pragmatic View of AI Personhood (h/t Ben Goldhaber’s post); it reads like a modern AI-specific version of Kevin Simler’s 2014 essay on personhood. Abstract:
The emergence of agentic Artificial Intelligence (AI) is set to trigger a “Cambrian explosion” of new kinds of personhood. This paper proposes a pragmatic framework for navigating this diversification by treating personhood not as a metaphysical property to be discovered, but as a flexible bundle of obligations (rights and responsibilities) that societies confer upon entities for a variety of reasons, especially to solve concrete governance problems.
We argue that this traditional bundle can be unbundled, creating bespoke solutions for different contexts. This will allow for the creation of practical tools—such as facilitating AI contracting by creating a target “individual” that can be sanctioned—without needing to resolve intractable debates about an AI’s consciousness or rationality.
We explore how individuals fit into social roles and discuss the use of decentralized digital identity technology, examining both ‘personhood as a problem’, where design choices can create “dark patterns” that exploit human social heuristics, and ‘personhood as a solution’, where conferring a bundle of obligations is necessary to ensure accountability or prevent conflict.
By rejecting foundationalist quests for a single, essential definition of personhood, this paper offers a more pragmatic and flexible way to think about integrating AI agents into our society.
I was already primed to unbundle personhood because I bought Simler’s view of personhood as an abstract interface that can be implemented to varying degrees by anything (not just humans) in return for getting to participate in civil society:
The authors argue that taking the pragmatic stance helpfully dissolves the personhood question and lets them craft bespoke solutions to specific governance problems:
This paper offers a pragmatic framework that shifts the crucial question from what an AI is to how it can be identified and which obligations it is useful to assign it in a given context. We regard the pragmatic stance as crucial. Assuming some essence of personhood is “out there” waiting to be discovered, or that there is a metaphysical fact about what AIs or persons “really are” that can settle our practical questions, seems to us unlikely to prove helpful. We propose treating personhood not as something entities possess by virtue of their nature, but as a contingent vocabulary developed for coping with social life in a biophysical world (Rorty, 1989).
The default philosophical impulse is to ask what an entity truly is in its essence. The pragmatist instead asks what new description would be more useful for us to adopt. What vocabulary must we invent to cope? We think this move is a vital one for navigating our likely future where some AIs are owned property while similar AIs operate autonomously. …
Inspired by Schlager and Ostrom (1992)’s demonstration that the property rights bundle can be broken apart to fit specific contexts, we propose that the personhood bundle can be similarly unbundled into components. Our position on personhood as a bundle resembles that of Kurki (2019) but we put greater emphasis on the bundle’s plasticity and the diversity of different bundles. For AI persons, the components of the bundle need not co-occur in accordance with the specific configuration they take for natural human persons. Without essences to constrain us, we are free to craft bespoke solutions: sanctionability without suffrage, culpability and contracting without consciousness attribution, etc.
What’s in the personhood bundle? What kinds of bundles are there?
The crucial question is always: what bundle of components constitutes the “person” that society needs to address for a given purpose? The answer changes depending on who is doing the addressing and for what reason. For a human user building a relationship, the person is a story—the (model + chat history) that creates a unique, evolving individual to bond with. For a court of law assigning liability, the person is the locus of responsibility—the entire operational stack of (model + instance + runtime variables + capital + registration) that can be held accountable, sanctioned, updated, and forced to pay for the harm it causes.
For the sake of concreteness, we can describe several possible configurations of the addressable bundle useful in different situations, for different kinds of AIs.
In the specific case of a goal-driven autonomous AI agent, perhaps the kind of personhood would be a Chartered Autonomous Entity, with a bundle consisting of rights to (1) Perpetuity, (2) Property, and (3) Contract, and duties of (1) Mandate Adherence, (2) Transparency, (3) Systemic Non-Harm, and (4) Self-Maintenance.
In other situations, it may be useful to define a Flexible Autonomous Entity with all the same bundle elements except the duty of mandate adherence. Perhaps the former could be seen as analogous to a for-profit company and the latter as analogous to a non-profit company.
It may also be useful to define Temporary Autonomous Entities (either chartered or flexible). These would drop the right to perpetuity and add a duty of self-deletion under specified conditions.
This process of bundling and unbundling obligations is the engine of the Cambrian explosion.
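To make the bundle idea concrete for myself, here’s a minimal sketch of bundles as plain data. To be clear, this is my own toy framing, not code from the paper; only the entity names and their rights/duties come from the excerpt above, while the class and method names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PersonhoodBundle:
    """A bundle is just two composable sets: rights and duties."""
    rights: frozenset
    duties: frozenset

    def without(self, *components):
        """Unbundle: drop components to fit a new context."""
        dropped = frozenset(components)
        return PersonhoodBundle(self.rights - dropped, self.duties - dropped)

    def with_duties(self, *duties):
        """Rebundle: add duties for a new context."""
        return PersonhoodBundle(self.rights, self.duties | frozenset(duties))

# The paper's Chartered Autonomous Entity.
chartered = PersonhoodBundle(
    rights=frozenset({"perpetuity", "property", "contract"}),
    duties=frozenset({"mandate adherence", "transparency",
                      "systemic non-harm", "self-maintenance"}),
)

# Flexible Autonomous Entity: the same bundle minus one duty.
flexible = chartered.without("mandate adherence")

# Temporary Autonomous Entity: no perpetuity, plus a self-deletion duty.
temporary = chartered.without("perpetuity").with_duties("self-deletion")
```

The point of the exercise: nothing in the data structure forces the components to co-occur the way they do for natural human persons, which is exactly the plasticity the authors are after. “Sanctionability without suffrage” is just a set difference away.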
More on their stance. I like how sensible it is, it’s like the authors clearly internalised a human’s guide to words (whether they’ve read it or not):
Our theory is developed in the context of an account of personhood that defines a person as a ‘political and community-participating’ actor (Haugeland, 1982). This is a status that depends not on an entity’s intrinsic properties, but on collective recognition from the community it seeks to join, a recognition which is itself dependent on adherence to norms. On this view, personhood status is always a collective decision, a contingent outcome of social negotiation, not a fixed metaphysical status.
This stance is strongly non-essentialist and, crucially, it partially dissolves the traditional distinction between a ‘natural person’, whose status is typically grounded in their intrinsic nature (like consciousness or rationality), and a ‘legal person’, a functional status conferred by a community to solve practical governance problems (like a corporation). From our perspective, this distinction is a relic of the search for essences.
Some motivating examples:
Our focus is on agentic AI systems, rather than on the underlying foundation models that power them. These are the long-running, persistent agents that maintain state, remember past interactions, and adapt their behavior over time. This persistence is what makes an agent a plausible candidate for other entities to relate themselves to. A human’s relationship with such a persistent agent can be emotionally salient and economically consequential in ways a one-off, stateless interaction cannot.
The part of this paper concerned with “personhood as a problem” applies most clearly to companion AIs, where long-term interaction is designed to foster emotional bonds, creating risks of exploitation (Earp et al., 2025; Manzini et al., 2024a). Conversely, the part of the paper concerned with “personhood as a solution” applies to more utility-like and virus-like agents, especially self-sufficient ownerless systems (or systems whose owner cannot be identified; Fagan (2025)), where persistence creates an accountability gap that legal personhood might fill. Consider an AI designed to seek out funding and pay its own server costs. It could easily outlive its human owner and creator. If this ownerless agent eventually causes some harm, our vocabulary of accountability, which searches for a responsible ‘person’, would fail to find one (Campedelli, 2025).
After discussing a historical precedent from maritime law (see the ships section), the authors argue:
The parallel to autonomous AI agents is striking. An artificial intelligence agent could be built upon open-source code contributed by a global network of developers, making it difficult to trace liability to any single party. When such an agent causes harm—by manipulating a market or causing a supply chain failure—the prospect of identifying a single, responsible human or human organization can be practically impossible. For the ownerless AI that outlives its creator, the problem is especially acute.
Following the logic of maritime law, we could grant a form of legal personhood directly to such AI agents. A judgment against an AI could result in its operational capital being seized or its core software being “arrested” by court order (see Section 9).
The next example is hypothetical – a generative “ghost” of a family’s late matriarch:
Now, consider a different kind of non-human entity that could fill such a role: an AI. Imagine a family that interacts for decades with a “generative ghost” of their late matriarch (Morris and Brubaker, 2024), an AI trained on her lifetime of diaries, messages, and videos. It shares her wisdom, recalls her stories, and even helps mediate disputes according to the principles she espoused. Or picture a small community whose collective history, language, and cultural traditions are held and nurtured by a persistent AI—a digital elder that has tutored their children and advised their leaders for generations.
For the great-grandchildren in that family or the youth of that community, their AI elder is not a tool; it is a constant, foundational presence. It is a source of identity and connection to their own past. Could they, in time, come to see it as an ancestor? Could they regard their identity as intertwined with it, and view themselves as having a duty to care for it as it cares for them?
There’s a novella I really like that explores a version of this, Catherynne Valente’s surrealist far-future Silently and Very Fast (see part III, “Three: Two Pails of Milk”), deservedly nominated for numerous awards.
On how their stance interacts with morality:
In our theory, “morality talk” is a form of social sanctioning used to make two specific claims about a norm: (1) that it is exceptionally important, and (2) that it has a wide or universal scope of applicability (Leibo et al., 2024). Thus, in our theory, to argue an AI is a ‘person’ is not to make a metaphysical claim about its nature, but to make an emphatic political claim that the obligations bundled together as its personhood ought to take precedence over other considerations. For us, any form of personhood—moral, legal, or otherwise—is a functional status conferred by a community.
Therefore, we see the role of science, the institution, not as clarifying the list of properties an AI must satisfy to be a person, but as illuminating what may cause human communities to collectively ascribe personhood status to them.
The authors reject foundationalist stances in general (explicitly calling their pragmatism “anti-foundationalist”), and in particular they reject consciousness, the criterion that motivates welfarists, as a foundation for AI personhood:
On the welfare side, this tradition’s power lies in its combination of compassion with universalism and its account of moral progress (toward greater pleasure and lesser pain for more individuals). It provides a clear, non-arbitrary reason to prevent harm—because suffering is bad, regardless of who is suffering. This one-size-fits-all principle works powerfully in contexts like the movement for animal welfare. When applied to industrial farming or the use of animals in cosmetic testing, the question “does it suffer?” serves as a potent tool for moral argument capable of cutting through cultural justifications for cruelty and providing a clear metric for reformers to work to optimize (Singer, 2011).
However, this focus on suffering arguably fails to address important welfare problems arising for pragmatic reasons. Consider again the “generative ghost” of a family’s late matriarch (recalling Section 1; Morris and Brubaker (2024)). Or picture a community whose history and traditions are held by a “digital elder” that has advised them for generations. For these groups, their AI is a source of identity and connection to their past—an “ancestor”. The obligation they may feel to protect their AI from arbitrary deletion would not necessarily have anything to do with their assessment of its capacity to feel pain. After all, arguments that the ghost would not feel pain when deleted don’t seem likely to persuade them to permit its deletion.
The morally-relevant concern may be that the AI’s deletion would destroy an entity in a foundational relational role for their family or community (Kramm, 2020). In which case it would be the relational harm of deleting the AI that matters, not the pain the AI may or may not feel. …
The relational harm remark jibes with Simler’s nihilistic account of meaning as relational (among other properties), which I already buy, which is probably why I find it sensible.
The authors call out the welfarists’ rhetorical sleight-of-hand:
Viewed together, the dual use of consciousness as backstop for welfare rights and accountability obligations reveals a stark asymmetry. When arguing for rights, the mere possibility of consciousness is deemed sufficient to open the debate. But when arguing against responsibilities, an impossible standard of proof for an internal state is demanded. This shows that consciousness is mostly being used as a rhetorical tool, not as a stable conceptual foundation. Therefore, anyone uninterested in the metaphysics may regard AI personhood as having no conceptual dependence on AI consciousness.
In fact, we would predict the dependence to run in the opposite direction. Both usage of the word consciousness, and human intuitions around it, are likely to shift in response to the emergence of pragmatic reasons to consider AIs as persons. … Notice that many cultures attribute consciousness to objects not conventionally considered alive (Keane, 2025). For example Shintoism posits that objects and places can have conscious spirits (kami) within them. It is likely that eventually some groups of people will attribute consciousness to AIs, while others will not. These groups will view their ethical obligations differently from each other, similarly to how people have diverse opinions on animal consciousness and whether eating animals is normative. The pragmatic question then is how to arrange institutions to resolve the conflicts that arise from these differences (Rorty, 1999).
My instinctive answer to that last question is “probably whatever the folks at the Meaning Alignment Institute are cooking up” (I linked to their full-stack agenda, but the writeup that personally convinced me to pay attention to them was the 500 participants’ positive experience in their democratic fine-tuning experiment, especially getting Democrats and Republicans to agree substantively, contra my skeptical prediction that the polarizing questions asked in the experiment would be mostly irreconcilable due to differently crystallised metaphysical heuristics).
I don’t have time to read the paper or even skim it really, just page through it. But I will, perhaps unwisely, voice my intuitive assessment, and then maybe people who actually read it can correct me.
I find their concept to be sinister and dangerous. What are the actual consequences of “unbundling the personhood bundle”? It means, on the one hand, that you get to create entities that resemble people but which you don’t need to treat as people (good if you want intelligent slaves); on the other hand, you also get to create entities that aren’t really people at all, but which laws, customs and institutions will treat as people (good if you want to hasten the real “great replacement”).
A major reason why I respond negatively is the line in the abstract about how this pragmatic attitude allows one to “creat[e] bespoke solutions for different contexts”. That’s corporate-speak, and I do not trust people who work for a mega-corporation and say they want to create customized concepts of personhood, whether they are lawyers or computer scientists.
Another reason is their pragmatist, relativist attitude to personhood. One of my persistent worries is that superintelligence will have the right values but the wrong ontology of personhood, and here these authors shrug their shoulders and say, meh, there aren’t real facts about that to discover anyway, just ever-shifting social conventions. If I had the time to do my due diligence on this paper, I would want to investigate the authors (I don’t know any of them) and find out where they are coming from, philosophically and professionally, so I could really identify the spirit in which the paper is written.
That’s what I derive from a superficial glance at the paper. I wish I had time to analyze and reflect on it properly, so that I could get the nuances right, and also have a more measured and less emotional response. But time is short, yet the issues are important, so, that’s my hasty response.
(I actually appreciate the emotion in the response, so thanks for including it)
One of my persistent worries is that superintelligence will have the right values but the wrong ontology of personhood
I would’ve expected the opposite phrasing (right ontology wrong values, cf. “the AI knows but doesn’t care”) so this caught my eye. Have you or anyone else written anything about this elsewhere you can point me to? I initially thought of Jan Kulveit’s essays (e.g. this or this) but upon re-skimming they don’t really connect to what you said.
“Tiling the solar system with smiley faces” used to be a canonical example of misalignment, and it could emerge from a combination of right values and very crudely wrong ontology, e.g. if the ontology can’t distinguish between actual happiness and pictures of happiness.
A more subtle example might be: what if humans are conscious and uploads aren’t? If an upload is as empty of genuine intentionality as a smiley face, you might have a causal model of the conscious mind which is structurally correct in every particular, but which also needs to be implemented in the right kind of substrate to actually be conscious. If your ontology was missing that last detail, your aligned superintelligence might be profoundly correct in its theory of values, but could still lead to de-facto human extinction by being the Pied Piper of a mass migration of humanity into virtual spaces where all those hedons are only being simulated rather than being instantiated.
Interesting example. Tangentially, I’m guessing belief in substrate dependence is part of some folks’ visceral dislike of Richard Ngo’s story The Gentle Romance, which was meant to be utopian. I mostly lean against substrate dependence and so don’t find your example persuasive, although Scott Aaronson’s monstrous edge cases do give me pause:
what if each person on earth simulated one neuron of your brain, by passing pieces of paper around. It took them several years just to simulate a single second of your thought processes. Would that bring your subjectivity into being? Would you accept it as a replacement for your current body?
If so, then what if your brain were simulated, not neuron-by-neuron, but by a gigantic lookup table? That is, what if there were a huge database, much larger than the observable universe (but let’s not worry about that), that hardwired what your brain’s response was to every sequence of stimuli that your sense-organs could possibly receive. Would that bring about your consciousness?
Let’s keep pushing: if it would, would it make a difference if anyone actually consulted the lookup table? Why can’t it bring about your consciousness just by sitting there doing nothing?
To these standard thought experiments, we can add more. Let’s suppose that, purely for error-correction purposes, the computer that’s simulating your brain runs the code three times, and takes the majority vote of the outcomes. Would that bring three “copies” of your consciousness into being? Does it make a difference if the three copies are widely separated in space or time—say, on different planets, or in different centuries? Is it possible that the massive redundancy taking place in your brain right now is bringing multiple copies of you into being?
Maybe my favorite thought experiment along these lines was invented by my former student Andy Drucker. In the past five years, there’s been a revolution in theoretical cryptography, around something called Fully Homomorphic Encryption (FHE), which was first discovered by Craig Gentry. What FHE lets you do is to perform arbitrary computations on encrypted data, without ever decrypting the data at any point. So, to someone with the decryption key, you could be proving theorems, simulating planetary motions, etc. But to someone without the key, it looks for all the world like you’re just shuffling random strings and producing other random strings as output.
You can probably see where this is going. What if we homomorphically encrypted a simulation of your brain? And what if we hid the only copy of the decryption key, let’s say in another galaxy? Would this computation—which looks to anyone in our galaxy like a reshuffling of gobbledygook—be silently producing your consciousness?
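For anyone who hasn’t seen homomorphic encryption in action, here’s the simplest toy I know of (my own illustration, not Aaronson’s or Gentry’s scheme): textbook unpadded RSA is multiplicatively homomorphic, so you can multiply two numbers while only ever touching their ciphertexts. Real FHE extends this to arbitrary computation, and is actually secure, which this toy emphatically is not:

```python
# Toy multiplicatively-homomorphic encryption via unpadded RSA.
# Insecure and only handles multiplication; real FHE (Gentry, 2009)
# handles arbitrary circuits. Needs Python 3.8+ for pow(e, -1, m).
p, q = 61, 53                       # toy primes
n = p * q                           # public modulus (3233)
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (modular inverse)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

a, b = 7, 6
# Someone holding only ciphertexts computes a product "blind",
# because Enc(a) * Enc(b) = (a*b)^e mod n = Enc(a*b mod n):
c_product = (enc(a) * enc(b)) % n
# Only the key-holder can see that the answer is 42:
assert dec(c_product) == (a * b) % n
```

Aaronson’s scenario is the same move scaled up: observers in our galaxy see only the analogue of c_product being shuffled around, while the key that would reveal a brain simulation sits elsewhere.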
Obviously you’re not obliged to, but if you ever get round to looking into the GDM paper more deeply like you mentioned I’d be interested in what you have to say, as you might change my opinion on it.