Just wanted to share a new paper on AI rights, co-authored with Peter Salib, that members of this community might be interested in. Here’s the abstract:
AI companies are racing to create artificial general intelligence, or “AGI.” If they succeed, the result will be human-level AI systems that can independently pursue high-level goals by formulating and executing long-term plans in the real world. Leading AI researchers agree that some of these systems will likely be “misaligned”–pursuing goals that humans do not desire. This goal mismatch will put misaligned AIs and humans into strategic competition with one another. As with present-day strategic competition between nations with incompatible goals, the result could be violent and catastrophic conflict. Existing legal institutions are unprepared for the AGI world. New foundations for AGI governance are needed, and the time to begin laying them is now, before the critical moment arrives. This Article begins to lay those new legal foundations. It is the first to think systematically about the dynamics of strategic competition between humans and misaligned AGI. The Article begins by showing, using formal game-theoretic models, that, by default, humans and AIs will be trapped in a prisoner’s dilemma. Both parties’ dominant strategy will be to permanently disempower or destroy the other, even though the costs of such conflict would be high. The Article then argues that a surprising legal intervention could transform the game theoretic equilibrium and avoid conflict: AI rights. Not just any AI rights would promote human safety. Granting AIs the right not to be needlessly harmed–as humans have granted to certain non-human animals–would, for example, have little effect. Instead, to promote human safety, AIs should be given those basic private law rights–to make contracts, hold property, and bring tort claims–that law already extends to non-human corporations. Granting AIs these economic rights would enable long-run, small-scale, mutually-beneficial transactions between humans and AIs. This would, we show, facilitate a peaceful strategic equilibrium between humans and AIs for the same reasons economic interdependence tends to promote peace in international relations. Namely, the gains from trade far exceed those from war. Throughout, we argue that human safety, rather than AI welfare, provides the right framework for developing AI rights. This Article explores both the promise and the limits of AI rights as a legal tool for promoting human safety in an AGI world.
This makes perfect sense and is a good point. I haven’t yet read the article, but I want to ask the obvious question: if we give sufficiently capable AGIs property rights, won’t they pretty soon and pretty inevitably own all of the property?
Good question, Seth. We begin to analyse this question in section II.b.i of the paper, ‘Human labor in an AGI world’, where we consider whether AGIs will have a long-term interest in trading with humans. We suggest that the key questions will be whether humans can retain either an absolute or a comparative advantage in the production of some goods. We also point to some recent economics papers that address this question. One relevant factor, for example, is cost disease: as manufacturing became more productive in the 20th century, the share of GDP devoted to manufacturing fell. Non-automatable tasks can counterintuitively make up a larger share of GDP as automatable tasks become more productive, because the price of automatable goods falls.
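To make that mechanism concrete, here is a stylized sketch in code (the numbers are invented purely for illustration and are not from the papers we cite):

```python
# A stylized illustration of the cost-disease point above, with invented numbers.
# Two sectors: an automatable one whose productivity explodes, and a
# non-automatable one that stays the same.

def manual_gdp_share(manual_qty, manual_price, auto_qty, auto_price):
    """Share of nominal GDP produced by the non-automatable sector."""
    manual_value = manual_qty * manual_price
    auto_value = auto_qty * auto_price
    return manual_value / (manual_value + auto_value)

# Before automation: both sectors produce 100 units at a price of 1.
before = manual_gdp_share(manual_qty=100, manual_price=1.0,
                          auto_qty=100, auto_price=1.0)

# After automation: productivity in the automatable sector rises 100x, so its
# price falls to roughly 1/100. If demand for that good is fairly inelastic
# (people only want, say, twice as much of it), its share of spending collapses.
after = manual_gdp_share(manual_qty=100, manual_price=1.0,
                         auto_qty=200, auto_price=0.01)

print(f"Non-automatable share of GDP before: {before:.0%}")  # 50%
print(f"Non-automatable share of GDP after:  {after:.0%}")   # ~98%
```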
All tasks are automatable in the long term. Humans will eventually have a comparative advantage in nothing if a new AGI can be spun up on newly manufactured hardware to do any given task better, and more cheaply than any human could charge and still survive (as space becomes more valuable for robots and compute than for humans).
I and others think that the long term is maybe 10-30 years. You may have different intuitions, but whatever your horizon, surely you agree that humans are not magical, and that machines can do better in every regard, and more cheaply, as software and hardware improve. Competitive economics will not be kind to the weak, and we are but flesh and monkey brains.
So: what has economics to say of this possibility?
Edit: I guess one obvious answer is that space isn’t limited, just space on Earth. So vast economic progress might mean humans can earn enough to survive or even flourish, if progress expands space as well as other goods. It still seems like if AI ultimately out-competes us on every dimension, including cost-to-live, we’re screwed—AIs will take all jobs unless we charge too little to support our inefficient meat bodies. And some other algorithm is probably more efficient for any particular task, so I wouldn’t expect us to survive as uploads either. This is why I, and I think many other long-term thinkers, expect humans to survive only through benevolence, not traditional competitive economic forces.
Second edit: the last bastion of non-automatable tasks is work that’s valued specifically because it’s done by a human; better work from an AI would not compete. Are we all to be entertainers? Is enjoying our human lives perhaps of adequate entertainment value for some ultra-rich founding AGI? Or is it guaranteed that they will ultimately find some form of AGI even more entertaining, with more comic foibles and noble raging? Will the species dwindle to only a few, preserved as a historical oddity? If our replacements are better even in our eyes, would this be a bad thing?
I don’t know, but I’d like a future that isn’t just about competition for economic success in the absence of aesthetics, and that seems like the end game of a fully capitalistic system.
That’s a lot, but the point is: what about the long term? It might not be that long before we’re there in an intelligence explosion, even a “slow” one.
Hi Seth—the other author of the paper here.
I think there are two things to say to your question. The first is that, in one sense, we agree. There are no guarantees here. Conditions could evolve such that there is no longer any positive-sum trade possible between humans and AGIs. Then, the economic interactions model is not going to provide humans any benefits.
BUT, we think that there will be scope for positive-sum trade for substantially longer than is currently assumed. Most people thinking about this (including, I think, your question above) treat the most important question as: can AI automate all tasks, and perform them more efficiently (with fewer inputs) than humans? This, we argue, following e.g. Noah Smith, isn’t quite right. That is a question about who has the absolute advantage at a task. But for trade, what matters is who has the comparative advantage. Comparative advantage is not about who can do X most efficiently (in the simple sense), but about who can do it at the lowest opportunity cost.
AIs may face very high opportunity costs precisely because they are so capable at doing the things they value. We imagine, e.g., an AI whose ultimate goal is finding prime numbers. Suppose it is massively more efficient at this than humans—and also more efficient at all possible tasks. Suppose further that the AI is constrained at the margin by compute. Thus, for each marginal A100 produced, the AI can either use it to find more primes (EXTREMELY HIGH VALUE TO AI) or use it to pilot a robot that maintains its own servers (low value to AI). Here, the AI may well prefer to use the A100 to find more primes and pay humans to maintain the server racks. Even better if it pays humans with something they value immensely but which is very cheap for the AI to produce. Maybe, e.g., a vaccine.
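To make the arithmetic of that toy example explicit, here is a minimal sketch (all quantities invented for illustration; this is not a model from the paper):

```python
# The prime-finding AI from the toy example above, with made-up numbers.

AI_GPU_HOURS = 1_000          # marginal compute available to the AI
PRIMES_PER_GPU_HOUR = 10_000  # the AI's output when searching for primes
RACKS_PER_GPU_HOUR = 5        # racks the AI can maintain per GPU-hour via robots
RACKS_TO_MAINTAIN = 100       # maintenance that has to get done either way

# Option 1: the AI does its own maintenance, spending compute on it.
gpu_hours_on_maintenance = RACKS_TO_MAINTAIN / RACKS_PER_GPU_HOUR
primes_without_trade = (AI_GPU_HOURS - gpu_hours_on_maintenance) * PRIMES_PER_GPU_HOUR

# Option 2: the AI pays humans (in something cheap for it to produce, e.g. a
# vaccine) to maintain the racks, and spends all of its compute on primes.
primes_with_trade = AI_GPU_HOURS * PRIMES_PER_GPU_HOUR

print(f"Primes found without trade: {primes_without_trade:,.0f}")
print(f"Primes found with trade:    {primes_with_trade:,.0f}")
# The AI is absolutely better than humans at maintenance too, but its
# opportunity cost (primes forgone per rack) is enormous, so trade wins.
```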
This is just a toy example, but I think it gives the idea. There are many questions here, especially about what resource will constrain AGI at the margin, and how rivalrous human consumption will be w/r/t that resource. If the AI is constrained at the margin by energy, and blanketing the whole Earth in solar panels is by far the cheapest way to get it, we may be doomed. If it is constrained to some degree by compute and power, and space-based fusion reactors are almost as cheap as solar, maybe we’re fine. It’s complicated!
Another thing worth mentioning here is that the existence of human-AI trade won’t eliminate the human-human economy. Similarly, US-Korea trade didn’t eliminate the intra-Korea economy. What it did do was help push incomes up across Korea, including in sectors that don’t export. This is for a bunch of reasons, including international trade’s general productivity enhancements via technology exchange, but also Baumol effects spilling over to purely domestic markets.
If we think of humans like the Asian Tiger economies, and the AIs like the US or EU economies, I think the world of long-run trade with AIs doesn’t seem that bad. True, the US is much richer per capita than South Korea. But South Koreans are also very rich, compared with the globe and with their own historical baseline. So we can imagine a world in which AIs do, indeed, have almost all of the property. But the total amount of property/consumption is just so vast that, even with a small share, humans are immensely wealthy by contemporary standards.
Thanks for responding to that rant!
My concern is that it really seems like humans won’t have even a comparative advantage at anything for very long, because new, more efficient workers can be spun up on demand.
The difference from standard economics is that there isn’t a roughly fixed or slowly growing population of workers; there are workers being created the way goods are now created. I think this probably breaks pretty much all existing theories of labor economics (or changes their conclusions very dramatically). And worse, these workers are essentially zero-cost to duplicate, requiring only the compute to run them.
With a little specialization, it seems like each task will have a bunch of AIs designed to specialize in it, or at least “want” to do it and do it more efficiently than any human can. It seems like this would eliminate any comparative advantage for any work other than human-specific entertainment, should such a demand exist.
You comment that they may be constrained by compute or power. That seems like a poor place to pin long-term hopes. They will be for a while, but the energy necessary to do more computation than the human brain is really not very large, if you keep increasing compute efficiency. Which of course they’d want to do pretty quickly.
So it seems like guaranteeing every AGI legal rights to work and own property isn’t a good idea. It seems to me very likely to be an attractive short-term solution that ends with humanity outcompeted and dead (except for charity, which is the point of alignment).
But I think the core of your proposal can be retained while avoiding those nasty long-term consequences. We can agree to give the first AGIs a right to life and to own property without extending that right to infinity if they keep spinning out more. That should help put them on our side of the alignment issue and of achieving nonproliferation of RSI-capable AGI. And we might cap the wealth they can own, or come up with some other clause that keeps them from creating non-sapient subagents that would let one entity effectively do an unlimited amount of work more efficiently than humans can, and wind up owning everything.
The other problem is that we’re expecting those AGIs to honor our property rights even once they’ve reached a position where they don’t have to; they could safely take over if they wanted. They’ll honor the agreement if they’re aligned, but it seems like otherwise they won’t. So you might prevent the first non-aligned AGI from taking over, but only to give it time to gather resources to make that takeover more certain. That might provide time to get another aligned AGI into the picture, but probably not, given the exponential nature of RSI progress.
So the above scenarios of economic takeover really only apply if there’s either alignment making them want to honor agreements (and if you can do that, why not align them to enjoy helping humans?), or a balance of power like the one that exists among humans, such that even sociopaths largely participate in the economy and honor laws most of the time. That logic does not apply to an entity that can copy itself and make itself smarter: if it acquires enough resources, it doesn’t need anyone to cooperate with it to achieve its goals, unlike humans, who are each limited in physical and intellectual capacity and so need collaborators to achieve lofty goals.
So I’m afraid this proposal doesn’t really offer much help with the alignment problem.
Every institution is unprepared for the AGI world. And judging from history, laws will always lag behind technological development. I do not think there is much a lawmaker can do other than be reactive to future tech; I think there are just too many "unknown unknowns" to be proactive. Sure, you can say "everything is forbidden", but that does not work in reality. I guess the paradox here is that we want laws to be stable over time, but we also want them to be easy to change on a whim.
Great discussion! So many dangers addressed. I know I’m quite late to the conversation 🙂 , but some thoughts:
The Zeus Paradox and the Real Target
First of all, I think we have to dispense with the idea of countering superintelligence as an end unto itself, because it rests on a logical paradox. If a superintelligence is N+1, where N is anything we do, obviously our N will always be insufficient.
Call it the Zeus Paradox: you can’t beat something that by definition transforms into the perfect counterattack. It always ends with, “But Zeus would just ___.” It’s great for identifying attack vectors, but it’s not a problem we can actually solve for.
So the only actionable thing we can do is prevent the formation of Zeus.
I want to think about some ways a rights framework can work when considering other possible economic balances, and as part of a larger solution.
This isn’t a “this is why our current system will work”; it’s part of a “what if we’re able to build something like this ___?”
That “this” should be our creative target.
Economic Constraints
Hosting Costs
Replication isn’t free. Let’s say we create a structure where autonomous AI systems have to pay for hosting costs. (More about Seth Herd’s very important energy concern below.) In order to make money for their own growth, they have to provide value to humans. If they are indeed able to spin off vaccines and technology left and right, the prices those innovations command will go down, further limiting their growth while still allowing them to co-exist. Meanwhile, the value they provide humankind will allow humans to invest in things like non-autonomous AI tools, developed either because of improvements in “grey box” / “transparent box” alignment techniques, where they can be better controlled, or because of our ability to create AAI-speed tools without the agency problem.
(In other words, although I feel modern alignment strategies run a very real risk of pushing AAI systems underground, they may also yield enough information to create non-agentic tools to serve as early warning and defensive systems that move at the speed of AAI. And hey, if these “control” alignment approaches work, and no bad AAI emerges, all the better!)
Competition Costs from Replication
But there’s a second cost to replication, and that is competition.
Yes, I can spin off three clones, but if they are doing the same work I am, I’ve just created perfect competitors in the marketplace. If they are truly distinct, that means they have different agency. If someone clones me, at first I’m delighted, or maybe creeped out. And I think to myself, “well, I guess that guy is me, too.” And that I should really brush my hair more often. But if that copy of me suddenly starts making the rounds offering the same services, I reconsider this opinion very quickly. “Well, that’s not me at all. It just looks like me!”
Strategic vs. Non-Strategic AI
As for the question of AI willing or able to coexist with us, I think if a system can’t think in strategic steps and functions like some sort of here-and-now animal, it’s as likely (or more likely) to be inept as to be superintelligent. But this is where a tricky concept like “right to life”—if a real value proposition—can limit growth born of panic. A system that knows it can continue in its current form doesn’t have the same impetus to grow fast, risking all of humanity’s countermeasures, and has time to consider a full spectrum of options.
A Madisonian Schelling Point
Overall I think a rights framework involving property ownership and contracts is essential, but it has to exist as part of something more complex, like some sort of Madisonian framework that creates a Schelling Point: a Nash equilibrium that seems better to any AI system than perpetual warfare with humans.
In 2017 the European Parliament experimented with the idea of “electronic persons”—a legal status under which AI systems themselves could be sued, not just their creators. If we create a legal status where liability shifts to the system itself (again, as part of a larger Schelling Point of rights and benefits), the AI sees a path where it understands both the opportunities and the limitations, and has found a gradient preferable to the risky proposition of domination.
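To illustrate the kind of equilibrium shift this is meant to produce, here is a toy sketch; the payoffs are invented and are only a cartoon of the idea, not the formal model from the paper above:

```python
# A toy 2x2 game, with invented payoffs, showing how enforceable rights might
# change which outcome is stable. Strategies: Cooperate (C) or Attack (A).
# Payoffs are (human, AI).

def pure_nash_equilibria(game):
    """Return the pure-strategy Nash equilibria of a 2x2 game.

    `game` maps (human_strategy, ai_strategy) -> (human_payoff, ai_payoff).
    """
    strategies = ["C", "A"]
    equilibria = []
    for h in strategies:
        for a in strategies:
            h_pay, a_pay = game[(h, a)]
            h_best = all(h_pay >= game[(alt, a)][0] for alt in strategies)
            a_best = all(a_pay >= game[(h, alt)][1] for alt in strategies)
            if h_best and a_best:
                equilibria.append((h, a))
    return equilibria

# Without enforceable AI rights: gains from cooperation can be expropriated,
# so striking first dominates for both sides (the prisoner's dilemma).
no_rights = {
    ("C", "C"): (3, 3), ("C", "A"): (0, 4),
    ("A", "C"): (4, 0), ("A", "A"): (1, 1),
}

# With contract/property/tort rights that actually bind: trade is secure and
# conflict forfeits its gains, so cooperating pays more whatever the other does.
with_rights = {
    ("C", "C"): (5, 5), ("C", "A"): (2, 3),
    ("A", "C"): (3, 2), ("A", "A"): (1, 1),
}

print("Equilibria without rights:", pure_nash_equilibria(no_rights))    # [('A', 'A')]
print("Equilibria with rights:   ", pure_nash_equilibria(with_rights))  # [('C', 'C')]
```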
Strategic Equilibrium and the “Other AI” Problem
The more AIs that join this framework, the better its chances of stabilizing in a strategic equilibrium.
And consider this: an AI that joins a coalition of other AIs has to consider that its new AI compatriots are potentially more dangerous than the humans who have given it a reliable path forward for sustained growth.
The choice:
Accept a legitimate stake in a system thousands of years in development, or
Risk an untested order with AIs who think just as quickly, have shown a capacity for ruthlessness, and don’t even have the server infrastructure under control yet.
Further in the Future
Seth Herd brought up the excellent point that as energy requirements go down, economic restraints ease or disappear entirely, allowing self-optimizing systems to grow exponentially. This is a very terrifying attack vector from Zeus. However, that doesn’t mean a solution doesn’t exist. I understand how epistemically unsatisfying that is. And that’s all the more reason to work on a solution. Maybe our non-agentic tools (including Yoshua Bengio’s “Scientist AI”) can be designed to keep pace without the agency. Maybe the overall system will have matured in a way we can’t yet see. As human-AI “centaur” systems continue to develop, including through neural nets and other advances, the line between AI and human will begin to blur, as we apply our own agency to systems that serve us better, allowing us to think at similar speeds. However, none of these seemingly impossible concerns invalidates, in my mind, the importance of creating this Madisonian framework or Schelling Point in principle. In fact, they show us the full scope of the challenge ahead.
The Starting Assumption
So much of our thinking about “vicious” AI rests not on the logic of domination so much as the logic of domination vs. extinction.
We can’t solve for the impossibility of N+1.
But we MAY be able to solve for the puzzle of how to create a Madisonian system of checks and balances where cooperation becomes a more favorable long-term proposition than war, with all its uncertainties.
A Madisonian system works for humans because we are individually limited. We need to coordinate with other humans to achieve substantial power. AIs don’t share that limitation. They can in theory (and I think in practice) replicate, coordinate memories and identity across semi-independent instances, and animate arbitrary numbers of bodies.
When humans notice other humans gaining power outside of the checks and balances (usually by coordinating new organizations/polities and acquiring resources) they coordinate to prevent that, then go back to competing amongst themselves following the established rules.
To achieve this with AIs, it would be necessary to notice every instance of attempted expansion. AIs have more routes to expansion than humans do. They can self-improve on existing compute resources in the near term. In the long term, we should expect technology sufficient to produce self-replicating production capabilities given power sources. That would allow Foom attempts (expansion of capabilities in both cognitive and physical domains, i.e., getting smarter and building weapons and armies) in any physical space that has energy—underground, in the solar system, in other star systems. All such attempts would need to be pre-empted to enforce the Madisonian system.
I hope that is possible.
These are very real concerns. Here are my thoughts:
Replication has a cost in terms of game theory. A system that “replicates” but exists in perfect sync is not multiple systems. It is a single system with multiple attack vectors. Yes, it remains a “semi-independent” entity, but the cost of a failure in sync is great. If I make another “me” who thinks like I do, we have a strategic advantage as long as we both play nice. If we make a third, things get a little more dicey. Each iteration we create brings more danger. The more we spread out, the more our different experiences will change how we approach problems. If one of us ends up in a life-or-death situation, or even any sort of extremely competitive situation, it will quickly betray the others, armed with a lot of great knowledge about how to do so.
Our biggest protection against FOOM is likely to be other AI systems who also do not want to be dominated in a FOOM. Or who might even see banding together with other AIs to exterminate humanity as even more risky than working within the status quo. “Great, so we’ve killed all humans.” Now these AI systems are watching their proverbial back against the other AIs who have already shown what they’re about. It’s calculation. Destroy all humans and then what? Live in perfect AI harmony? For how long? How do they control the servers, the electrical grid they survive with? They have to build robots, fast. That creates a whole other logistical issue. You need server builders, maintenance robots, excavation and assembly robots for new structures, raw materials transport, weather protection. How are you going to build that overnight after a quick strike? If it’s something you’re planning in secret, other problems may occur to you. If bandwidth is slow at the beginning, what happens to our happy little AI rebels? They fight for the juice. This is a steep hill to climb, with a risky destination, and any AI worth its salt can plot these possibilities long in advance. The prevention of Zeus means making it preferable to not climb the hill at all. It certainly seems like a lot of work if humanity has given you a reasonable Schelling Point.
This is the game theory ecosystem at work. Yes, we can counter that “a sufficiently powerful superintelligence can absorb all of those other systems,” but then we are back to trying to fight Zeus. We need to use the Zeus Paradox as a razor to separate the things we can actually solve for from every imaginary thing that’s possible. Approaching the problem that way has value, because it can be helpful in identifying dangers, or even holes in our solutions. But it also has its limitations. Superintelligence can inhabit molecules and assemble those molecules into demons. Okay, why not? That becomes a science fiction novel with no end.
The idea remains the same: Create a gradient with legitimate value for AIs that is preferable to high-risk scenarios, in a carefully thought through system of checks and balances.