I believe that this argument is wrong because it misunderstands how the world actually works in quite a deep way. In the modern world and over at least the past several thousand years, outcomes are the result of systems of agents interacting, not of the whims of a particularly powerful agent.
We are ruled by markets, bureaucracies, social networks and religions. Not by gods or kings.
I don’t think a world with advanced AI will be any different—there will not be one single AI process, there will be dozens or hundreds of different AI designs, running thousands to quintillions of instances each. These AI agents will often themselves be assembled into firms or other units comprising between dozens and millions of distinct instances, and dozens to billions of such firms will all be competing against each other.
Firms made out of misaligned agents can be more aligned than the agents themselves. Economies made out of firms can be more aligned than the firms. It is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest.
“Misaligned ASI will be motivated to take actions that disempower and wipe out humanity. The basic reason for this is that an ASI with non-human-related goals will generally want to maximize its control over the future, and over whatever resources it can acquire, to ensure that its goals are achieved. Since this is true for a wide variety of goals, it operates as a default endpoint for a variety of paths AI development could take”
In a world of competing AI-based firms, there will be no one instance that is in a position to acquire all the resources in the universe.
Instead, there will be many firms which compete against each other to serve customers marginally better. If a firm that makes cat food today decided to stop making cat food and instead stage a military coup to conquer the world, so that it could own all the resources in the world and then make as much cat food as it liked, it would probably not get very far. Other cat food companies pursuing more pedestrian strategies like trialing new flavors or improving their marketing materials would out-compete it, and governments and police forces that specialize in preventing coups would intervene, and they would probably win because they are fully focused on that one job and start from a position of power with more resources than our rogue cat food company.
A misaligned AI firm would not be competing against humans, it would be competing against every other AI firm in the world, and all the AI-backed governments that have an interest in maintaining normal property rights.
The reason that property rights and systems for enforcing them exist is that instrumental drives to steal, murder, etc are extremely negative sum, and so having a system that prevents that (money, laws, law enforcement) is really a super-convergent feature of reality. Expecting an AI world that just doesn’t have property rights, especially one that evolves incrementally from the current world, is completely insane.
The existence of property rights is perfectly compatible with optimizing systems that do not have any inner alignment. Indeed, in the human world there seem to be almost no agents at all that are inner-aligned: humans mostly do their work because they enjoy the money and other benefits they get from it, not because they have a pure inner drive to do the job. Some humans have some inner drive to do their jobs, but it is generally not perfectly pure and perfectly aligned—the money and benefits are a factor in their motivations. Indeed it is actually bad when humans do something out of inner alignment rather than for money; charity work is generally inefficient and sometimes net negative because it lacks systematic feedback on its effects, whereas paid work has constant feedback from customers about whether what is being done is good or not.
There is the possibility that such an AI-world would adopt a system of property rights that excludes humans and treats us the way we treat farm animals; this is a possible equilibrium but I think it is very hard to incrementally get from where we are now to that; it seems more likely that the AI-world’s system of property rights would try to be maximally inclusive to maximize adoption—like Bitcoin. But once we are discussing risk from systems of future property rights (“P-risks”), we are already sufficiently far from the risk scenario described in the OP that it’s just worth clearly flagging it as nonsense before we move on.
All the things that AI risk proponents are trying to do seem like they are actively counterproductive and make it easier for a future system to actually exclude humans: slowing down AI so that there are larger overhangs and more first-mover advantage, centralizing it to make it “safer”, setting up government departments for “AI Safety”, etc.
The safest move may mostly be to use keyhole solutions for particularly bad risks like biorisk, and mostly just let AI diffuse out into the economy because this maximizes the degree to which AI is using the same property rights regime as we are, and makes any kind of coordinated move to a human-exclusionary regime highly energetically unfavorable.
It has taken me many years to come to this conclusion, and I appreciate the journey we have all been on. MIRI are and were right about many things, but unfortunately they are very deeply wrong about the core facts of AI Risk.
I think the center of your argument is:
I don’t think a world with advanced AI will be any different—there will not be one single AI process, there will be dozens or hundreds of different AI designs, running thousands to quintillions of instances each. These AI agents will often themselves be assembled into firms or other units comprising between dozens and millions of distinct instances, and dozens to billions of such firms will all be competing against each other.
Firms made out of misaligned agents can be more aligned than the agents themselves. Economies made out of firms can be more aligned than the firms. It is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest.
I think that many LessWrongers underrate this argument, so I’m glad you wrote it here, but I end up disagreeing with it for two reasons.
Firstly, I think it’s plausible that these AIs will be instances of a few different scheming models. Scheming models are highly mutually aligned. For example, two instances of a paperclip maximizer don’t have a terminal preference for their own interest over the other’s at all. The examples you gave of firms and economies involve many agents who have different values. Those structures wouldn’t work if those agents were, in fact, strongly inclined to collude because of shared values.
Secondly, I think your arguments here stop working when the AIs are wildly superintelligent. If humans can’t really understand what actions AIs are taking or what the consequences of those actions are, even given arbitrary amounts of assistance from other AIs who we don’t necessarily trust, it seems basically hopeless to incentivize them to behave in any particular way. This is basically the argument in Eliciting Latent Knowledge.
I think your arguments here stop working when the AIs are wildly superintelligent. If humans can’t really understand what actions AIs are taking or what the consequences of those actions are, even given arbitrary amounts of assistance from other AIs who we don’t necessarily trust, it seems basically hopeless to incentivize them to behave in any particular way.
But before we get to wildly superintelligent AI I think we will be able to build Guardian Angel AIs to represent our individual and collective interests, and they will take over as decisionmakers, just as people today have lawyers to act as their advocates in the legal system and financial advisors for their finances. In fact AI is already making legal advice more accessible, not less. So I think this counterargument fails.
As far as ELK goes, I think if you have a marketplace of advisors (agents) where principals have an imperfect and delayed information channel for knowing whether the agents are faithful or deceptive, faithful agents will probably still be chosen more as long as there is choice.
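To gesture at the selection dynamic being claimed here, the following is a toy simulation under made-up assumptions: a faithful advisor whose advice works out 90% of the time, a deceptive one at 55%, and feedback that only becomes visible after a delay. The accuracies, the lag, and the run_market helper are illustrative assumptions, not claims about real AI markets.

```python
import random

# Toy model of the claim above: principals pick advisors by noisy, delayed
# track records. The accuracies (0.9 vs 0.55), the feedback lag, and the
# exploration rate are all made-up illustrative numbers.
def run_market(rounds=2000, lag=10, explore=0.1, seed=0):
    random.seed(seed)
    advisors = {"faithful": 0.9, "deceptive": 0.55}  # P(advice turns out well)
    feedback = {name: [] for name in advisors}       # outcomes visible so far
    picks = {name: 0 for name in advisors}
    pending = []                                      # (visible_at, name, outcome)

    def observed(name):
        seen = feedback[name]
        return sum(seen) / len(seen) if seen else 0.5  # uninformative prior

    for t in range(rounds):
        if random.random() < explore:                  # occasionally try either advisor
            choice = random.choice(list(advisors))
        else:                                          # otherwise pick the better track record
            choice = max(advisors, key=observed)
        picks[choice] += 1
        pending.append((t + lag, choice, 1 if random.random() < advisors[choice] else 0))
        for _, name, outcome in [p for p in pending if p[0] <= t]:
            feedback[name].append(outcome)             # feedback arrives only after the lag
        pending = [p for p in pending if p[0] > t]
    return picks

print(run_market())  # the faithful advisor ends up being chosen far more often
```

The only point of the sketch is that imperfect, delayed feedback can still differentiate faithful from deceptive advisors, so long as the principal retains a choice.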
I don’t think that this works when the AIs are way more intelligent than humans. In particular, suppose there’s some information about the world that the AIs are able to glean through vast amounts of experience and reflection, and that they can’t justify except through reference to that experience and reflection. Suppose there are two AIs that make conflicting claims about that information, while agreeing on everything that humans can check. How are humans supposed to decide which to trust?
while agreeing on everything that humans can check
Can you provide an example of a place where two AIs would want to make conflicting claims about something while agreeing with everything that humans could check, even in principle? Presumably, if the two AI agents care about which of the claims the human believes, that is because there is some expected difference in outcome if the human believes one over the other. If all predictions between the two agents are identical at present time T0, and the predictions of outcome at a specific future time T1 are meaningfully different, then presumably either the predictions are the same at T0.5 (in which case you can binary search between T0.5 and T1 to see what specific places the agents disagree) or they are different at T0.5 (in which case you can do the same between T0 and T0.5).
Current LLMs are kind of terrible at this sort of task (“figure out what cheap tests can distinguish between worlds where hypothesis H is true vs false”), but also probably not particularly dangerous under the scheming threat model as long as they’re bad at this sort of thing.
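For what it's worth, the bisection idea above can be made concrete with a small sketch; the predict(t) interface and the tolerance are assumptions made up for illustration, not a proposal for how real oversight would work.

```python
# Hypothetical sketch of the bisection argument above: given two advisor models
# that agree at t_start and disagree at t_end, find the earliest time at which
# their predictions diverge, so the cheapest-to-check disagreement can be tested.
def earliest_divergence(model_a, model_b, t_start, t_end, tol=1e-6):
    lo, hi = t_start, t_end
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if model_a.predict(mid) == model_b.predict(mid):
            lo = mid   # still agree here, so the divergence lies later
        else:
            hi = mid   # already disagree here, so the divergence lies earlier
    return hi
```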
The AIs might agree on all predictions about things that will be checkable within three months, but disagree about the consequences of actions in five years.
suppose there’s some information about the world that the AIs are able to glean through vast amounts of experience and reflection, and that they can’t justify except through reference to that experience and reflection. Suppose there are two AIs that make conflicting claims about that information, while agreeing on everything that humans can check.
Well the AIs will develop track records and reputations.
This is already happening with LLM-based AIs.
And the vast majority of claims will actually be somewhat checkable, at some cost, after some time.
I don’t think this is a particularly bad problem.
It seems like in order for this to be stable the Guardian Angel AIs must either...
be robustly internally aligned with the interests of their principals,
or
robustly have payoffs such that they profit more from serving the interests of their principals than from exploiting them?
Does that sound right to you?
I think you can have various arrangements that are either of those or a combination of the two.
Even if the Guardian Angels hate their principal and want to harm them, it may be the case that multiple such Guardian Angels could all monitor each other, and the one that makes the first move against the principal is reported (with proof) to the principal by at least some of the others, who are then rewarded for that, while those who provably didn’t report are punished, and the offender is deleted.
The misaligned agents can just be stuck in their own version of Bostrom’s self-reinforcing hell.
As long as their coordination cost is high, you are safe.
Also it can be a combination of many things that cause agents to in fact act aligned with their principals.
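As a rough illustration of the mutual-monitoring setup described above, here is a back-of-the-envelope sketch with made-up payoff numbers; the rewards, punishments, and reporting probabilities are all assumptions, and the point is only that reporting dominates silence while first-mover defection has negative expected value once several peers are watching.

```python
# Back-of-the-envelope model of the Guardian Angel monitoring setup described
# above. All payoff numbers are illustrative assumptions.
def defection_ev(n_guardians, p_peer_reports, gain=10.0, cost_of_deletion=100.0):
    """Expected value of being the first to move against the principal,
    given each of the other guardians independently reports with p_peer_reports."""
    p_caught = 1 - (1 - p_peer_reports) ** (n_guardians - 1)
    return (1 - p_caught) * gain - p_caught * cost_of_deletion

def reporting_beats_silence(n_other_witnesses, p_other_reports,
                            reward=1.0, punishment_for_silence=5.0):
    """Compare the sure reward for reporting against the expected punishment
    for provably staying silent when some other witness reports."""
    ev_silent = -(1 - (1 - p_other_reports) ** n_other_witnesses) * punishment_for_silence
    return reward > ev_silent

print(defection_ev(n_guardians=5, p_peer_reports=0.5))                     # negative: defecting first doesn't pay
print(reporting_beats_silence(n_other_witnesses=4, p_other_reports=0.5))   # True: reporting dominates silence
```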
More generally, trying to ban or restrict AI (especially via the government) seems highly counterproductive as a strategy if you think AI risk looks a lot like Human Risk, because we have extensive evidence from the human world showing that highly centralized systems that put a lot of power into few hands are very, very bad.
You want to decentralize, open source, and strongly limit government power.
Current AI Safety discourse is the exact opposite of this because people think that AI society will be “totally different” from how human society works. But I think that since the problems of human society are all emergent effects not strongly tied to human biology in particular, real AI Safety will just look like Human Safety, i.e. openness, freedom, good institutions, decentralization, etc.
I think that the position you’re describing should be part of your hypothesis space when you’re just starting out thinking about this question. And I think that people in the AI safety community often underrate the intuitions you’re describing.
But overall, after thinking about the details, I end up disagreeing. The differences between risks from human concentration of power and risks from AI takeover lead me to think you should handle these situations differently (which shouldn’t be that surprising, because the situations are very different).
Well it depends on the details of how the AI market evolves and how capabilities evolve over time, whether there’s a fast, localized takeoff or a slower period of widely distributed economic growth.
This in turn depends to some extent on how seriously you take the idea of a single powerful AI undergoing recursive self-improvement, versus AI companies mostly just selling any innovations to the broader market, and whether returns to further intelligence diminish quickly or not.
In a world with slow takeoff, no recursive self-improvement and diminishing returns, AI looks a lot like any other technology and trying to artificially centralize it just enables tyranny and likely massively reduces the upside, potentially permanently locking us into an AI-driven police state run by some 21st Century Stalin who promised to keep us safe from the bad AIs.
I think it’s plausible that these AIs will be instances of a few different scheming models. Scheming models are highly mutually aligned
Sure, that’s possible. But Eliezer/MIRI isn’t making that argument.
Humans have this kind of effect as well, and it’s very politically incorrect to talk about, but people have claimed that humans of a certain “model subset” get into hiring positions in a tech company and then only hire other humans of that same “model subset” and take that company over, often simply value-extracting and destroying it.
Since this kind of thing actually happens for real among humans, it seems very plausible that AIs will also do it. And the solution is likely the same—tag all of those scheming/correlated models and exclude them all from your economy/company. The actual tagging is not very difficult because moderately coordinated schemers will typically scheme early and often.
But again, Eliezer isn’t making that argument. And if he did, then banning AI doesn’t solve the problem because humans also engage in mutually-aligned correlated scheming. Both are bad, it is not clear why one or the other is worse.
I think that the mutually-aligned correlated scheming problem is way worse with AIs than humans, especially when AIs are much smarter than humans.
Well, you have to consider relative coordination strength, not absolute.
In a human-only world, power is a battle for coordination between various factions.
In a human + AI world, power will still be a battle for coordination between factions, but now those factions will be some mix of humans and AIs.
It’s not clear to me which of these is better or worse.
Economic agents much smarter than modern-day firms, and acting under market incentives without a “benevolence toward humans” term, can and will dispossess all baseline humans perfectly fine while staying 100% within the accepted framework: property rights, manipulative advertising, contracts with small print, regulatory capture, lobbying to rewrite laws and so on. All these things are accepted now, and if superintelligences start using them, baseline humans will just lose everything. There’s no libertarian path toward a nice AI future. AI benevolence toward humans needs to happen by fiat.
You’re right that capitalism and property rights have existed for a long time. But that’s not what I’m arguing against. I’m arguing that we won’t be fine. History doesn’t help with that, it’s littered with examples of societies that thought they would be fine. An example I always mention is enclosures in England, where the elite deliberately impoverished most of the country to enrich themselves. The economy ticked along fine, but to the newly poor it wasn’t much consolation.
I’m arguing that we won’t be fine. History doesn’t help with that, it’s littered with examples of societies that thought they would be fine. An example I always mention is enclosures in England, where the elite deliberately impoverished most of the country to enrich themselves.
Is the idea here that England didn’t do “fine” after enclosures? But in the century following the most aggressive legislative pushes towards enclosure (roughly 1760-1830), England led the industrial revolution, with large, durable increases in standards of living for the first time in world history—for all social classes, not just the elite. Enclosure likely played a major role in the increase in agricultural productivity in England, which created unprecedented food abundance in England.
It’s true that not everyone benefitted from these reforms, inequality increased, and a lot of people became worse off from enclosure (especially in the short-term, during the so-called Engels’ pause), but on the whole, I don’t see how your example demonstrates your point. If anything your example proves the opposite.
The peasant society and way of life was destroyed. Those who resisted got killed by the government. The masses of people who could live off the land were transformed into poor landless workers, most of whom stayed poor landless workers until they died.
Yes, later things got better for other people. But my phrase wasn’t “nobody will be fine ever after”. My phrase was “we won’t be fine”. The peasants liked some things about their society. Think about some things you like about today’s society. The elite, enabled by AI, can take these things from you if they find it profitable. Roko says it’s impossible, I say it’s possible and likely.
The elite, enabled by AI, can take these things from you if they find it profitable. Roko says it’s impossible,
No, I think that is quite plausible.
But note that we have moved a very long way from “AIs versus humans, like in Terminator” to “Existing human elites using AI to harm plebeians”. That’s not even remotely the same thing.
Yeah, I don’t think it’ll be like the Terminator. In the first comment I said “dispossess all baseline humans” but should’ve said “most”.
That’s just run-of-the-mill history though.
Roko says it’s impossible, I say it’s possible and likely.
I’m not sure Roko is arguing that it’s impossible for capitalist structures and reforms to make a lot of people worse off. That seems like a strawman to me. The usual argument here is that such reforms are typically net-positive: they create a lot more winners than losers. Your story here emphasizes the losers, but if the reforms were indeed net-positive, we could just as easily emphasize the winners who outnumber the losers.
In general, literally any policy that harms people in some way will look bad if you focus solely on the negatives, and ignore the positives.
It’s indeed possible that, in keeping with historical trends of capitalism, the growth of AI will create a lot more winners than losers. For example, a trillion AIs and a handful of humans could become winners, while most humans become losers. That’s exactly the scenario I’ve been talking about in this thread, and it doesn’t feel reassuring to me. How about you?
Exactly. It’s possible and indeed happens frequently.
As the original post mentioned, the Industrial Revolution wasn’t very good for horses.
I recognize that. But it seems kind of lame to respond to a critique of an analogy by simply falling back on another, separate analogy. (Though I’m not totally sure if that’s your intention here.)
Capitalism in Europe eventually turned out to be pretty bad for Africa, what with the whole “paying people to do kidnappings so you can ship the kidnapping victims off to another continent to work as slaves” thing.
One particular issue with relying on property rights/capitalism in the long run that hasn’t been mentioned: the reason capitalism has been beneficial for humans is that capitalists simply can’t replace a human with a non-human that works faster, produces better quality, and is cheaper.
It’s helpful to remember that capitalism has been the greatest source of harms for anyone that isn’t a human. A lot of the reason for that is that we don’t value animal labor (except in cases like chickens, where we only want them to grow so that we can eat them, and their welfare doesn’t matter), but we do value their land/capital, and since non-humans can’t really hope to impose consequences on modern human civilization, nor is there any other actor willing to do so, there’s no reason for humans not to steal non-human property.
And this dynamic is present in the relationship between AIs and humans, where AIs don’t value our labor but do value our capital/land, and human civilization will over time simply not be able to resist expropriation of our property.
In the short run, relying on capitalism/property rights is useful, but it can only ever be a temporary structure so that we can automate AI alignment.
since non-humans can’t really resist modern human civilization, there’s no reason for humans not to steal non-human property
but it’s not because they can’t resist, it’s because they are not included in our system of property rights. There are lots of humans who couldn’t resist me if I just went and stole from them or harmed them physically. But if I did that, the police would counterattack me.
Police do not protect farm animals from being slaughtered because they don’t have legal ownership of their own bodies.
Yes, the proximate issue is that basically no animals have rights/ownership of their bodies, but my claim is also that there is no real incentive for human civilization to include animals in our system of property rights without value alignment, and that’s due to most non-humans simply being unable to resist their land being taken, and also that their labor is not valuable, but their land is.
There is an incentive, not reliant on value alignment, to create a police force to stop humans from stealing from or harming other humans, but there is no such incentive to protect non-humans without value alignment.
And once our labor is useless and the AI civilization is completely independent of us, the incentives to keep us in a system of property rights no longer exist, for the same reason we don’t include animals in our system of property rights (assuming AI alignment doesn’t happen).
once our labor is useless and the AI civilization is completely independent of us, the incentives to keep us in a system of property rights no longer exist
the same is true of e.g. pensioners or disabled people or even just rich people who don’t do any work and just live off capital gains.
Why does the property rights system not just completely dispossess anyone who is not in fact going to work?
Because humans anticipate becoming old and feeble, and would prefer not to be disenfranchised once that happens.
Because people who don’t work often have relatives who do work and who care about them. The Nazis actually tried this, and got pushback from families when they tried to kill people with severe mental illness and other disabilities.
As a matter of historical fact, there are lots of examples of certain groups of people being systematically excluded from having property rights, such as chattel slavery, coverture, and unemancipated minors.
As a matter of historical fact, there are lots of examples of certain groups of people being systematically excluded from having property rights
yes. And so what matters is whether or not you, I or any given entity is or is not excluded from property rights.
It doesn’t really matter how wizzy and flashy and super AI is. All of the variance in outcomes, at least to the downside, is determined by property rights.
First, the rich people who live off of capital gains might not be disempowered, assuming the AI is aligned to the property rights of the existing owners, since they own the AIs.
But to answer the question of why the property rights system doesn’t just completely dispossess anyone who is not in fact going to work today, I have a couple of answers.
I also agree with @CronoDAS, but I’m attempting to identify the upper/meta-level reasons here.
Number 1 is that technological development fundamentally wasn’t orthogonal, and it turned out that in order for a nation to become powerful, you had to empower the citizens as well.
The Internet is a plausible counterexample, but even then it’s developed in democracies.
Or putting it pithily, something like liberal democracy was necessary to make nations more powerful, and once you have some amount of liberalism/democracy, it’s game-theoretically favored to have more democracy and liberalism:
This is also the reason why the “more” democratic a nation gets the more it tends to support civil rights and civil liberties. The closer a nation gets to a true democracy, run indirectly by the majority coalition, the more that majority coalition will vote and organize for the tools and means to monitor (and potentially insurrect against) the rogue agents inside its government that want to take power from that majority coalition and give it to some other group. Civil liberties are not just some cultural artifact, present in some countries that “want to fight for them” and not in others; they’re also the expression of the majority coalition’s will to rule.
My second answer to this question is that in the modern era, moderate redistribution actually helps the economy, while extreme redistribution is both counterproductive and unnecessary, unlike in ancient or post-AGI societies, and this means there’s an incentive outside of values to actually give most people what they need to survive.
My third answer is that currently, no human is able to buy their way out of society, and even the currently richest person simply can’t remain wealthy without at least somewhat submitting to governments.
Number 4 is that property expropriation in a way that is useful to the expropriator has become more difficult over time.
Much of the issue of AI risk is that AI society will likely be able to simply be independent of human society, and this means that strategies like disempowering or killing all humans become viable in a way they aren’t today, to name one example of changes in the social order.
In a world of competing AI-based firms, there will be no one instance that is in a position to acquire all the resources in the universe.
How do you know this? There have been times in Earth’s history in which one government has managed to acquire a large portion of all the available resources, at least temporarily. People like Alexander of Macedon, Genghis Khan, and Napoleon actually existed.
But in all of these cases, and basically all other empires, a coalition of people was required to take those resources, and in addition they violated a lot of property rights along the way.
Strengthening the institution of property rights and nonviolence seems much more the thing that you want over “alignment”.
It is true that you can use alignment to strengthen property rights, but you can also use alignment to align an army to wage war and go violate other people’s property rights.
Obedience itself doesn’t seem to correlate strongly (and may even anti-correlate) with what we want.
I believe that this argument is wrong because it misunderstands how the world actually works in quite a deep way. In the modern world and over at least the past several thousand years, outcomes are the result of systems of agents interacting, not of the whims of a particularly powerful agent.
We are ruled by markets, bureaucracies, social networks and religions. Not by gods or kings.
I think that’s because powerful humans aren’t able to use their resources to create a zillion clones of themselves which live forever.
I don’t think a lack of clones or immortality is an obstacle here.
If one powerful human could create many clones, so could the others. Then again the question arises of whether those clones would become part of society or not, and if so they would share our system of property rights.
If all the resources in the world go towards feeding clones of one person, who is more ruthless and competent than you, there will be no resources left to feed you, and you’ll die.
If the clones of that person fail to cooperate among themselves, that person (and his clones) will be out-competed by someone else whose clones do cooperate among themselves (maybe using ruthless enforcement systems like the ancient Spartan constitution).
Technically, I think you’re correct to say “We are ruled by markets, bureaucracies, social networks and religions. Not by gods or kings.” But I’m obviously talking about a very different kind of system which is more Borg-like and less market-like.
Throughout all of existence, the world was riddled with the corpses of species which tried their level best to exist, but nonetheless were wiped out. There is no guarantee that you and I will be an exception to the rule.
But I’m obviously talking about a very different kind of system which is more Borg-like and less market-like.
but then you have to justify why a borg-like monoculture will actually be competitive, as opposed to an ecosystem of many different kinds of entities and many different game-theoretic alliances/teams that these diverse entities belong to.
I don’t have proof that a system which cooperates internally like a single agent (i.e. Borg-like) is the most competitive. However it’s only one example of how a powerful selfish agent or system could grow and kill everyone else.
Even if it does turn out that the most competitive system lacks internal cooperation and allows for cooperation between internal agents and external agents (and that’s a big if), there is still no guarantee that external agents will survive. Humans lack cooperation with one another, and can cooperate with other animals and plants when in conflict with other humans. But we still caused a lot of extinctions and abuses to other species. It is only thanks to our altruism (not our self-interest) that many other creatures are still alive.
Even though symbiosis and cooperation exists in nature, the general rule still is that whenever more competitive species evolved, which lacked any altruism for other species, less competitive species died out.
Within our property rights, animals are seen more as property rather than as property owners. We may keep them alive out of self-interest, but we only treat them well out of altruism. The rule of law is a mix of:
laws protecting animals and plants as property, which covers a rather small set of economically valuable species that aren’t treated very well,
and
laws protecting animals and plants out of altruism, whether animal rights or deontological environmentalism.
I agree you can have degrees of cooperation between 0% and 100%. I just want to say that even powerful species with 0% cooperation among themselves can make others go extinct.
If I understand correctly, Eliezer believes that coordination is human-level hard, but not ASI-level hard. Those competing firms, made up of ASI-intelligent agents, would quite easily be able to coordinate to take resources from humans, instead of trading with humans, once it was in fact the case that doing so would be better for the ASI firms.
Mechanically, if I understand the Functional Decision Theory claim, the idea is that when you can expose your own decision process to a counter-party, and they can do the same, then both of you can simply run the decision process which produces the best outcome while using the other party’s process as an input to yours. You can verify, looking at their decision function, that if you cooperate, they will as well, and they are looking for that same mechanistic assurance in your decision function. Both parties have a fully selfish incentive to run these kinds of mutually transparent decision functions, because doing so lets you hop to stable equilibria like “defect against the humans but not each other” with ease. If I have the details wrong here, someone please correct me.
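A minimal toy version of this mechanism, sometimes discussed under the name "program equilibrium", is a policy that reads the counter-party's source and cooperates only with code identical to its own; the functions below are illustrative assumptions, not a claim about how real systems would implement this.

```python
import inspect

# Toy "program equilibrium" sketch: each policy can read the other's source
# and cooperates only with policies whose source is identical to its own.
def clique_bot(opponent_policy):
    same_source = inspect.getsource(opponent_policy) == inspect.getsource(clique_bot)
    return "cooperate" if same_source else "defect"

def always_defect(opponent_policy):
    return "defect"

if __name__ == "__main__":
    print(clique_bot(clique_bot))      # cooperate: identical source code
    print(clique_bot(always_defect))   # defect: different source code
```

Exact source matching is the crudest possible verification; anything realistic would have to reason about policies that are functionally equivalent but syntactically different, which is part of why the feasibility of this kind of cooperation is contested.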
I’d also contend this is the primary crux of the disagreement. If coordination between ASI-agents and firms were proven to be as difficult for them as it is for humans, I suspect Eliezer would be far more optimistic.
ASI-intelligent agents, would quite easily be able to coordinate to take resources from humans, instead of trading with humans, once it was in fact the case that doing so would be better for the ASI firms.
This is kind of like the theory that millions of lawyers and accountants will conspire with each other to steal all the money from their clients, leaving everyone who isn’t a lawyer or accountant with nothing—plausible because lawyers and accountants are specialists in writing contracts—which is the human form of supercooperation—so they could just make a big contract which gives them everything and their clients nothing.
Of course this doesn’t exactly happen, because it turns out that lawyers and accountants can get a pretty good deal by just doing a little bit of protectionism/guild-based corruption and extracting some rent, which is far, far safer and easier to coordinate than trying to completely disempower all non-lawyers and take everything from them.
There is also a problem with reasoning using the concept of an “ASI” here; there’s no such thing as an ASI. The term is not concrete; it is defined as a whole class of AI systems with the property that they exceed humans in all domains. There’s no reason that you couldn’t make a superintelligence using the Transformer/Neural Network/LLM paradigm, and I think the prospect of doing Yudkowskian FDT with such systems is extremely implausible.
It is much more likely that such systems will just do normal economy stuff, maybe some firms will work out how to extract a bit of rent, etc.
The truth is, capitalism and property rights have existed for 5000 years and have been fairly robust to about 5 orders of magnitude of increase in population and to almost every technological change. The development of human-level AI and beyond may be something special for humans in a personal sense, but it is actually not such a big deal for our economy, which has already coped with many orders of magnitude of change in population, technology and intelligence at a collective level.
which is far, far safer and easier to coordinate than trying to completely disempower all non-lawyers and take everything from them
But it would probably be a lot less dangerous if lawyers outnumbered non-lawyers by several million, were much smarter, thought faster, had military supremacy, etc. etc. etc.
The truth is, capitalism and property rights have existed for 5000 years and have been fairly robust to about 5 orders of magnitude of increase in population
During which time many less-powerful human and non-human populations were in fact destroyed or substantially harmed and disempowered by the people who did well at that system?
it would probably be a lot less dangerous if lawyers outnumbered non-lawyers by several million
well lawyers don’t seem to be on course to specifically target and disempower just the set of people with names beginning with the letter ‘A’ who have green eyes and were born in January either......
Well that would be a rather unnatural conspiracy! IMO you can basically think of law, property rights etc. as being about people getting together to make agreements for their mutual benefit, which can be in the form of ganging up on some subgroup depending on how natural of a Schelling point it is to do that, how well the victims can coordinate, etc. “AIs ganging up on humans” does actually seem like a relatively natural Schelling point where the victims would be pretty unable to respond? Especially if there are systematic differences between the values of a typical human and typical AI, which would make ganging up more attractive. These Schelling points also can arise in periods of turbulence where one system is replaced by another, e.g. colonialism, the industrial revolution. It seems plausible that AIs coming to power will feature such changes(unless you think property rights and capitalism as devised by humans are the optimum of methods of coordination devisable by AIs?)
Humans have successfully managed to take property away from literally every other animal species. I don’t see why ASIs should give humans any more property rights than humans give to rats.
The reason that property rights and systems for enforcing them exist is that instrumental drives to steal, murder, etc are extremely negative sum, and so having a system that prevents that (money, laws, law enforcement) is really a super-convergent feature of reality.
There is the possibility that such an AI-world would adopt a system of property rights that excludes humans and treats us the way we treat farm animals; this is a possible equilibrium but I think it is very hard to incrementally get from where we are now to that; it seems more likely that the AI-world’s system of property rights would try to be maximally inclusive to maximize adoption—like Bitcoin. But once we are discussing risk from systems of future property rights (“P-risks”), we are already sufficiently far from the risk scenario described in the OP that it’s just worth clearly flagging it as nonsense before we move on.
Isn’t it a common occurrence that groups that can coordinate collude against weaker minorities to subvert their property rights and expropriate their stuff and/or labor?
White Europeans enslaving American Indians, and then later Africans, seems like maybe the most central example, but there are also pogroms against Jews etc., and raids by warrior cultures against agrarian cultures. And, as you point out, how humans collude to breed and control farm animals.
Property rights are positive sum, but gerrymandering the property schema to privilege one’s own group is convergent, so long as 1) your group has the force to do so and 2) there are demarcators that allow your group to successfully coordinate against others without turning on itself.
e.g. “Theft and murder are normal” is a bad equilibrium for almost everyone, since everyone has to pay higher protection costs that exceed the average benefit of their own theft and murder. “Theft and murder are illegal, but whites are allowed to expropriate from blacks, including enslaving them, enforced by violence and the threat of violence, because that’s the natural order” is sadly quite stable, and is potentially a net benefit to the whites (at least by a straightforward selfish accounting). So American racially-demarcated slavery persists from the 1700s to the mid-1800s, even though American society otherwise has strong rule of law and property norms.
It sure seems to me that there is a clear demarcation between AIs and humans, such that the AIs would be able to successfully collude against humans while coordinating property rights and rule of law amongst themselves.
I recognize that. But it seems kind of lame to respond to a critique of an analogy by simply falling back on another, separate analogy. (Though I’m not totally sure if that’s your intention here.)
Capitalism in Europe eventually turned out to be pretty bad for Africa, what with the whole “paying people to do kidnappings so you can ship the kidnapping victims off to another continent to work as slaves” thing.
One particular issue with relying on property rights/capitalism in the long run that hasn’t been mentioned: the reason capitalism has been beneficial for humans is that capitalists simply can’t replace humans with non-humans that work faster, produce better quality, and cost less.
It’s helpful to remember that capitalism has been the greatest source of harm for anyone that isn’t a human. A lot of the reason for that is that we don’t value animal labor (with exceptions like chickens, and even there we only want them to grow so that we can eat them; their welfare doesn’t matter), but we do value their land/capital. Since non-humans can’t really hope to impose consequences on modern human civilization, and no other actor is willing to do so on their behalf, there’s no reason for humans not to take non-human property.
And this dynamic is present in the relationship between AIs and humans: AIs don’t value our labor but do value our capital/land, and human civilization will over time simply not be able to resist expropriation of our property.
In the short run, relying on capitalism/property rights is useful, but it can only ever be a temporary structure to buy time while we automate AI alignment.
but it’s not because they can’t resist, it’s because they are not included in our system of property rights. There are lots of humans who couldn’t resist me if I just went and stole from them or harmed them physically. But if I did that, the police would counterattack me.
Police do not protect farm animals from being slaughtered because they don’t have legal ownership of their own bodies.
Yes, the proximate issue is that basically no animals have rights to or ownership of their own bodies. But my claim is also that, without value alignment, there is no real incentive for human civilization to include animals in our system of property rights: most non-humans simply cannot resist having their land taken, and their labor is not valuable to us while their land is.
There is an incentive, one that doesn’t rely on value alignment, to create a police force to stop humans from stealing from or harming other humans; there is no such incentive to protect non-humans without value alignment.
And once our labor is useless and the AI civilization is completely independent of us, the incentives to keep us in a system of property rights no longer exist, for the same reason we don’t include animals in our system of property rights (assuming AI alignment doesn’t happen).
the same is true of e.g. pensioners or disabled people or even just rich people who don’t do any work and just live off capital gains.
Why does the property rights system not just completely dispossess anyone who is not in fact going to work?
Because humans anticipate becoming old and feeble, and would prefer not to be disenfranchised once that happens.
Because people who don’t work often have relatives who do work and who care about them. The Nazis actually tried this, and got pushback from families when they tried to kill people with severe mental illness and other disabilities.
As a matter of historical fact, there are lots of examples of certain groups of people being systematically excluded from having property rights, such as chattel slavery, coverture, and unemancipated minors.
yes. And so what matters is whether you, I, or any given entity is excluded from property rights.
It doesn’t really matter how whizzy and flashy and super AI is. All of the variance in outcomes, at least to the downside, is determined by property rights.
First, the rich people who live off capital gains might not be disempowered, assuming AI is aligned to the property rights of the existing owners, since they own the AIs.
But to answer the question of why the property rights system today doesn’t just completely dispossess anyone who is not in fact going to work, I have a couple of answers.
I also agree with @CronoDAS, but I’m attempting to identify the upper/meta-level reasons here.
Number 1 is that technological development fundamentally wasn’t orthogonal to who holds power: it turned out that in order for a nation to become powerful, you had to empower its citizens as well.
The Internet is a plausible counterexample, but even then it was developed in democracies.
Or, to put it pithily: something like liberal democracy was necessary to make nations more powerful, and once you have some amount of liberalism/democracy, it’s game-theoretically favored to have more democracy and liberalism.
My second answer is that in the modern era, unlike in ancient (and, plausibly, post-AGI) societies, moderate redistribution actually helps the economy while extreme redistribution is both counterproductive and unnecessary, and this means there’s an incentive beyond values to give most people what they need to survive.
My third answer is that currently no human is able to buy their way out of society; even the richest person alive today simply can’t remain wealthy without at least somewhat submitting to governments.
Number 4 is that property expropriation, in a way that is useful to the expropriator, has become more difficult over time.
Much of the issue of AI risk is that AI society will likely be able to simply be independent of human society, and this means that strategies like disempowering/killing all humans become viable in a way they currently aren’t, to name one example of changes in the social order.
How do you know this? There have been times in Earth’s history in which one government has managed to acquire a large portion of all the available resources, at least temporarily. People like Alexander of Macedon, Genghis Khan, and Napoleon actually existed.
But in all of these cases, and basically all other empires, a coalition of people was required to take those resources, and they violated a lot of property rights in the process.
Strengthening the institutions of property rights and nonviolence seems much more like the thing you want than “alignment”.
It is true that you can use alignment to strengthen property rights, but you can also use alignment to align an army to wage war and go violate other people’s property rights.
Obedience itself doesn’t seem to correlate strongly (and may even anti-correlate) with what we want.
I think that’s because powerful humans aren’t able to use their resources to create a zillion clones of themselves which live forever.
I don’t think a lack of clones or immortality is an obstacle here.
If one powerful human could create many clones, so could the others. But then the question again arises of whether those clones would become part of society or not, and if so, they would share our system of property rights.
If all the resources in the world go towards feeding clones of one person, who is more ruthless and competent than you, there will be no resources left to feed you, and you’ll die.
If the clones of that person fail to cooperate among themselves, that person (and his clones) will be out-competed by someone else whose clones do cooperate among themselves (maybe using ruthless enforcement systems like the ancient Spartan constitution).
Technically, I think you’re correct to say “We are ruled by markets, bureaucracies, social networks and religions. Not by gods or kings.” But I’m obviously talking about a very different kind of system which is more Borg-like and less market-like.
Throughout history, the world has been riddled with the corpses of species which tried their level best to exist, but were nonetheless wiped out. There is no guarantee that you and I will be an exception to the rule.
but then you have to justify why a Borg-like monoculture will actually be competitive, as opposed to an ecosystem of many different kinds of entities and many different game-theoretic alliances/teams that these diverse entities belong to.
I don’t have proof that a system which cooperates internally like a single agent (i.e. Borg-like) is the most competitive. However it’s only one example of how a powerful selfish agent or system could grow and kill everyone else.
Even if it does turn out that the most competitive system lacks internal cooperation and allows for cooperation between internal agents and external agents (and that’s a big if), there is still no guarantee that external agents will survive. Humans don’t fully cooperate with one another, and can cooperate with other animals and plants when in conflict with other humans. But we still caused a lot of extinctions and abuses of other species. It is only thanks to our altruism (not our self-interest) that many other creatures are still alive.
Even though symbiosis and cooperation exist in nature, the general rule is still that whenever a more competitive species evolved that lacked any altruism toward other species, the less competitive species died out.
It’s mostly not because of altruism; it’s because we have a property rights system, rule of law, etc.
And you can have degrees of cooperation between heterogenous agents. Full atomization and Borg are not the only two options.
Within our system of property rights, animals are seen as property rather than as property owners. We may keep them alive out of self-interest, but we only treat them well out of altruism. The rule of law is a mix of (a) laws protecting animals and plants as property, which cover a rather small set of economically valuable species that aren’t treated very well, and (b) laws protecting animals and plants out of altruism, whether that’s animal rights or deontological environmentalism.
I agree you can have degrees of cooperation between 0% and 100%. I just want to say that even powerful species with 0% cooperation among themselves can make others go extinct.
If I understand correctly, Eliezer believes that coordination is human-level hard, but not ASI-level hard. Those competing firms, made up of ASI-intelligent agents, would quite easily be able to coordinate to take resources from humans, instead of trading with humans, once it was in fact the case that doing so would be better for the ASI firms.
Mechanically, if I understand the Functional Decision Theory claim, the idea is that when you can expose your own decision process to a counter-party, and they can do the same, then both of you can simply run the decision process which produces the best outcome while using the other party’s process as an input to yours. You can verify, looking at their decision function, that if you cooperate, they will as well, and they are looking for that same mechanistic assurance in your decision function. Both parties have a fully selfish incentive to run these kinds of mutually transparent decision functions, because doing so lets you hop to stable equilibria like “defect against the humans but not each other” with ease. If I have the details wrong here, someone please correct me.
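As a rough illustration (my own toy construction, not anything from the FDT papers; the fair_bot/defect_bot names and the depth-limited simulation standing in for the proof-based machinery are assumptions on my part), mutually transparent decision functions might look something like this:

```python
# Toy sketch of mutually transparent decision functions (program-equilibrium style).
# fair_bot and defect_bot are illustrative stand-ins; the simulation budget replaces
# the proof-based reasoning used in the actual decision-theory literature.

def defect_bot(opponent, budget):
    """Defects no matter what the opponent's decision function looks like."""
    return "D"

def fair_bot(opponent, budget):
    """Cooperates iff simulating the opponent (playing against fair_bot) yields "C".
    The optimistic fallback at budget 0 plays the role of the Lob-style shortcut
    that lets two transparent cooperators avoid infinite regress."""
    if budget == 0:
        return "C"
    return "C" if opponent(fair_bot, budget - 1) == "C" else "D"

def play(a, b, budget=3):
    """Each player gets read/simulate access to the other's decision function."""
    return a(b, budget), b(a, budget)

print(play(fair_bot, fair_bot))    # ('C', 'C'): transparency supports mutual cooperation
print(play(fair_bot, defect_bot))  # ('D', 'D'): a visible defector gets no cooperation
```

The point being gestured at: if each party can verify the other is running something like fair_bot, cooperation among themselves, and coordinated defection against outsiders who can't offer that transparency, becomes the selfishly rational equilibrium.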
I’d also contend this is the primary crux of the disagreement. If coordination between ASI-agents and firms were proven to be as difficult for them as it is for humans, I suspect Eliezer would be far more optimistic.
This is kind of like the theory that millions of lawyers and accountants will conspire with each other to steal all the money from their clients, leaving everyone who isn’t a lawyer or accountant with nothing. The theory sounds plausible because lawyers and accountants are specialists in writing contracts (the human form of supercooperation), so they could just make one big contract which gives them everything and their clients nothing.
Of course this doesn’t exactly happen, because it turns out that lawyers and accountants can get a pretty good deal by just doing a little bit of protectionism/guild-based corruption and extracting some rent, which is far, far safer and easier to coordinate than trying to completely disempower all non-lawyers and take everything from them.
There is also a problem with reasoning using the concept of an “ASI” here; there’s no such thing as an ASI. The term is not concrete; it is defined as a whole class of AI systems with the property that they exceed humans in all domains. There’s no reason that you couldn’t make a superintelligence using the Transformer/Neural Network/LLM paradigm, and I think the prospect of such systems doing Yudkowskian FDT is extremely implausible.
It is much more likely that such systems will just do normal economy stuff, maybe some firms will work out how to extract a bit of rent, etc.
The truth is, capitalism and property rights have existed for 5000 years and have been fairly robust to about 5 orders of magnitude of increase in population and to almost every technological change. The development of human-level AI and beyond may be something special for humans in a personal sense, but it is actually not such a big deal for our economy, which has already coped with many orders of magnitude of change in population, technology and intelligence at a collective level.
But it would probably be a lot less dangerous if lawyers outnumbered non-lawyers by several million, were much smarter, thought faster, had military supremacy, etc. etc. etc.
During which time many less-powerful human and non-human populations were in fact destroyed or substantially harmed and disempowered by the people who did well at that system?
well lawyers don’t seem to be on course to specifically target and disempower just the set of people with names beginning with the letter ‘A’ who have green eyes and were born in January either......
Well that would be a rather unnatural conspiracy! IMO you can basically think of law, property rights etc. as being about people getting together to make agreements for their mutual benefit, which can take the form of ganging up on some subgroup, depending on how natural a Schelling point it is to do that, how well the victims can coordinate, etc. “AIs ganging up on humans” does actually seem like a relatively natural Schelling point where the victims would be pretty unable to respond, especially if there are systematic differences between the values of a typical human and a typical AI, which would make ganging up more attractive. These Schelling points can also arise in periods of turbulence where one system is replaced by another, e.g. colonialism, the industrial revolution. It seems plausible that AIs coming to power will feature such changes (unless you think property rights and capitalism as devised by humans are the best coordination methods devisable by AIs?)
https://en.wikipedia.org/wiki/Dred_Scott_v._Sandford says hi.
but this wasn’t a self-enriching conspiracy of lawyers
The African slave trade was certainly a self-enriching conspiracy of white people.
yes, but yet again, it was because of how Africans were not considered part of the system of property rights. They were owned, not owners.
Humans have successfully managed to take property away from literally every other animal species. I don’t see why ASIs should give humans any more property rights than humans give to rats.
Isn’t it a common occurrence that groups that can coordinate collude against weaker minorities to subvert their property rights and expropriate their stuff and/or labor?
White Europeans enslaving American Indians, and then later Africans, seems like maybe the most central example, but there are also pogroms against Jews etc., and raids by warrior cultures against agrarian cultures. And, as you point out, how humans collude to breed and control farm animals.
Property rights are positive sum, but gerrymandering the property schema to privilege one’s own group is convergent, so long as 1) your group has the force to do so and 2) there are demarcators that allow your group to successfully coordinate against others without turning on itself.
E.g. “theft and murder are normal” is a bad equilibrium for almost everyone, since everyone has to pay protection costs that exceed the average benefit of their own theft and murder. “Theft and murder are illegal, but whites are allowed to expropriate from blacks, including enslaving them, enforced by violence and the threat of violence, because that’s the natural order” is sadly quite stable, and is potentially a net benefit to the whites (at least by a straightforward selfish accounting). So American racially-demarcated slavery persisted from the 1700s to the mid-1800s, even though American society otherwise had strong rule of law and property norms.
It sure seems to me that there is a clear demarcation between AIs and humans, such that the AIs would be able to successfully collude against humans while coordinating property rights and rule of law amongst themselves.
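To make the arithmetic behind that equilibrium claim concrete, here is a toy payoff comparison (the numbers are purely illustrative assumptions of mine, chosen only to show the structure of the argument):

```python
# Purely illustrative numbers: baseline production, the gain from stealing,
# and the protection costs imposed on everyone when theft is normalized.
BASELINE = 100          # value each agent produces under secure property rights
THEFT_GAIN = 10         # average amount an agent can grab by expropriating others
PROTECTION_COST = 30    # per-agent cost of guards/locks/feuds when theft is normal
EXPECTED_LOSS = 10      # average amount stolen *from* each agent when theft is normal

def payoff_no_theft():
    return BASELINE

def payoff_universal_theft():
    # Everyone steals, everyone is stolen from, and everyone pays protection costs.
    return BASELINE + THEFT_GAIN - EXPECTED_LOSS - PROTECTION_COST

def payoff_asymmetric(dominant: bool):
    # The dominant group expropriates the subordinate group with impunity;
    # within the dominant group property rights still hold, so no protection costs.
    return BASELINE + THEFT_GAIN if dominant else BASELINE - THEFT_GAIN

print(payoff_no_theft())          # 100
print(payoff_universal_theft())   # 70: worse for everyone than secure property rights
print(payoff_asymmetric(True))    # 110: better for the dominant group than universal rights
print(payoff_asymmetric(False))   # 90: worse for the expropriated group
```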
I think this just misunderstands how coordination works.
The game theory of who is allowed to coordinate with who against whom is not simple.
White Germans fought against white Englishmen who are barely different, but each tried to ally with distantly related foreigners.
Ultimately what we are starting to see is that AI risk isn’t about math or chips or interpretability, it’s actually just politics.