cousin_it
Yeah, I also think humans-as-housecats is a pretty good scenario. But I’m not sure it’s an optimum (even a local one). Consider this: the question “how can humans have true agency and other things they value, when ASIs are around” is itself a question that intelligence can answer. As one extreme point, consider an ASI that precommits to not interfering in the affairs of humans, except to stop other ASIs. That’s clearly not optimal on other dimensions; okay, so turn the dial until you get a pivotal act that’s optimal on the mix of dimensions we care about.
A world of competing human emulations is a world I would actually want to live in
I think there’s a huge danger of people running private servers full of emulations and doing anything they want to them, undetectably. Desire for power over others is a very real thing, in some people at least. Maybe the government could prevent it by oversight; but in a modern democracy, a big factor of stability is that people could feasibly rise up and overthrow the government. Emulations on private servers wouldn’t have that power, so I don’t expect governments to stably defend their rights. Those rights will erode over time, shifting toward the interests of those who can actually influence government. In short, this leads to emulation-world being very bad, and I don’t want it.
The same arguments would apply to our world if governments got armies of autonomous drones, for example. Whenever I imagine possible worlds, the distribution of power is the first thing I think about. That makes the problem feel more concrete: it’s very hard to imagine a nice future world that actually works.
Why do you think all poor people will end up in these “wildlife preserves”, and not somewhere else under the power of someone less altruistic? A future of large power differences is… a future of large power differences.
Ah I see, I misunderstood your point. You’re right.
I fail to see the relevance of including, in moral deliberation, the harm that animals inflict upon other animals
It’s wrong to make dogs fight ⇒ it’s right to stop someone else from making dogs fight ⇒ it’s right to spend some resources stopping Nature from making dogs fight. Or at least, I don’t see where this logic fails.
If there’s a small class of people with immense power over billions of have-nothings who can do nothing back, sure, some of the superpowerful will be more than zero altruistic. But others won’t be, and overall I expect callousness and abuse of power to far outweigh altruism. Most people are pretty corruptible by power, especially power over a distinct outgroup, and pretty indifferent to abuses of power happening to that outgroup; all of history shows that. Bigger differences in power will make it worse, if anything.
Good post, and a good discussion to have.
Humans have been exterminating each other, and hunting megafauna to extinction, since before agriculture. Technology certainly made us more efficient at these things, but even with colonialism, the world wars, and factory farming, human-caused suffering is still a tiny blip compared to the history of biological evolution. Annie Dillard mentions, for example, the horrifying parasitism among insects that has been happening under every leaf for millions of years.
That said, I agree that technological progress should lead toward solving suffering. If it leads to filling more planets with the same kind of biological life as today, with the same ever-present suffering, then I’d rather not have it. In the utopia drafts that I write from time to time (but don’t post anywhere), most animal life has been replaced and even plants have been re-engineered not to smother each other.
You’re also right that this kind of good outcome seems very hard to achieve. It needs not just coordination, but global coordination. Otherwise countries that race ahead in technology will win out and keep remaking the world in their selfish image, with factory farming 2.0 and all that.
Things I’m pretty sure about: that your possibility 1 is much more likely than 2. That extrapolation is more like resolving internal conflicts in a set of values, not making them change direction altogether. That the only way for a set of values to extrapolate to “good” is if its starting percentage of “good” is high enough to win out.
Things I believe, but with less confidence: that individual desires will often extrapolate to a pretty nasty kind of selfishness (“power corrupts”). That starting from culture also has lots of dangers (like the wokeness or religion that you’re worried about), but a lot of it has been selected in a good direction for a long time, precisely to counteract the selfishness of individuals. So the starting percentage of good in culture might be higher.
Yeah. I think the main reason democracy exists at all is that people are necessary for skilled labor and for fighting wars. If that goes away, the result will be a world where money and power simply discard most people. Why some people think “oh, we’ll implement UBI if it comes to that”, I have no idea. When it “comes to that”, there won’t be any force both powerful enough to implement UBI and interested in doing so. My cynical view is that the promise of distributing AI benefits in the future is a distraction: look over there, while we take all power.
Is that you in the photo? Why is your left hand a claw?
Thank you for writing this! I have a question though. The post says “many cases” and so on. Can we get some estimate of how many people are affected now, and whether that number is growing or shrinking?
I’m not very confident, but will try to explain where the intuition comes from.
Basically I think the idea of “good” might be completely cultural. As in: if you extrapolate what an individual wants, you basically get a world optimized for that individual’s selfishness. Then there’s what groups can agree on by rational negotiation, which is a kind of group selfishness that cuts out everyone too weak to be at the negotiating table (so factory farming would be fine, for example, because animals can’t fight back). And on top of that there’s the abstract idea of “good”, which says you shouldn’t hurt the weak at all. That idea is not necessitated by rational negotiation. It’s just a cultural artifact that we ended up with, I’m not sure how.
So if you ask AI to optimize for what individuals want, and go through negotiations and such, there seems a high chance that the resulting world won’t contain “good” at all, only what I called group selfishness. Even if we start with individuals who strongly believe in the cultural idea of good, they can still get corrupted by power. The only way to get “good” is to point AI at the cultural idea to begin with.
You are of course right that culture also contains a lot of nasty stuff. The only way to get something good out of it is with a bunch of extrapolation, philosophy, and yeah I don’t know what else. It’s not reliable. But the starting materials for “good” are contained only there. Hope that makes sense.
Also to your other question: how to train philosophical ability? I think yeah, there isn’t any reliable reward signal, just as there wasn’t for us. The way our philosophical ability seems to work is by learning heuristics and ways of reasoning from fields where verification is possible (like math, or everyday common sense) and applying them to philosophy. And it’s very unreliable of course. So for AIs maybe this kind of carry-over to philosophy is also the best we can hope for.
It’s complicated.
First, I think there’s enough overlap between different reasoning skills that we should expect a smarter than human AI to be really good at most such skills, including philosophy. So this part is ok.
Second, I don’t think philosophical skill alone is enough to figure out the right morality. For example, let’s say you like apples but don’t like oranges. Then when choosing between philosophical theory X, which says apples are better than oranges, and theory Y, which says the opposite, you’ll use the pre-theoretic intuition as a tiebreaker. And I think when humans do moral philosophy, they often do exactly that: they fall back on pre-theoretic intuitions to check what’s palatable and what isn’t. It’s a tree with many choices, and even big questions like consequentialism vs deontology vs virtue ethics may ultimately depend on many such case-by-case intuitions, not just pure philosophical reasoning.
Third, I think morality is part of culture. It didn’t come from the nature of an individual person: kids are often cruel. It came from constraints that people put on each other, and cultural generalization of these constraints. “Don’t kill.” When someone gets powerful enough to ignore these constraints, the default outcome we should expect is amorality. “Power corrupts.” Though of course there can be exceptions.
Fourth—and this is the payoff—I think the only good outcome is if the first smarter than human AIs start out with “good” culture, derived from what human societies think is good. Not aligned to an individual human operator, and certainly not to money and power. Then AIs can take it from there and we’ll be ok. But I don’t know how to achieve that. It might require human organizational forms that are not money- or power-seeking. I wrote a question about it sometime ago, but didn’t get any answers.
All good points, and I wanted to reply with some of them myself, so thanks. But there’s also another point where I might disagree more with LW folks (including you and Carl and maybe even Wei): I no longer believe that a technological whoopsie is the main risk. I think we have enough geniuses working on the problem that a technological whoopsie probably won’t happen. The main risk to me now is that AI gets pretty well aligned to money and power, and then money and power throw most humans by the wayside. I’ve mentioned this many times; the cleanest formulation is probably in this book review.
In that light, Redwood and others are just making better tools for money and power, to help align AI to their ends. Export controls are a tool of international conflict: if they happen, they happen as part of a package of measures which basically intensify the arms race. And even the CAIS letter is now looking to me like a bit of a PR move, where Altman and others got to say they cared about risk and then went on increasing risk anyway. Not to mention the other things done by safety-conscious money, like starting OpenAI and Anthropic. You could say the biggest things that safety-conscious money achieved were basically enabling stuff that money and power wanted. So the endgame wouldn’t be some kind of war between humans and AI: it would be AI simply joining up with money and power, and cutting out everyone else.
I think a lot of rationalists accepted these Molochian offers (“build the Torment Nexus before others do it”, “invest in the Torment Nexus and spend the proceeds on Torment Nexus safety”) and the net result is simply that the Nexus is getting built earlier, with most safety work ending up as enabling capabilities or safetywashing. The rewards promised by Moloch have a way of receding into the future as the arms race expands, while the harms are already here and growing.
Wow. There’s a very “room where it happens” vibe about this post. Lots of consequential people mentioned, and showing up in the comments. And it’s making me feel like...
Like, there was this discussion club online, ok? Full of people who seemed to talk about interesting things. So I started posting there too, did a little bit of math, got invited to one or two events. There was a bit of money floating around too. But I always stayed a bit at arm’s length, was a bit less sharp than the central folks, less smart, less quick to jump on opportunities.
And now that folks from the same circle essentially ended up doing this huge consequential thing—the whole AI thing I mean, not just Anthropic—and many got rich in the process… the main feeling in my mind isn’t envy, but relief. That my being a bit dull, lazy and distant saved me from being part of something very ugly. This huge wheel of history crushing the human form, and I almost ended up pushing it along, but didn’t.
Or as Mike Monteiro put it:
Tech, which has always made progress in astounding leaps and bounds, is just speedrunning the cycle faster than any industry we’ve seen before. It’s gone from good vibes, to a real thing, to unicorns, to let’s build the Torment Nexus in record time. All in my lifetime...
I was lucky (by which I mean old) to enter this field when I felt, for my own peculiar reasons, that it was at its most interesting. And as it went through each phase, it got less and less interesting to me, to the point where I have little desire to interact too much with it now… In fact, when I think about all the folks I used to work on web shit with and what they’re currently doing, the majority are now woodworkers, ceramicists, knitters, painters, writers, etc. People who make things tend to move on when there’s nothing left to make. Nothing to make but the Torment Nexus.
Great writing. And yeah, the human-cat relationship is indeed one of the better ways that an AI-human relationship could turn out.
It’s not quite perfect though. We neuter cats. In the Middle Ages, people burned live cats for fun.
I remember you from the Pugs days. Two questions about this presentation. One is more aspirational: do you think of this society of AIs as more egalitarian (many superhuman AIs at roughly the same level) or more hierarchical (a range of AI sizes, with the largest hopefully being the most aligned to those below)? The other is more practical. Right now the AI market is locked in something like an arms race, with everyone scrambling to make AIs that will bring commercial profit. That can lead to nasty incentives: e.g. an AI working for a tax software company could help it lobby the government to keep tax filing difficult, and of course much worse things can be imagined. If this continues, the nice vision of kami and so on will simply fail to come into existence. What is to be done, in your opinion?
Gonna be full of lies about living people, including billionaires, celebrities and politicians. Takedown in 3...2...1...
I agree with others about the fawning. A more “hardball” question I’d ask is: why not the left? It feels like at some point a choice was made to build a libertarian-leaning techie community, and that choice backfired: rationalists and adjacent folks ended up playing a big role in building and investing in AI. Maybe a more left-leaning movement, focused on protest and the like, would make more sense now?