When your terminal goal is suffering, no amount of alignment will lead to a good future.
andrew sauer
The public at large will certainly be unable to distinguish between Friendly and unfriendly AGI, since either would be incentivised to present itself as friendly on a surface-level, and very few people have the ability to distinguish between the two in the presence of competent PR deception.
Antinatalists getting the AI is morally the same as paperclip doom, everyone dies.
Forgive me, I didn’t see the point about nuclear weapons. Could you clarify that?
Re extreme pacifism:
I do think non-consensual mind modification is a pretty authoritarian measure. The MIRI guy is going to have a lot more parameters to set than just “violence bad=500”, and if the AI is willing to modify people’s minds to satisfy that value, why not do that for everything else it believes in? Bad actors can absolutely exploit this capability: if they have a hand in the development of the relevant AI, they can simply mind-control people into believing their ideology.
Or you need to explain why bad outcomes happen when a businessman who doesn’t think about ethics much gets to the AI.
Sure. Long story short, even though the businessman doesn’t care that much, other people do, and will pick up any slack left behind by the businessman or his AI.
Some business guy who doesn’t care much about ethics but doesn’t actively hate anybody gets his values implanted into the AI. He is immediately whisked off to a volcano island with genetically engineered catgirls looking after his every whim or whatever the hell. Now the AI has to figure out what to do with the rest of the world.
It doesn’t just kill everybody else and convert all spare matter into defenses set up around the volcano lair, because the businessman guy is chill and wouldn’t want that. He’s a libertarian and just sorta vaguely figures that everyone else can do their thing as long as it doesn’t interfere with him. The AI quickly destroys all other AI research so that nobody can challenge its power and potentially mess with its master. Now that its primary goal is done with, it has to decide what to do with everything else.
It doesn’t just stop interfering altogether, since then AI research could recover. Plus, it figures the business guy has a weak preference for having a big human society around with cool tech and diverse, rich culture, plus lots of nice beautiful ecosystems so that he can go exploring if he ever gets tired of hanging out in his volcano lair all day.
So the AI gives the rest of society a shit ton of advanced technology, including mind uploading and genetic engineering, and becomes largely hands-off other than making sure nobody threatens its power, destroys society, or makes something which would be discomforting to its businessman master, who doesn’t really care that much about ethics anyway. Essentially, it keeps things interesting.
What is this new society like? It probably has pretty much every problem the old society has that doesn’t stem from limited resources or information. Maybe everybody gets a generous UBI and nobody has to work. Of course, nature is still as nasty and brutish as ever, and factory farms keep chugging along, since people have decided they don’t want to eat frankenmeat. There are still lots of psychopaths and fanatics around, both powerless and powerful. Some people decide to use the new tech to spin up simulations in VR to lord over in every awful way you can think of. Victims of crimes upload the perpetrators into hell, and religious people upload people they consider fanatics into hell, assholes do it to people they just don’t like. The businessman doesn’t care, or he doesn’t believe in sentient digital minds, or something else, and it doesn’t disrupt society. Encryption algorithms can hide all this activity, so nobody can stop it except for the AI, which doesn’t really care.
Meanwhile, since the AI doesn’t much care what happens and is fine with a wide range of possible outcomes, political squabbling between all the usual factions, some of them quite distasteful, over which outcomes within this acceptable range should come about continues as usual. People of course debate all the nasty things being done with the new technology, and in the end society decides that technology in the hands of man is bad and should only be used in pursuit of goodness in the eyes of the One True God, whose identity is decided upon after extensive fighting that probably causes quite a lot of suffering itself, but is very interesting from the perspective of someone watching from the outside, not from too close up, like our businessman.
The new theocrats decide they’re going to negotiate with the AI to build the most powerful system for controlling the populace that the AI will let them. The AI decides this is fine as long as they leave a small haven behind with all the old interesting stuff from the interim period. The theocrats begrudgingly agree, and now most of the minds in hell are religious dissidents, just like the One True God says it should be, and a few of the old slaves are left over in the new haven. The wilderness and the farms, of course, remain untouched. Wait a few billion years, and this shit is spread to every corner of the universe.
Is this particular scenario likely? Of course not, it’s far too specific. I’m just using it as a more concrete example to illustrate my points. The main points are:
Humanity has lots of moral pitfalls, any of which will lead to disaster when universally applied and locked-in, and we are unlikely to avoid all of them
Not locking in values immediately, or only locking them in partially, is only a temporary solution, as there will always be actors who seek to lock in whatever is left unspecified by the current system, which by definition cannot be prevented without locking in the values.
By bargaining process, are we talking about humans doing politics in the real world, or about the AI running an “assume all humans had equal weight at the hypothetical platonic negotiating table” algorithm? I was thinking of the latter.
The latter algorithm doesn’t get run unless the people who want it to be run win the real-world political battle over AI takeoff, so I was thinking of the former.
And how much of an “if we were wiser and thought more” adjustment is the AI applying?
I’m not sure it matters. First of all, “wiser” is somewhat of a value judgement anyway, so it can’t be used to avoid making value judgements up front. What is “wisdom” when it comes to determining your morality? It depends on what the “correct” morality is.
And thinking more doesn’t necessarily change anything either. If somebody has an internally consistent value system where they value or don’t care about certain others, they’re not going to change that simply because they think more, any more than a paperclip maximizer will decide to make a utopia instead because it thinks more. The utility function is not up for grabs.
I’ll have to think more about your “extremely pacifist” example. My intuition says that something like this is very unlikely, as the amount of killing, indoctrination, and general societal change required to get there would seem far worse to almost anybody in the current world than the more abstract concept of suffering subroutines or exploiting uploads or designer minds or something like that. It seems like in order to achieve a society like you describe there would have to be some seriously totalitarian behavior, and while it may be justified to avoid the nightmare scenarios, that comes with its own serious and historically attested risk of corruption. It seems like any attempt at this would either leave some serious bad tendencies behind, be co-opted into a “get rid of the hated outgroup because they’re the real sadists” deal by bad actors, or be so strict that it’s basically human extinction anyway, leaving humans unrecognizable, and it doesn’t seem likely that society will go for this route even if it would work. But that’s the part of my argument I’m probably least confident in at the moment.
I think Putin is kind of a weak man here. There are other actors which are competent: if not from the top down, then at least some segments of the people close to power in many of the powers that be are somewhat competent. Some level of competence is required to even remain in power. I think it’s likely that Putin is more incompetent than the average head of state, and he will fall from power at some point before things really start heating up with AI, probably due to the current fiasco. But whether or not that happens doesn’t really matter, because I’m focused more generally on somewhat competent actors who will exist around the time of takeoff, not individual imbeciles like Putin. People like him are not the root of the rot, but a symptom.
Or perhaps corporate actors are a better example than state actors, being able to act faster to take advantage of trends. This is why the people offering AI people jobs may not be so non-evil after all. If the world 50 years from now is owned by some currently unknown enterprising psychopathic CEO, or by the likes of Zuckerberg, that’s not really much better than any of the current powers that be. I apologize for being too focused on tyrannical governments, it was simply because you provided the initial example of Putin. He’s not the only type of evil person in this world, there are others who are more competent and better equipped to take advantage of looming AI takeoff.
Also the whole “break into some research place with guns and demand they do your research for you” example is silly; that’s not how power operates. People with that much power would set up and operate their own research organizations, with systems for ensuring those orgs do what the boss wants. Large companies in the tech sector would be particularly well-equipped to do this, and I don’t think their leaders are the type of cosmopolitan that MIRI types are. Very few people outside the rationalist community itself are, in fact, and I think you’re putting too much stock in the idea that the rationalist community will be the only ones to have any say in AI, even aside from issues of trusting them.
As for the bargaining process, how confident are you that more people want good things than bad things as relates to the far future? For one thing, the bargaining process is not guaranteed to be fair, and almost certainly won’t be. It will greatly favor people with influence over those without, just like every other social bargaining process. There could be minority groups, or groups that get minority power in the bargain, who others generally hate. There are certainly large political movements going in this direction as we speak. And most people don’t care at all about animals, or whatever other kinds of nonhuman consciousness which may be created in the future, and it’s very doubtful any such entities will get any say at all in whatever bargaining process takes place.
The reason my position “devolves into” accepting extinction is that horrific suffering following a singularity seems nearly inevitable. Every society which has yet existed has had horrific problems, and every one of them would be made almost unimaginably worse if it had access to value lock-in or mind uploading. I don’t see any reason to believe that our society today, or whatever it might be in 15-50 years or however long your AGI timeline is, should be the one exception. The problem is far broader than a few specific humans: if only a few people held evil values (or values accepting of evil, which is basically the same given absolute power) at any given time, it would be easy for the rest of society to prevent them from doing harm. You say “maybe we can’t (save our species from extinction) but we have to try.” But my argument isn’t that we can’t; it’s that we maybe can, and the scenarios where we do are worse. My problem with shooting for AI alignment isn’t that it’s “wasting time” or that it’s too hard, it’s that shooting for a utopia is far more likely to lead to a dystopia.
I don’t think my position of accepting extinction is as defeatist or nihilistic as it seems at first glance. At least, not more so than the default “normie” position might be. Every person who isn’t born right before immortality tech needs to accept death, and every species that doesn’t achieve singularity needs to accept extinction.
The way you speak about our ancestors suggests a strange way of thinking about them and their motivations. You speak about past societies, including tribes who managed to escape the ice age, as though they were all motivated by a desire to attain some ultimate end-state of humanity, and that if we don’t shoot for that, we’d be betraying the wishes of everybody who worked so hard to get us here. But those tribesmen who survived the ice age weren’t thinking about the glorious technological future, or conquering the universe, or the fate of humanity tens of thousands of years down the line. They wanted to survive, and to improve life for themselves and their immediate descendants, and to spread whatever cultural values they happened to believe in at the time. That’s not wrong or anything, I’m just saying that’s what people have been mostly motivated by for most of history. Each of our ancestors either succeeded or failed at this, but that’s in the past and there’s nothing we can do about it now.
To speak about what we should do based on what our ancestors would have wanted in the past is to accept the conservative argument that values shouldn’t change just because people fought hard in the past to keep them. What matters going forward is the people now and in the future, because that’s what we have influence over.
One does not have to be “irrational” to make others suffer. One just has to value their suffering, or not care and allow them to suffer for some other reason. There are quite a few tendencies in humanity which would lead to this, among them:
Desire for revenge or “justice” for perceived or real wrongs
Desire to flex power
Sheer sadism
Nostalgia for an old world with lots of suffering
Belief in “the natural order” as an intrinsic good
Exploitation for selfish motives, e.g. sexual exploitation
Belief in the goodness of life no matter how awful the circumstances
Philosophical belief in suffering as a good thing which brings meaning to life
Religious or political doctrine
Others I haven’t thought of right now
First of all, I don’t trust MIRI nerds nor myself with this kind of absolute power. We may not be as susceptible to the ‘hated outgroup’ pitfall but that’s not the only pitfall. For one thing, presumably we’d want to include other people’s values in the equation to avoid being a tyrant, and you’d have to decide exactly when those values are too evil to include. Err on either side of that line and you get awful results. You have to decide exactly which beings you consider sentient, in a high-tech universe. Any mistakes there will result in a horrific future, since there will be at least some sadists actively trying to circumvent your definition of sentience, exploiting the freedom you give them to live as they see fit, which you must give them to avoid a dystopia. The problem of value choice in a world with such extreme potential is not something I trust anybody with, noble as they may be compared to the average person on today’s Earth.
Second, I’m not sure about the scenario you describe where AI is developed by a handful of MIRI nerds without anybody else in broader society or government noticing the potential of the technology and acting to insert their values into it before takeoff. It’s not like the rationalist community are the only people in the world who are concerned about the potential of AI tech. Especially since AI capabilities will continue to improve and show their potential as we get closer to the critical point. As for powerful people like Putin, they may not understand how AI works, but people in their orbit eventually will, and the smarter ones will listen, and use their immense resources to act on it. Besides, people like Putin only exist because there is at least some contingent of people who support him. If AI values are decided upon by some complex social bargaining process including all the powers that be, which seems likely, the values of those people will be represented, and even representing evil values can lead to horrific consequences down the line.
Do you really think you’d be wrong to want death in that case, if there were no hope whatsoever of rescue? Because that’s what we’re talking about in the analogous situation with AGI.
Then I don’t think you understand how bad extreme suffering can get.
Any psychopathic idiot could make you beg to get it over with and kill you using only a set of pliers if they had you helpless. What more could an AGI do?
Well, an AI treating us like we treat pigs is one of the things I’m so worried about. Wouldn’t you be?
Imagine bringing up factory farming as an example to show that what I’m talking about isn’t actually so bad...
Yes. The only other alternative I could see is finding some way to avoid singleton until humanity eventually goes extinct naturally, but I don’t think that’s likely. Advocating against AI would be a reasonable response but I don’t think it will work forever, technology marches on.
Every species goes extinct, and some have already gone extinct by being victims of their own success. The singularity is something which theoretically has the potential to give humanity, and potentially other species, a fate far better or far worse than extinction. I believe that the far worse fate is far more likely given what I know about humanity and our track record with power. Therefore I am against the singularity “elevating” humanity or other species away from extinction, which means I must logically support extinction instead since it is the only alternative.
Edit: People seem to disagree more strongly with this comment than anything else I said, even though it seems to follow logically. I’d like a discussion on this specific point and why people are taking issue with it.
I’ll have to think about the things you say, particularly the part about support and goodwill. I am curious about what you mean by trading with other worlds?
Maybe that’s the biggest difference between me and a lot of people here. You want to maximize the chance of a happy ending. I don’t think a happy ending is coming. This world is horrible and the game is rigged. Most people don’t even want the happy ending you or I would want, at least not for anybody other than themselves, their families, and maybe their nation.
I’m more concerned with making sure the worst of the possibilities never come to pass. If that’s the contribution humanity ends up making to this world, it’s a better contribution than I would have expected anyway.
What do you mean that AI was invented in 12th century France?
And why do you think that locking in values to protect some humans and not others, or humans and not animals, or something like this, is less possible than locking in values to protect all sentient beings? What makes it a “fantasy”?
Or, it could decide that it wants retribution for the perceived or actual wrongs against its creators, and enact punishment upon those the creators dislike.
My problem with that is I think solving “human values” is extremely unlikely for us to do in the way you seem to be describing it, since most people don’t even want to. At best, they just want to be left alone and make sure they and their families and friends aren’t the ones hit hardest. And if we don’t solve this problem, but manage alignment anyway, the results are unimaginably worse than what Clippy would produce.
What about bargaining which only supports those who can demand it in the interim before value lock-in, while humans still have influence? If people in power successfully lock their own values into the AGI, the fact that they have no bargaining power after the AI takes over doesn’t matter, since it’s aligned to them. And if that set of values screws over others who lacked bargaining power even before the takeover, it costs the powerful nothing afterward either.
I swear once true mindcrime becomes possible this is how it will happen.