To the average human, controlled AI is just as lethal as ‘misaligned’ AI

A few months ago I posted this understated short piece proposing, in a nutshell, that the average person has at least as much to fear from perfectly controlled advanced AI as from so-called ‘misaligned’ AI. If automation can emerge that defeats all humans’ defenses on its own whim, even despite its developers’ best efforts to prevent this, then automation that merely assists a small group of humans in defeating the rest of humanity’s defenses seems to me a technically easier milestone, since it faces no hurdle of subverting its own makers’ intentions. Willing human participation in automation-enabled mass killing is being improperly relegated, I attempted to suggest, to manageable “falling into the wrong hands” edge cases, particularly because the possibility has a self-fulfilling dynamic: if there might exist even one clandestine group that wants to, and could, attain the means to ‘take out’ most of the human population, it would be rational for anyone wishing to survive such a purge to initiate it themselves. Thus the existence of many groups with a reason to ‘take out’ most of the human population is guaranteed by the emergence of a widely distributable, low-side-effect mass-killing technology like AI.

I received some insightful responses, for which I’m grateful. But I perceived the post as being mostly ignored. Granted, it was not well-written. Nevertheless, the basic idea was there, and no comments were offered that I felt satisfactorily put it to rest. Given that AI ‘misalignment’ is a favorite topic of this community, a claim about an AI risk that is just as catastrophic and more likely might be expected to be taken up enthusiastically, no matter how inelegantly it is presented.

To be fair, there is no actual inconsistency here. LW is not an AI risk community. It’s OK to be interested in ‘alignment’ because ‘alignment’ is interesting, and to withhold engagement from adjacent problems one finds less interesting. What I’ve found when discussing this topic with other people, though, however clever they are, is a visceral resistance to normalizing mass killing. It is more comfortable to treat it as deviant behavior, not something that could be predicted of reasonable, rational people (including, of course, oneself) given the right circumstances. This community, despite its recognition of the possible correctness of the Orthogonality Thesis, which implies that the intentions of intelligent artificial agents or aliens cannot be trusted, seems to me to place faith in the benevolent tendencies of intelligent human agents. But, consequences aside, blind faith clashes with its explicitly stated aim of seeking to hold true beliefs and to “each day, be less wrong about the world.” I hope my argument is wrong in fundamental ways, not cosmetic ones, and so I hope the Less Wrong community will engage with it and refute it. That invitation is the main thing, but I’ll throw in a modest expansion of the sketchy argument.

A response I received (again, thanks) compared my claim that the existence of mass-killing technology triggers its use to von Neumann’s failed argument for the US to launch a preemptive nuclear strike against the Soviet Union. Von Neumann argued that since a nuclear exchange was inevitable, it would be strategically advantageous to get it out of the way: “if you say today at five o’clock, I say why not one o’clock?” Actual humans rejected von Neumann’s logic, which he based on a game theory model of human behavior, and thereby avoided a horrific outcome.

The situations are, indeed, similar in important ways: in both, the repugnant acts are driven by self-fulfilling beliefs about what others would do, and both involve massive loss of human life. But the mass killing is only one aspect of what would have made the preemptive strike horrific, and we should not rush to generalize to “human benevolence will override game-theoretic rationality to avoid massive loss of human life.” Rather, the nuclear-exchange outcomes that were (and continue to be) avoided are outcomes no one actually wants, demonstrating that human intuition is more self-interestedly rational than the rational actor in von Neumann’s model (Amartya Sen said as much when he observed that the “purely economic man is indeed close to being a social moron”). In contrast, a serious case for intentionally reducing the human population is already being made on its own (let us call it Malthusian) merits, serious enough to demand attempts at anti-Malthusian rebuttal, including from members of this community. One of the pillars of the anti-Malthusian position is simply that it’s wrong to wish for an end-state whose realization requires the death or forced sterilization of billions of people. This is a normative assertion, not a predictive one: if one has to say “it’s wrong to kill or sterilize billions of people,” one has practically already conceded that there may be rational reasons for doing so, if one could, but that one ought not to as a good person, meaning “a person who will hold true to a mutually beneficial agreement not to engage in such behavior.” But in the presence of sufficiently accessible technology that would make an intentional low-pop world feasible, embracing the reasons for bringing it about, seeing them as right, becomes not just a preference but a survival strategy. To overcome this, to prevent defections from the “killing is wrong” regime, a high-pop world would have to be both feasible and clearly, overwhelmingly better than the low-pop one.
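To make the defection dynamic concrete, here is a minimal, stylized two-actor payoff sketch in Python. The numbers are invented purely for illustration; nothing rests on their exact values, only on the ordering the argument assumes: purging first and surviving, then mutual restraint, then mutual purging, then being the target of someone else’s purge.

```python
# Stylized preemption game between two groups, each with access to a
# widely distributable mass-killing technology. Payoff numbers are
# hypothetical; only their ordering matters for the argument.

RESTRAIN, STRIKE = "restrain", "strike"

# payoffs[(my_move, their_move)] = my payoff
payoffs = {
    (RESTRAIN, RESTRAIN): 3,  # mutual restraint: the high-pop world persists
    (STRIKE,   RESTRAIN): 4,  # I initiate the purge and survive it
    (STRIKE,   STRIKE):   1,  # both strike: chaos, but I might survive
    (RESTRAIN, STRIKE):   0,  # I am among those 'taken out'
}

def best_response(their_move):
    """The move that maximizes my payoff, given the other group's move."""
    return max((RESTRAIN, STRIKE), key=lambda mine: payoffs[(mine, their_move)])

# Striking is the best response to either choice, so once the capability
# exists and each side knows the other reasons this way, mutual restraint
# is not stable: the mere possibility of the technology invites its use.
assert best_response(RESTRAIN) == STRIKE
assert best_response(STRIKE) == STRIKE
```

If the high-pop world were clearly better for each group than a world it had purged, the payoff for restraining against a restrainer would exceed that for striking first, and mutual restraint would be a stable equilibrium, which is roughly the condition the paragraph above says must hold.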

Given the current discourse on climate change as an existential threat, it hardly feels necessary to spell out the Malthusian argument. In the absence of any significant technological developments, sober current-trajectory predictions seem to me to range from ‘human extinction’ to ‘catastrophic, but survivable,’ involving violent paths to low-pop (or no-pop) states. A second pillar of anti-Malthusianism is the hope that population and natural-resource consumption will peacefully stabilize; but even in that ‘better’ scenario, things don’t look great. The modern global political economy is relatively humane (in the Pinkerian sense) under conditions of growth, which currently depends on a growing population and rising consumption. Under stagnant or deflationary conditions it can be expected to become more cutthroat, violent, undemocratic, and unjust. So far, high-pop is either suicidal or dystopian.

So how do these bad options compare against a world that has been managed into a low-pop state? A thought experiment: if you could run simulations of transporting a hand-picked crew (of a size of your choosing) from Earth’s present population to an exact replica of Earth, including all of its present man-made infrastructure and acquired knowledge, just not all of its people, what percentage of those simulations would you predict to produce good, even highly enviable, lives for the crew and their descendants? High predictions seem warranted. Present views of many varieties of small-population human life, even without access to science and technology, are favorable; life as a highly adaptable, socially cooperative apex predator can be quite good. With the addition of the accumulated knowledge of agrarian and industrial societies, but not the baggage of their unsustainable growth patterns, it could be very good indeed.

I did just compare the current human trajectory to the low-pop alternative absent significant technological developments, which unfairly excludes what I take to be most secular people’s source of hope about the current trajectory, and the basis of the anti-Malthusian argument that everything will turn out rosy as long as we find a humane way to keep on breeding: in a high-and-growing-pop world, humanity will keep innovating its way around its problems.

But a safe off-ramp from the growth trap counts as a solution produced by the high-pop, innovative world to solve its high-pop problems: specifically, the root-cause problem of high-pop itself. Other solutions only treat individual symptoms, or side-effects of prior treatments, or side-effects of treatments of side-effects, and so on. Energy from cold fusion, if not sucked up by cryptocurrency mining or squashed by the fossil-fuel industry, may help us dodge the climate-change bullet, but doesn’t remove “forever chemicals” from the environment, nor prevent over-fishing of the oceans, nor induce people in developed countries to have more babies (who will exacerbate over-fishing and die of cancer from forever chemicals). Conversely, a low-pop world doesn’t need to care about cold fusion or lab-grown fish meat. What the anti-Malthusians offer from a high-pop world is a vague promise of continued innovation: yes, cancer caused by forever chemicals today, but someday, as long as we don’t stop growing now, a cure for all cancers or even for mortality itself. Even if this astounding claim is theoretically possible, it fails as soon as innovation itself is automated, which is exactly what all AI research is hell-bent on achieving, ‘misalignment’ risks be damned. If and when that is achieved, the technology, too, can be carried into the low-pop world, where it can be more carefully supervised and not wasted on billions of dead-weight, surplus people.

I have proposed human-controlled automated mass-killing technology as more dangerous to the average person than a malevolent artificial superintelligence because it is more task-specific and therefore technically simpler to achieve than general intelligence, because it doesn’t require escaping its own creators’ controls, and because, once developed, there is a race to be the first to put it to use. I am willing to concede that humans’ cooperative powers may suppress the self-triggering dynamic long enough to allow general intelligence to be achieved first, provided the ‘misalignment’ concerns of ‘going all the way’ to AGI appear to have been addressed and the crises of the high-pop world don’t force the issue. With respect to the survival prospects of the average human, this seems to me a minor detail.