A plea for having the courage to morally stigmatize the people working in the AGI industry:
I agree with Nate Soares that we need to show much more courage in publicly sharing our technical judgments about AI risks—based on our understanding of AI, the difficulties of AI alignment, the nature of corporate & geopolitical arms races, the challenges of new technology regulation & treaties, etc.
But we also need to show much more courage in publicly sharing our social and moral judgments about the evils of the real-life, flesh-and-blood people who are driving these AI risks—specifically, the people leading the AI industry, working in it, funding it, lobbying for it, and defending it on social media.
Sharing our technical concerns about these abstract risks isn’t enough. We also have to morally stigmatize the specific groups of people imposing these risks on all of us.
We need the moral courage to label other people evil when they’re doing evil things.
If we don’t do this, we look like hypocrites who don’t really believe that AGI/ASI would be dangerous.
Moral psychology teaches us that moral judgments are typically attached not just to specific actions, or to emergent social forces (e.g. the ‘Moloch’ of runaway competition), or to sad Pareto-inferior outcomes of game-theoretic dilemmas, but to people. We judge people. As moral agents. Yes, even including AI researchers and devs.
If we want to make credible claims that ‘the AGI industry is recklessly imposing extinction risks on all of our kids’, and we’re not willing to take the next step of saying ‘and also, the people working on AGI are reckless and evil and wrong and should be criticized, stigmatized, ostracized, and punished’, then nobody will take us seriously.
As any parent knows, if some Bad Guy threatens your kids, you defend your kids and you denounce the Bad Guy. Your natural instinct is to rally social support to punish them. This is basic social primate parenting, of the sort that’s been protecting kids in social groups for tens of millions of years.
If you don’t bother to rally morally outraged support against those who threaten kids, then the threat wasn’t real. This is how normal people think. And rightfully so.
So why don’t we have the guts to vilify the AGI leaders, devs, investors, and apologists, if we’re so concerned about AGI risk?
Because too many rationalists, EAs, tech enthusiasts, LessWrong people, etc. still see those AI guys as ‘in our tribe’, based on sharing certain traits we hold dear—high IQ, high openness, high decoupling, Aspy systematizing, Bay Area Rationalist-adjacent, etc. You might know some of the people working at OpenAI, Anthropic, DeepMind, etc. -- they might be your friends, housemates, neighbors, relatives, old school chums, etc.
But if you take seriously their determination to build AGI/ASI—or even to work in ‘AI safety’ at those companies, doing their performative safety-washing and PR—then they are not the good guys.
We have to denounce them as the Bad Guys. As traitors to our species. And then, later, once they’ve experienced the most intense moral shame they’ve ever felt, and gone through a few months of the worst existential despair they’ve ever felt, and they’ve suffered the worst social ostracism they’ve ever experienced, we need to offer them a path towards redemption—by blowing the whistle on their former employers, telling the public what they’ve seen on the inside of the AI industry, and joining the fight against ASI.
This isn’t ‘playing dirty’ or ‘giving in to our worst social instincts’. On the contrary. Moral stigmatization and ostracism of evil-doers is how social primate groups have enforced cooperation norms for millions of years. It’s what keeps the peace, and supports good social norms, and protects the group. If we’re not willing to use the moral adaptations that evolved specifically to protect our social groups from internal and external threats, then we’re not really taking those threats seriously.
PS I outlined this ‘moral backlash’ strategy for slowing reckless AI development in this EA Forum post
Sharing our technical concerns about these abstract risks isn’t enough. We also have to morally stigmatize
I’m with you up until here; this isn’t just a technical debate, it’s a moral and social and political conflict with high stakes, and good and bad actions.
the specific groups of people imposing these risks on all of us.
To be really nitpicky, I technically agree with this as stated: we should stigmatize groups as such, e.g. “the AGI capabilities research community” is evil.
We need the moral courage to label other people evil when they’re doing evil things.
Oops, this is partially but importantly WRONG. From Braχot 10a:
With regard to the statement of Rabbi Yehuda, son of Rabbi Shimon ben Pazi, that David did not say Halleluya until he saw the downfall of the wicked, the Gemara relates: There were these hooligans in Rabbi Meir’s neighborhood who caused him a great deal of anguish. Rabbi Meir prayed for God to have mercy on them, that they should die. Rabbi Meir’s wife, Berurya, said to him: What is your thinking? On what basis do you pray for the death of these hooligans? Do you base yourself on the verse, as it is written: “Let sins cease from the land” (Psalms 104:35), which you interpret to mean that the world would be better if the wicked were destroyed? But is it written, let sinners cease?” Let sins cease, is written. One should pray for an end to their transgressions, not for the demise of the transgressors themselves.
Not everyone who is doing evil things is evil. Some people are evil. You should hate no more than necessary, but not less than that. You should hate evil, and hate evildoers if necessary, but not if not necessary.
Schmidhuber? Evil. Sutton? Evil. Larry Page? Evil. If, after reflection, you endorse omnicide, you’re evil. Altman? Evil and probably a sociopath.
Up-and-coming research star at an AI lab? Might be evil, might not be. Doing something evil? Yes. Is evil? Maybe, it depends.
Essentializing someone by calling them evil is an escalation of a conflict. You’re closing off lines of communication and gradual change. You’re polarizing things: it’s harder for that one person to make gradual moves in belief space and social space and life-narrative space, and it’s harder for groups to have group negotiations. Sometimes escalation is good and difficult and necessary, but sometimes escalation is really bad! Doing a more complicated subtle thing with more complicated boundaries is more difficult. And more brave, if we’re debating bravery here.
So:

- Good: The work you’re doing is evil.
- Good: The goal of this company is among the evilest possible goals ever.
- Good: If you ignore the world-class experts saying that your work might kill everyone, then you are being a disgusting creep and will be responsible for killing everyone.
- Bad: You’re a creep / evil / bad person.
Sidenote:
Because too many rationalists, EAs, tech enthusiasts, LessWrong people, etc still see those AI guys as ‘in our tribe’, based on sharing certain traits we hold dear
I agree that this is an improper motivation for treating some actions with kid gloves, which will lead to incorrect action; and that this is some of what’s actually happening.
I understand your point about labelling industries, actions, and goals as evil, but being cautious about labelling individuals as evil.
But I don’t think it’s compelling.
You wrote ‘You’re closing off lines of communication and gradual change. You’re polarizing things.’
Yes, I am. We’ve had open lines of communication between AI devs and AI safety experts for a decade. We’ve had pleas for gradual change. Mutual respect, and all that. Trying to use normal channels of moral persuasion. Well-intentioned EAs going to work inside the AI companies to try to nudge them in safer directions.
None of that has worked. AI capabilities development is outstripping AI safety developments at an ever-increasing rate. The financial temptations to stay working inside AI companies keep increasing, even as the X risks keep increasing. Timelines are getting shorter.
The right time to ‘polarize things’ is when we still have some moral and social leverage to stop reckless ASI development. The wrong time is after it’s too late.
Altman, Amodei, Hassabis, and Wang are buying people’s souls—paying them hundreds of thousands or millions of dollars a year to work on ASI development, despite most of the workers they supervise knowing that they’re likely increasing extinction risk.
This isn’t just a case of ‘collective evil’ being done by otherwise good people. This is a case of paying people so much that they ignore their ethical qualms about what they’re doing. That makes the evil very individual, and very specific. And I think that’s worth pointing out.
(This rhetoric is not quite my rhetoric, but I want to affirm that I do believe that ~most people working at big AI companies are contributing to the worst atrocity in human history, are doing things that are deontologically prohibited, and are morally responsible for that.)
For one, I think I’m a bit scared of regretting my choices. Like, calling someone evil and then being wrong about it isn’t something where you just get to say “oops, I made a mistake” afterwards; you did meaningfully move to socially ostracize someone, mark them as deeply untrustworthy, and say that good people should remove their power, and you kind of owe them something significant if you get that wrong.
For two, a person who has done evil, versus a person who is evil, are quite different things. I think that it’s sadly not always the case that a person’s character is aligned with a particular behavior of theirs. I think it’s not accurate to think of all the people building the doomsday machines as generically evil people who will do awful things in lots of different contexts; I think there’s a lot of variation in the people and their psychologies and predispositions, and some are screwing up here (almost unforgivably, to be clear) in ways they wouldn’t screw up in different situations.
For two, a person who has done evil, versus a person who is evil, are quite different things. I think that it’s sadly not always the case that a person’s character is aligned with a particular behavior of theirs.
I do think many of the historical people most widely considered to be evil now were similarly not awful in full generality, or even across most contexts. For example, Eichmann, the ops lead for the Holocaust, was apparently a good husband and father, and generally took care not to violate local norms in his life or work. Yet personally I feel quite comfortable describing him as evil, despite “evil” being a fuzzy folk term of the sort which tends to imperfectly/lossily describe any given referent.
I’m not quite sure what I make of this, I’ll take this opportunity to think aloud about it.
I often take a perspective where most people are born a kludgey mess, and then if they work hard they can become something principled and consistent and well-defined. But without that, they don’t have much in the way of persistent beliefs or morals such that they can be called ‘good’ or ‘evil’.
I think of an evil person as someone more like Voldemort in HPMOR, who has reflected on his principles and will be persistently a murdering sociopath, than someone who ended up making horrendous decisions but wouldn’t in a different time and place. I think if you put me under a lot of unexpected political forces and forced me to make high-stakes decisions, I could make bad decisions, but not because I’m a fundamentally bad person.
I do think it makes sense to write people off as bad people, in our civilization. There are people who have poor impulse control, who have poor empathy, who are pathological liars, and who aren’t save-able by any of our current means, and will always end up in jail or hurting people around them. I rarely interact with such people so it’s hard for me to keep this in mind, but I do believe such people exist.
But evil seems a bit stronger than that, it seems a bit more exceptional. Perhaps I would consider SBF an evil person; he seems to me someone who knew he was a sociopath from a young age, and didn’t care about people, and would lie and deceive, was hyper-competent, and I expect that if you release him into society he will robustly continue to do extreme amounts of damage.
Is that who Eichmann was? I haven’t read the classic book on him, but I thought the point of ‘the banality of evil’ was that he seemed quite boring and like many other people? Is it the case that you could replace Eichmann with like >10% of the population and get similar outcomes? 1%? I am not sure if it is accurate to think of that large a chunk of people as ‘evil’, as being the kind of robustly bad people who should probably be thrown in prison for the protection of civilization. My current (superficial) understanding is that Eichmann enacted an atrocity without being someone who would persistently do so in many societies. He had the capacity for great evil, but this was not something he would reliably seek out.
It is possible that somehow thousands of people like SBF and Voldemort have gotten together to work at AI companies; I don’t currently believe that. To be clear, I think that if we believe there are evil people, then it must surely describe some of the people working at big AI companies that are building doomsday machines, who are very resiliently doing so in the face of knowing that they’re hastening the end of humanity, but I don’t currently think it describes most of the people.
This concludes my thinking aloud; I would be quite interested to read more of how your perspective differs, and why.
Ben—your subtext here seems to be that only lower-class violent criminals are truly ‘evil’, whereas very few middle/upper-class white-collar people are truly evil (with a few notable exceptions such as SBF or Voldemort) -- with the implications that the majority of ASI devs can’t possibly be evil in the ways I’ve argued.
I think that doesn’t fit the psychological and criminological research on the substantial overlap between psychopathy and sociopathy, and between violent and non-violent crime.
It also doesn’t fit the standard EA point that a lot of ‘non-evil’ people can get swept up in doing evil collective acts as parts of collectively evil industries, such as slave-trading, factory farming, Big Tobacco, the private prison system, etc. – but that often, the best way to fight such industries is to use moral stigmatization.
You mis-read me on the first point; I said that (something kind of like) ‘lower-class violent criminals’ are sometimes dysfunctional and bad people, but I was distinguishing that from someone more hyper competent and self-aware like SBF or Voldemort; I said that only the latter are evil. (For instance, they’ve hurt orders of magnitude more people.)
(I’m genuinely not sure what research you’re referring to – I expect you are 100x as familiar with the literature as I am, and FWIW I’d be happy to get a pointer or two of things to read.[1])
The standard EA point is to use moral stigmatization? Even if that’s accurate, I’m afraid I no longer have any trust in EAs to do ethics well. As an example that you will be sympathetic to, lots of them have endorsed working at AI companies over the past decade (but many many other examples have persuaded me of this point).
To be clear, I am supportive of moral stigma being associated with working at AI companies. I’ve shown up to multiple protests outside the companies (and I brought my mum!). If you have any particular actions in mind to encourage me to do (I’m probably not doing as much as I could) I’m interested to hear them. Perhaps you could write a guide to how to act when dealing with people in your social scene who work on building doomsday devices in a way that holds a firm moral line while not being socially self-destructive / not immediately blowing up all of your friendships. I do think more actionable advice would be helpful.
I expect it’s the case that crime rates correlate with impulsivity and low IQ, and correlate negatively with wealth. Perhaps you’re saying that psychopathy and sociopathy do not correlate with social class? That sounds plausible. (I’m also not sure what you’re referring to with the violent part; my guess is that violent crimes do correlate with social class.)
Is that who Eichmann was? I haven’t read the classic book on him, but I thought the point of ‘the banality of evil’ was that he seemed quite boring and like many other people? Is it the case that you could replace Eichmann with like >10% of the population and get similar outcomes? 1%? I am not sure if it is accurate to think of that large a chunk of people as ‘evil’, as being the kind of robustly bad people who should probably be thrown in prison for the protection of civilization. My current (superficial) understanding is that Eichmann enacted an atrocity without being someone who would persistently do so in many societies. He had the capacity for great evil, but this was not something he would reliably seek out.
Eichmann was definitely evil. The popular conception of Eichmann as merely an ordinary guy who was “just doing his job” and was “boring” is partly mischaracterization of Arendt’s work, partly her own mistakes (i.e., her characterizations, which are no longer considered accurate by historians).
Arendt also notes very explicitly in the article, Eichmann’s evident pride in having been responsible for so many deaths.[24] She notes also the rhetorical contradiction of this pride with his claim that he will gladly go to the gallows as a warning to all future antisemites, as well as the contradiction between this sentiment and the entire argument of his defense (that he should not be put to death). Eichmann, as Arendt observes, does not experience the dissonance between these evidently contradictory assertions but finds the aptness of his clichés themselves to be—from the perspective of his own inclination—satisfactory substitutes for moral or ethical evaluation. He has no concern about their contradiction. This, coupled with an inability to imagine the perspective [of] others, is the individually psychological expression of what Arendt calls the banality of evil.[15] Careerism without moral evaluation of the consequences of one’s work is the collective or social aspect of this banality of evil.
(Hmm… that last line is rather reminiscent of something, no?)
In her 2011 book Eichmann Before Jerusalem, based largely on the Sassen interviews and Eichmann’s notes made while in exile, Bettina Stangneth argues that Eichmann was an ideologically motivated antisemite and lifelong committed Nazi who intentionally built a persona as a faceless bureaucrat for presentation at the trial.[228] Historians such as Christopher Browning, Deborah Lipstadt, Yaacov Lozowick, and David Cesarani reached a similar conclusion: that Eichmann was not the unthinking bureaucratic functionary that Arendt believed him to be.[229] Historian Barbara W. Tuchman wrote of Eichmann, “The evidence shows him pursuing his job with initiative and enthusiasm that often outdistanced his orders. Such was his zeal that he learned Hebrew and Yiddish the better to deal with the victims.”[230] Concerning the famous characterisation of his banality, Tuchman observed, “Eichmann was an extraordinary, not an ordinary man, whose record is hardly one of the ‘banality’ of evil. …”
I am one of those people who are supposed to be stigmatized/deterred by this action. I doubt this tactic will be effective. This thread (including the disgusting comparison to Eichmann, who directed the killing of millions in the real world—not in some hypothetical future one) does not motivate me to interact with the people holding such positions. Given that much of my extended family was wiped out by the Holocaust, I find these Nazi comparisons abhorrent, and would not look forward to interacting with people making them, whether or not they decide to boycott me.
BTW this is not some original tactic; PETA uses similar approaches for veganism. I don’t think they are very effective either.
To @So8res—I am surprised and disappointed that this Godwin’s law thread survived a moderation policy that is described as “Reign of Terror”
I’ve often appreciated your contributions here, but given the stakes of existential risk, I do think that if my beliefs about risk from AI are even remotely correct, then it’s hard to escape the conclusion that the people presently working at labs are committing the greatest atrocity that anyone in human history has ever committed or will ever commit.
The logic of this does not seem that complicated, and while I disagree with Geoffrey Miller on how he goes about doing things, I have even less sympathy for someone who, when a bunch of people think extremely seriously and carefully about whether what that person is doing might be extremely bad, reacts with “if people making such comparisons decide to ostracize me then I consider it a nice bonus”. You don’t have to agree, but man, I feel like you clearly have the logical pieces to understand why one could believe you are causing extremely great harm, without that implying the insanity of the person believing that.
I respect at least some of the people working at capability labs. One thing that unites all of the ones I do respect is that they treat their role at those labs with the understanding that they are in a position of momentous responsibility, and that them making mistakes could indeed cause historically unprecedented levels of harm. I wish you did the same here.
I edited the original post to make the same point with less sarcasm.
I take risk from AI very seriously which is precisely why I am working in alignment at OpenAI. I am also open to talking with people having different opinions, which is why I try to follow this forum (and also preordered the book). But I do draw the line at people making Nazi comparisons.
FWIW I think radicals often hurt the causes they espouse, whether it is animal rights, climate change, or Palestine. Even if after decades the radicals are perceived to have been on “the right side of history”, their impact was often negative and made that outcome take longer to arrive: David Shor was famously cancelled for making this point in the context of the civil rights movement.
Sorry to hear the conversation was on a difficult topic for you; I imagine that is true for many of the Jewish folks we have around these parts.
FWIW I think we were discussing Eichmann in order to analyze what ‘evil’ is or isn’t, and did not make any direct comparisons between him and anyone.
...oh, now I see that Said’s “Hmm… that last line is rather reminiscent of something, no?” is probably making such a comparison (I couldn’t tell what he meant by it when I read it initially). I can see why you’d respond negatively to that. While there’s a valid point to be made about how people who just try to gain status/power/career-capital without thinking about ethics can do horrendous things, I do not think that it is healthy for discourse to express that in the passive-aggressive way that Said did.
The comparisons invite themselves, frankly. “Careerism without moral evaluation of the consequences of one’s work” is a perfect encapsulation of the attitudes of many of the people who work in frontier AI labs, and I decline to pretend otherwise.
(And I must also say that I find the “Jewish people must not be compared to Nazis” stance to be rather absurd, especially in this sort of case. I’m Jewish myself, and I think that refusing to learn, from that particular historical example, any lessons whatsoever that could possibly ever apply to our own behavior, is morally irresponsible in the extreme.)
EDIT: Although the primary motivation of my comment about Eichmann was indeed to correct the perception of the historians’ consensus, so if you prefer, I can remove the comparison to a separate comment; the rest of the comment stands without that part.
To be clear, I would approve more of a comment that made the comparison overtly[0], rather than one that made it in a subtle way that was harder to notice or that people missed (I did not realize what you were referring to until I tried to puzzle at why boaz had gotten so upset!). I think it is not healthy for people to only realize later that they were compared to Nazis. And I think it fair for them to consider that an underhanded way to cause them social punishment, to do it in a way that was hard to directly respond to. I believe it’s healthier for attacks[1] to be open and clear.
[0] To be clear, there may still be good reasons to not throw in such a jab at this point in the conversation, but my main point is that doing it with subtlety makes it worse, not better, because it also feels sneaky.
[1] “Attacks”, a word which here means “statements that declare someone has a deeply rotten character or whose behavior has violated an important norm, in a way that if widely believed will cause people to punish them”.
(I don’t mean to derail this thread with discussion of discussion norms. Perhaps if we build that “move discourse elsewhere button” that can later be applied back to this thread.)
boazbarak—I don’t understand your implication that my position is ‘radical’.
I have exactly the same view on the magnitude of ASI extinction risk that every leader of a major AI company does—that it’s a significant risk.
The main difference between them and me is that they are willing to push ahead with ASI development despite the significant risk of human extinction, and I think they are utterly evil for doing so, because they’re endangering all of our kids.
In my view, risking extinction for some vague promise of an ASI utopia is the radical position. Protecting us from extinction is a mainstream, commonsense, utterly normal human position.
I consider the following question-cluster to be squarely topical: “Suppose one believes it is evil to advance AI capabilities towards superintelligence, on the grounds that such a superintelligence would quite likely to kill us all. Suppose further that one fails to unapologetically name this perceived evil as ‘evil’, e.g. out of a sense of social discomfort. Is that a failure of courage, in the sense of this post?”
I consider the following question-cluster to be a tangent: “Suppose person X is contributing to a project that I believe will, in the future, cause great harms. Does person X count as ‘evil’? Even if X agrees with me about which outcomes are good and disagrees about the consequences of the project? Even if the harms of the project have not yet occurred? Even if X would not be robustly harmful in other circumstances? What if X thinks they’re trying to nudge the project in a less-bad direction?”
I consider the following sort of question to be sliding into the controversy attractor: “Are people working at AI companies evil?”
(The LW mods told me they’re considering implementing a tool to move discussions to the open thread (so that they may continue without derailing the topical discussions). FYI @habryka: if it existed, I might use it on the tangents, idk. I encourage people to pump against the controversy attractor.)
I agree with you on the categorization of 1 and 2. I think there is a reason why Godwin’s law was created: once threads follow the controversy attractor in this direction, they tend to be unproductive.
I completely agree this discussion should be moved outside your post. But the counterintuitive mechanics of LessWrong mean a derailing discussion may actually increase the visibility and upvotes of your original message (by bumping it in the “recent discussion”).
(It’s probably still bad if it’s high up in the comment section.)
It’s too bad you can only delete comment threads, you can’t move them to the bottom or make them collapsed by default.
I agree that a comparison to Eichmann is not optimal.
Instead, if AI turns out to have consequences so bad that they outweigh the good, it’ll have been better to compare people working at AI labs to Thomas Midgley, who insisted that his leaded gasoline couldn’t be dangerous even when presented with counter-evidence, and Edward Teller, who (as far as I can tell) was simply fascinated by the engineering challenges of scaling hydrogen bombs to levels that could incinerate entire continents.
These two people still embody two archetypes of what could reasonably be called “evil”, but arguably fit better with the psychology of people currently working at AI labs.
These two examples also avoid Godwin’s law type attractors.
That’s interesting to hear that many historians believe he was secretly more ideologically motivated than Arendt thought, and also believe that he portrayed a false face during all of the trials, thanks for the info.
Yes, it takes courage to call people out as evil, because you might be wrong, you might unjustly ruin their lives, you might have mistakenly turned them into scapegoat, etc. Moral stigmatization carries these risks. Always has.
And people understand this. Which is why, if we’re not willing to call the AGI industry leaders and devs evil, then people will see us failing to have the courage of our convictions. They will rightly see that we’re not actually confident enough in our judgments about AI X-risk to take the bold step of pointing fingers and saying ‘WRONG!’.
So, we can hedge our social bets, and try to play nice with the AGI industry, and worry about making such mistakes. Or, we can save humanity.
To be clear, I think it would probably be reasonable for some external body like the UN to attempt to prosecute & imprison ~everyone working at big AI companies for their role in racing to build doomsday machines. (Most people in prison are not evil.) I’m a bit unsure if it makes sense to do things like this retroactively rather than to just outlaw it going forward, but I think it sometimes makes sense to prosecute atrocities after the fact even if there wasn’t a law against it at the time. For instance, my understanding is that the Nuremberg trials set precedents for prosecuting people for war crimes, crimes against humanity, and crimes against peace, even though legally they weren’t crimes at the time that they happened.
I just have genuine uncertainty about the character of many of the people in the big AI companies and I don’t believe they’re all fundamentally rotten people! And I think language is something that can easily get bent out of shape when the stakes are high, and I don’t want to lose my ability to speak and be understood. Consequently I find I care about not falsely calling people’s character/nature evil when what I think is happening is that they are committing an atrocity, which is similar but distinct.
My answer to this is “because framing things in terms of evil makes the situation more mindkilly, isn’t really the right gears, and I think this domain needs clarity-of-thought more than it needs a social-conflict orientation”
(I’m not that confident about that, and don’t super object to other people calling them evil. But I think “they are most likely committing a great atrocity” is pretty non-euphemistic and more true)
I think this domain needs clarity-of-thought more than it needs a social-conflict orientation
Openly calling people evil has some element of “deciding who to be in social conflict with”, but insofar as it has some element of “this is simply an accurate description of the world” then FWIW I want to note that this consideration partially cuts in favor of just plainly stating what counts as evil, even whether specific people have met that bar.
If I thought evil was a more useful gear here I’d be more into it. Like, I think “they are probably committing an atrocity” carves reality at the joints.
I think there are maybe 3 people involved with AGI development who seem like they might be best described as “evil” (one of whom is Sam Altman, who I feel comfortable naming because I think we’ve seen evidence of him doing more mundane, near-term evil, rather than making guesses about their character and motivations).
I think it probably isn’t helpful to think of Eichmann as evil, though again fine to say “he committed atrocities” or even “he did evil.”
As an aside, I notice that I currently feel much more reticent to name individuals about whom there is not some sort of legal/civilizational consensus regarding their character. I think I am a bit worried about contributing to the dehumanization of people who are active players, and a drop in basic standards of decency toward them, even if I were to believe they were evil.
I’m personally against this as matter of principle, and I also don’t think it’ll work.
Moral stigmatizing only works against a captive audience. It doesn’t work against people who can very easily ignore you.
You’re more likely to stop eating meat if a kind understanding vegetarian/vegan talks to you and makes you connect with her story of how she stopped eating meat. You’re more likely to simply ignore a militant one who calls you a murderer.
Moral stigmatizing failed to stop nuclear weapon developers, even though many of them were the same kind of “nerd” as AI researchers.
People see Robert Oppenheimer saying “Now, I am become Death, the destroyer of worlds” as some morally deep stuff. “The scientific community ostracized [Edward] Teller,” not because he was very eager to build bigger bombs (like the hydrogen bomb and his proposed Sundial), but because he made Oppenheimer lose his security clearance by saying bad stuff about him.
Which game do you choose to play? The game of dispassionate discussion, where the truth is on your side? Or the game of Twitter-like motivated reasoning, where your side looks much more low status than the AI lab people, and the status quo is certainly not on your side?
Imagine how badly we’ll lose the argument if people on our side are calling them evil and murderous and they’re talking like a sensible average Joe trying to have a conversation with us.
Moral stigmatization seems to backfire rather than help for militant vegans because signalling hostility is a bad strategy when you’re the underdog going against the mainstream. It’s an extremely big ask for ordinary people to show hostility towards other ordinary people whom no one else is hostile towards. It’s even difficult for ordinary people to be associated with a movement which shows such hostility. Most people just want to move on with their lives.
I think you’re underestimating the power of backlashes to aggressive activism. And I say this, despite the fact just a few minutes ago I was arguing to others that they’re overestimating the power of backlashes.
The most promising path to slowing down AI is government regulation, not individuals ceasing to do AI research.
- Think about animal cruelty. Government regulation has succeeded on this many times. Trying to shame people who work in factory farms into stopping, has never worked, and wise activists don’t even consider doing this.
- Think about paying workers more. Raising the minimum wage works. Shaming companies into feeling guilty doesn’t. Even going on strike doesn’t work as well as minimum wage laws.
  - Despite the fact half of the employees refusing to work is like 10 times more powerful than non-employees holding a sign saying “you’re evil.”
  - Especially a tiny minority of society holding those signs.
- Though then again, moral condemnation is a source of government regulation.
Disclaimer: not an expert, just a guy on the internet.
Strong disagree, but strong upvote because it’s “big if true.” Thank you for proposing a big crazy idea that you believe will work. I’ve done that a number of times, and I’ve been downvoted into the ground without explanation, instead of given any encouraging “here’s why I don’t think this will work, but thank you.”
I’m curious whether you read the longer piece about moral stigmatization that I linked to at EA Forum? It’s here, and it addresses several of your points.
I have a much more positive view about the effectiveness of moral stigmatization, which I think has been at the heart of almost every successful moral progress movement in history. The anti-slavery movement stigmatized slavery. The anti-vivisection movement stigmatized torturing animals for ‘experiments’. The women’s rights movement stigmatized misogyny. The gay rights movement stigmatized homophobia.
After the world wars, biological and chemical weapons were not just regulated, but morally stigmatized. The anti-landmine campaign stigmatized landmines.
Even in the case of nuclear weapons, the anti-nukes peace movement stigmatized the use and spread of nukes, and was important in nuclear non-proliferation, and IMHO played a role in the heroic individual decisions by Arkhipov and others not to use nukes when they could have.
Regulation and treaties aimed to reduce the development, spread, and use of Bad Thing X, without moral stigmatization of Bad Thing X, doesn’t usually work very well. Formal law and informal social norms must typically reinforce each other.
I see no prospect for effective, strongly enforced regulation of ASI development without moral stigmatization of ASI development. This is because, ultimately, ‘regulation’ relies on the coercive power of the state—which relies on agents of the state (e.g. police, military, SWAT teams, special ops teams) being willing to enforce regulations even against people with very strong incentives not to comply. And these agents of the state simply won’t be willing to use government force against ASI devs violating regulations unless these agents already believe that the regulations are righteous and morally compelling.
That’s a very good point, and these examples really change my intuition from “I can’t see this being a good idea” to “this might make sense, this might not, it’s complicated.” And my earlier disagreement mostly came from my intuition.
I still have disagreements, but just to clarify, I now agree your idea deserves more attention than it’s getting.
My remaining disagreement is I think stigmatization only reaches the extreme level of “these people are literally evil and vile,” after the majority of people already agree.
In places in India where the majority of people are already vegetarians, and already feel that eating meat is wrong, the social punishment of meat eaters does seem to deter them.
But in places where most people don’t think eating meat is wrong, prematurely calling meat eaters evil may backfire. This is because you’ve created a “moral-duel” where you force outside observers to either think the meat-eater is the bad guy, or you’re the bad guy (or stupid guy). This “moral-duel” drains the moral standing of both sides.
If you’re near the endgame, and 90% of people already are vegetarians, then this moral-duel will first deplete the meat-eater’s moral standing, and may solidify vegetarianism.
But if you’re at the beginning, when only 1% of people support your movement, you desperately want to invest your support and credibility into further growing your support and credibility, rather than burning it in a moral-duel against the meat-eater majority the way militant vegans did.
Nurturing credibility is especially important for AI Notkilleveryoneism, where the main obstacle is a lack of credibility and “this sounds like science fiction.”
Finally, at least only go after the AI lab CEOs, as they have relatively less moral standing, compared to the rank and file researchers.
E.g. in this quicktake Mikhail Samin appealed to researchers as friends asking them to stop “deferring” to their CEO.
Even for nuclear weapons, biological weapons, chemical weapons, landmines, it was hard to punish scientists researching it. Even for the death penalty, it was hard to punish the firing squad soldiers. It’s easier to stick it to the leaders. In an influential book by early feminist Lady Constance Lytton, she repeatedly described the policemen (who fought the movement) and even prison guards as very good people and focused the blame on the leaders.
PS: I read your post, it was a fascinating read. I agree with the direction of it and I agree the factors you mention are significant, but it might not go quite as far as you describe?
Knight—thanks again for the constructive engagement.
I take your point that if a group is a tiny and obscure minority, and they’re calling the majority view ‘evil’, and trying to stigmatize their behavior, that can backfire.
However, the surveys and polls I’ve seen indicate that the majority of humans already have serious concerns about AI risks, and in some sense are already onboard with ‘AI Notkilleveryoneism’. Many people are under-informed or misinformed in various ways about AI, but convincing the majority of humanity that the AI industry is acting recklessly seems like it’s already pretty close to feasible—if not already accomplished.
I think the real problem here is raising public awareness about how many people are already on team ‘AI Notkilleveryoneism’ rather than team ‘AI accelerationist’. This is a ‘common knowledge’ problem from game theory—the majority needs to know that they’re in the majority, in order to do successful moral stigmatization of the minority (in this case, the AI developers).
Haha you’re right, in another comment I was saying
55% of Americans surveyed agree that “mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” Only 12% disagree.
To be honest, I’m extremely confused. Somehow, AI Notkilleveryoneism… is both a tiny minority and a majority at the same time.
I think the real problem here is raising public awareness about how many people are already on team ‘AI Notkilleveryoneism’ rather than team ‘AI accelerationist’. This is a ‘common knowledge’ problem from game theory—the majority needs to know that they’re in the majority,
That makes sense, it seems to explain things. The median AI expert also has a 5% to 10% chance of extinction, which is huge.
I’m still not in favour of stigmatizing AI developers, especially right now. Whether AI Notkilleveryoneism is a real minority or an imagined minority, if it gets into a moral-duel with AI developers, it will lose status, and it will be harder for it to grow (by convincing people to agree with it, or by convincing people who privately agree to come out of the closet).
People tend to follow “the experts” instead of their very uncertain intuitions about whether something is dangerous. With global warming, the experts were climatologists. With cigarette toxicity, the experts were doctors. But with AI risk, you were saying that,
Thousands of people signed the 2023 CAIS statement on AI risk, including almost every leading AI scientist, AI company CEO, AI researcher, AI safety expert, etc.
It sounds like the expertise people look to when deciding “whether AI risk is serious or sci-fi” comes from leading AI scientists, and even AI company CEOs. Very unfortunately, we may depend on our good relations with them… :(
Moral ostracisation of factory farmers is somewhat ineffective because the vast majority of people are implicated in factory farming. They fund it every day and view eating animals as a key part of their identity.
Calling factory farming murder/torture is calling nearly every member of the public a murderer/torturer. (Which may be true but is unlikely to get them to change their habits)
Calling the race to ASI murder is only calling AI researchers and funders murderers. The general public are not morally implicated and don’t view use of AI as a key part of their identity.
The polling shows that they’re not on board with the pace of AI development, think it poses a significant risk of human extinction, and that they don’t trust the CEOs of AI companies to act responsibly.
That’s a very good point, and I didn’t really analyze the comparison.
I guess maybe meat eating isn’t the best comparison.
The closest comparison might be researchers developing some other technology which maybe 2/3 of people see as a net negative. E.g. nuclear weapons, autonomous weapons, methods for extracting fossil fuel, tobacco, etc.
But no campaign even really tried to stigmatize these researchers. Every single campaign against these technologies has targeted the companies, CEOs, or politicians leading them, without really any attacks on the researchers. Attacking them is sort of untested.
We have to denounce them as the Bad Guys. As traitors to our species. And then, later, once they’ve experienced the most intense moral shame they’ve ever felt, [etc. contd. p.94]
This is self-indulgent, impotent fantasy. Everyone agrees that people hurting children is bad. People are split on whether AGI/ASI is an existential threat.[1] There is no “we” beyond “people who agree with you”. “They” are not going to have anything like the reaction you’re imagining. Your strategy of screaming and screaming and screaming and screaming and screaming and screaming and screaming and screaming is not an effective way of changing anyone’s mind.
Richard—I think you’re just factually wrong that ‘people are split on whether AGI/ASI is an existential threat’.
Thousands of people signed the 2023 CAIS statement on AI risk, including almost every leading AI scientist, AI company CEO, AI researcher, AI safety expert, etc.
There are a few exceptions, such as Yann LeCun. And there are a few AI CEOs, such as Sam Altman, who had previously acknowledged the existential risks, but now downplay them.
But if all the leading figures in the industry—including Altman, Amodei, Hassabis, etc—have publicly and repeatedly acknowledged the existential risks, why would you claim ‘people are split’?
But if all the leading figures in the industry—including Altman, Amodei, Hassabis, etc—have publicly and repeatedly acknowledged the existential risks, why would you claim ‘people are split’?
You just mentioned LeCun and “a few AI CEOs, such as Sam Altman” as exceptions, so it isn’t by any means “all the leading figures”. I would also name Mark Zuckerberg, who has started “Superintelligence Labs” with the aim of “personal superintelligence for everyone”, with nary a mention of how if anyone builds it, everyone dies. Presumably all the talent he’s bought are on board with that.
I also see various figures (no names to hand) pooh-poohing the very idea of ASI at all, or of ASI as existential threat. They may be driven by the bias of dismissing the possibility of anything so disastrous as to make them have to Do Something and miss lunch, but right or wrong, that’s what they say.
And however many there are on each side, I stand by my judgement of the futility of screaming shame at the other side, and of the self-gratifying fantasy about how “they” will react.
I think this is a pretty unhelpful frame. Most people working at an AI lab are somewhere between “person of unremarkable moral character who tells themselves a vague story about how they’re doing good things” and “deeply principled person trying their best to improve the world as best they can”. I think working at an AI lab requires less failure of moral character than, say, working at a tobacco company, for all that the former can have much worse effects on the world.
There are a few people I think it is fair to describe as actively morally bad, and willfully violating deontology—it seems likely to me that this is true of Sam Altman, for instance—but I think “evil” is just not a very helpful word here, will not usefully model the actions of AI lab employees, and will come across as obviously disingenuous to anyone who hears such rhetoric if they actually interact with any of the people you’re denigrating. If you had to be evil to end the world, the world would be a lot safer!
I think it’s fine and good to concentrate moral opprobrium at specific actions people take that are unprincipled or clear violations of deontology—companies going back on commitments, people taking on roles or supporting positions that violate principles they’ve previously expressed, people making cowardly statements that don’t accurately reflect their beliefs for the sake of currying favor. I think it’s also fine and good to try and convince people that what they’re doing is harmful, and that they should quit their jobs or turn whistleblower or otherwise change course. But the mere choice of job title is usually not a deontology violation for these people, because they don’t think it has the harms to the world you think it does! (I think at this point it is probably somewhat of a deontological violation to work in most roles at OpenAI or Meta AI even under typical x-risk-skeptical worldviews, but only one that indicates ethical mediocrity rather than ethical bankruptcy.)
(For context, I work on capabilities at Anthropic, because I think that reduces existential risk on net; I think there’s around a 25% chance that this is a horrible mistake and immensely harmful for the world. I think it’s probably quite bad for the world to work on capabilities at other AI labs.)
But the mere choice of job title is usually not a deontology violation for these people, because they don’t think it has the harms to the world you think it does!
I don’t think this step is locally valid? Or at least, in many situations, I don’t think ignorance of the consequences of your actions absolves you of responsibility for them.
As an example, if you work hard to help elect a politician who you believe was principled and good, and then when they get into office they’re a craven sellout who causes thousands of people to die, you bear some responsibility for it and for cleaning up your mess. As another example, if you work hard at a company and then it turns out the company is a scam and you’ve stolen money from all your customers, you bear some responsibility to clean up the mess and help the people whose lives your work ruined.
Relatedly, it is often the case that the right point to apply liability is when someone takes an action with a lot of downside, regardless of intent. Here are some legal examples a shoggoth gave me of holding people accountable even if they didn’t know the harm they were causing.
- A company can be liable for harm caused by its products even if it followed all safety procedures.
- Employers are held responsible for harms caused by employees acting within the scope of their job.
- Sellers may be liable for false statements that cause harm, even if they made them in good faith.
These examples are a bit different. Anyhow, I think that if you work at a company that builds a doomsday machine, you bear some responsibility for that even if you didn’t know.
Yeah, sorry—I agree that was a bit sloppy of me. I think it is very reasonable to accuse people working at major AI labs of something like negligence / willful ignorance, and I agree that can be a pretty serious moral failing (indeed I think it’s plausibly the primary moral failing of many AI lab employees). My objection is more to the way the parent comment connotes “evil” just from one’s employer leading to bad outcomes, as if those outcomes were the known intent of such employees.
Drake—this seems like special pleading from an AI industry insider.
You wrote ‘I think working at an AI lab requires less failure of moral character than, say, working at a tobacco company, for all that the former can have much worse effects on the world.’
That doesn’t make sense to me. Tobacco kills about 8 million people a year globally. ASI could kill about 8 billion. The main reason that AI lab workers think that their moral character is better than that of tobacco industry workers is that the tobacco industry has already been morally stigmatized over the last several decades—whereas the AI industry has not yet been morally stigmatized in proportion to its likely harms.
Of course, ordinary workers in any harm-imposing industry can always make the argument that they’re good (or at least ethically mediocre) people, that they’re just following orders, trying to feed their families, weren’t aware of the harms, etc.
But that argument does not apply to smart people working in the AI industry—who have mostly already been exposed to the many arguments that AGI/ASI is a uniquely dangerous technology. And their own CEOs have already acknowledged these risks. And yet people continue to work in this industry.
Maybe a few workers at a few AI companies might be having a net positive impact in reducing AI X-risk. Maybe you’re one of the lucky few. Maybe.
The future is much much bigger than 8 billion people. Causing the extinction of humanity is much worse than killing 8 billion people. This really matters a lot for arriving at the right moral conclusions here.
Could you say a bit more about why you view the “extinction >>> 8B” as so important?
I’d have assumed that at your P(extinction), even treating extinction as just 8B deaths still vastly outweighs the possible lives saved from AI medical progress?
I don’t think it’s remotely as obvious then! If you don’t care about future people, then your key priority is to achieve immortality for the current generation, for which I do think building AGI is probably your best bet.
If it were to take 50+ years to build AGI, that would imply most people currently on earth will have died of aging by then, and so you should probably just rush towards AGI if you think that would be less than 50% likely to cause extinction.
People who hold this position are arguing for things like “we should only slow down AI development if for each year of slowing down we would be reducing risk of human extinction by more than 1%”, which is a policy that if acted on consistently would more likely than not cause humanity’s extinction within 100 years (as you would be accepting a minimum of a 1% chance of death each year in exchange for faster AI development).
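Spelled out (a minimal sketch, treating the 1% as an independent, compounding per-year risk):

$$P(\text{no extinction in 100 years}) = 0.99^{100} \approx 0.37, \qquad P(\text{extinction}) \approx 1 - 0.37 = 0.63 > 0.5.$$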
| Time horizon | Share of people alive today who are expected to die* | Rough number of deaths (out of 8.23 billion) |
|---|---|---|
| 10 years (2035) | ≈ 8% | ~0.65 billion |
| 20 years (2045) | ≈ 25% | ~2.0 billion |
| 30 years (2055) | ≈ 49% | ~4.0 billion |
| 40 years (2065) | ≈ 74% | ~6.1 billion |
| 50 years (2075) | ≈ 86% | ~7.1 billion |
By the logic of the future not being bigger than 8 billion people, you should lock in a policy that has a 50% chance of causing human extinction, if it allows current people alive to extend their lifespan by more than ~35 years. I am more doomy than that about AI, in that I assign much more than 50% probability that deploying superintelligence would kill everyone, but it’s definitely a claim that requires a lot more thinking through than the usual “the risk is at least 10% or so”.
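As a minimal back-of-the-envelope sketch of where the ~35-year figure comes from (my rough arithmetic from the table above, which implies an average remaining life expectancy on the order of 30–35 years for people alive today): a policy that kills everyone with probability 0.5 and otherwise extends everyone’s lifespan by X years breaks even, counting only currently alive people, when

$$0.5 \cdot X \gtrsim 0.5 \cdot \bar{L} \quad\Longleftrightarrow\quad X \gtrsim \bar{L} \approx 35 \text{ years},$$

where $\bar{L}$ is the average remaining lifespan lost if the gamble fails.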
Thanks for explaining that, really appreciate it! One thing I notice I’d been assuming: that “8B-only” people would have a policy like “care about the 8B people who are living today, but also the people in say 20 years who’ve been born in the intervening time period.” But that’s basically just a policy of caring about future people! Because there’s not really a difference between “future people at the point that they’ve actually been born” and “future people generally”
I have different intuitions about “causing someone not to be born” versus “waiting for someone to be born, and then killing them”. So I do think that if someone sets in motion today events that reliably end in the human race dying out in 2035, the moral cost of this might be any of:

1. “the people alive in both 2025 and 2035”
2. “everyone alive in 2035”
3. “everyone alive in 2035, plus (perhaps with some discounting) all the kids they would have had, and the kids they would have had...”
according to different sets of intuitions. And actually I guess (1) would be rarest, so even though both (2) and (3) involve “caring about future people” in some sense, I do think they’re important to distinguish. (Caring about “future-present” versus “future-future” people?)
People who hold this position are arguing for things like “we should only slow down AI development if for each year of slowing down we would be reducing risk of human extinction by more than 1%”, which is a policy that if acted on consistently would more likely than not cause humanity’s extinction within 100 years (as you would be accepting a minimum of a 1% chance of death each year in exchange for faster AI development).
If your goal is to maximize the expected fraction of currently alive humans who live for over 1000 years, you shouldn’t in fact ongoingly make gambles that make it more likely than not that everyone dies, unless it turns out that it’s really hard to achieve this without immense risk. Perhaps that is your view: the only (realistic) way to get risk below ~50% is to delay for over 30 years. But this is by no means a consensus perspective among those who are very worried about AI risk.
Separately, other than AI, I don’t expect that we face many tradeoffs between eliminating human control over the future and the probability of currently alive people living for much longer, so after we eat that one, there aren’t further tradeoffs to make. I think you agree with this, but your wording makes it seem as though you think there are ongoing hard tradeoffs that can’t be avoided.
I think that “we should only slow down AI development if for each year of slowing down we would be reducing risk of human extinction by more than 1%” is not a sufficient crux for the (expensive) actions which I most want at current margins, at least if you have my empirical views. I think it is very unlikely (~7%?) that in practice we reach near the level of response (in terms of spending/delaying for misalignment risk reduction) that would be rational given this “1% / year” view and my empirical views, so my empirical views suffice to imply very different actions.
For instance, delaying for ~10 years prior to building wildly superhuman AI (while using controlled AIs at or somewhat below the level of top human experts) seems like it probably makes sense on my views combined with this moral perspective, especially if you can use the controlled AIs to substantially reduce/delay ongoing deaths, which seems plausible. Things like massively investing in safety/alignment work also easily make sense. There are policies that substantially reduce the risk which merely require massive effort (and which don’t particularly delay powerful AI) that we could be applying.
I do think that this policy wouldn’t be on board with the sort of long pause that (e.g.) MIRI often discusses and it does materially alter what look like the best policies (though ultimately I don’t expect to get close to these best policies anyway).
habryka—‘If you don’t care about future people’—but why would any sane person not care at all about future people?
You offer a bunch of speculative math about longevity vs extinction risk.
OK, why not run some actual analysis on which is more likely to promote longevity research: direct biomedical research on longevity, or indirect AI research on AGI in hopes that it somehow, speculatively, solves longevity?
The AI industry is currently spending something on the order of $200 billion a year on research. The biomedical research on longevity, by contrast, is currently far less than $10 billion a year.
If we spent the $200 billion a year on longevity, instead of on AI, do you seriously think that we’d do worse on solving longevity? That’s what I would advocate. And it would involve virtually no extinction risk.
You are reading things into my comments I didn’t say. I of course don’t agree, or consider it reasonable, to “not care about future people”, that’s the whole context of this subthread.
My guess is if one did adopt a position that no future people matter (which again I do not think is a reasonable position), then I think the case for slowing down AI looks a lot worse. Not bad enough to make it an obvious slam dunk that it’s bad, and my guess is that overall, even under that worldview, it would be dumb to rush towards developing AGI like we are currently doing, but it makes the case a lot weaker. There is much less to lose if you do not care about the future.
If we spent the $200 billion a year on longevity, instead of on AI, do you seriously think that we’d do worse on solving longevity? That’s what I would advocate. And it would involve virtually no extinction risk.
My guess is for the purpose of just solving longevity, AGI investment would indeed strongly outperform general biomedical investment. Humanity just isn’t very good at turning money into medical progress on demand like this.
It seems virtuous and good to be clear about which assumptions are load-bearing to my recommended actions. If I didn’t care about the future, I would definitely be advocating for a different mix of policies; it would likely still involve marginal AI slowdown, but my guess is less forcefully, and a bunch of slowdown-related actions would become net bad.
I agree with you that a typical instance of working at an AI lab has worse consequences in expectation than working at a tobacco company, and I think that for a person who shares all your epistemic beliefs to work in a typical role at an AI lab would indeed be a worse failure of moral character than to work at a tobacco company.
I also agree that in many cases people at AI labs have been exposed at least once to arguments which, if they had better epistemics and dedicated more time to thinking about the consequences of their work, could have convinced them that it was bad for the world for them to do such work. And I do think the failure to engage with such arguments and seriously consider them, in situations like these, is a stain on someone’s character! But I think it’s the sort of ethical failure which a majority of humans will make by default, rather than something indicative of remarkably bad morality.
Tobacco kills about 8 million people a year globally. ASI could kill about 8 billion.
I just don’t think this sort of utilitarian calculus makes sense to apply when considering the actions of people who don’t share the object-level beliefs at hand! I think people who worked to promulgate communism in the late 19th century were not unusually evil, for instance.
And I do think the failure to engage with such arguments and seriously consider them, in situations like these, is a stain on someone’s character! But I think it’s the sort of ethical failure which a majority of humans will make by default, rather than something indicative of remarkably bad morality
This also seems locally invalid. Most people in fact don’t make this ethical failure because they don’t work at AI labs, nor do they dedicate their lives to work which has nearly as much power or influence on others as this.
It does seem consistent (and agree with commonsense morality) to say that if you are smart enough to locate the levers of power in the world, and you pursue them, then you have a moral responsibility to make sure you use them right if you do get your hands on them, otherwise we will call you evil and grossly irresponsible.
Oh cool, if we’re deciding it’s now virtuous to ostracize people we don’t like and declare them evil, I have a list of enemies I’d like to go after too. This is a great weapon, and fun to use! (Why did we ever stop using it?) Who else can we persecute? There are several much weaker and more-hated groups we could do to warm up.
Sure. But if an AI company grows an ASI that extinguishes humanity, who is left to sue them? Who is left to prosecute them?
The threat of legal action for criminal negligence is not an effective deterrent if there is no criminal justice system left, because there is no human species left.
PS I outlined this ‘moral backlash’ strategy for slowing reckless AI development in this EA Forum post.
I’m with you up until here; this isn’t just a technical debate, it’s a moral and social and political conflict with high stakes, and good and bad actions.
To be really nitpicky, I technically agree with this as stated: we should stigmatize groups as such, e.g. “the AGI capabilities research community” is evil.
Oops, this is partially but importantly WRONG. From Braχot 10a:
Not everyone who is doing evil things is evil. Some people are evil. You should hate no more than necessary, but not less than that. You should hate evil, and hate evildoers if necessary, but not if not necessary.
Schmidhuber? Evil. Sutton? Evil. Larry Page? Evil. If, after reflection, you endorse omnicide, you’re evil. Altman? Evil and probably a sociopath.
Up-and-coming research star at an AI lab? Might be evil, might not be. Doing something evil? Yes. Is evil? Maybe, it depends.
Essentializing someone by calling them evil is an escalation of a conflict. You’re closing off lines of communication and gradual change. You’re polarizing things: it’s harder for that one person to make gradual moves in belief space and social space and life-narrative space, and it’s harder for groups to have group negotiations. Sometimes escalation is good and difficult and necessary, but sometimes escalation is really bad! Doing a more complicated subtle thing with more complicated boundaries is more difficult. And more brave, if we’re debating bravery here.
So:
Good:
Good:
Good:
Bad:
Sidenote:
I agree that this is an improper motivation for treating some actions with kid gloves, which will lead to incorrect action; and that this is some of what’s actually happening.
TsviBT—thanks for a thoughtful comment.
I understand your point about labelling industries, actions, and goals as evil, but being cautious about labelling individuals as evil.
But I don’t think it’s compelling.
You wrote ‘You’re closing off lines of communication and gradual change. You’re polarizing things.’
Yes, I am. We’ve had open lines of communication between AI devs and AI safety experts for a decade. We’ve had pleas for gradual change. Mutual respect, and all that. Trying to use normal channels of moral persuasion. Well-intentioned EAs going to work inside the AI companies to try to nudge them in safer directions.
None of that has worked. AI capabilities development is outstripping AI safety developments at an ever-increasing rate. The financial temptations to stay working inside AI companies keep increasing, even as the X risks keep increasing. Timelines are getting shorter.
The right time to ‘polarize things’ is when we still have some moral and social leverage to stop reckless ASI development. The wrong time is after it’s too late.
Altman, Amodei, Hassabis, and Wang are buying people’s souls—paying them hundreds of thousands or millions of dollars a year to work on ASI development, despite most of the workers they supervise knowing that they’re likely to be increasing extinction risk.
This isn’t just a case of ‘collective evil’ being done by otherwise good people. This is a case of paying people so much that they ignore their ethical qualms about what they’re doing. That makes the evil very individual, and very specific. And I think that’s worth pointing out.
(This rhetoric is not quite my rhetoric, but I want to affirm that I do believe that ~most people working at big AI companies are contributing to the worst atrocity in human history, are doing things that are deontologically prohibited, and are morally responsible for that.)
Ben—so, we’re saying the same things, but you’re using gentler euphemisms.
I say ‘evil’; you say ‘deontologically prohibited’.
Given the urgency of communicating ASI extinction risks to the public, why is this the time for gentle euphemisms?
For one, I think I’m a bit scared of regretting my choices. Like, calling someone evil and then being wrong about it isn’t something where you just get to say “oops, I made a mistake” afterwards, you did meaningfully move to socially ostracize someone, mark them as deeply untrustworthy, and say that good people should remove their power, and you kind of owe them something significant if you get that wrong.
For two, a person who has done evil and a person who is evil are quite different things. I think that it’s sadly not always the case that a person’s character is aligned with a particular behavior of theirs. I think it’s not accurate to think of all the people building the doomsday machines as generically evil people who will do awful things in lots of different contexts; I think there’s a lot of variation in the people and their psychologies and predispositions, and some are screwing up here (almost unforgivably, to be clear) in ways they wouldn’t screw up in different situations.
I do think many of the historical people most widely considered to be evil now were similarly not awful in full generality, or even across most contexts. For example, Eichmann, the ops lead for the Holocaust, was apparently a good husband and father, and generally took care not to violate local norms in his life or work. Yet personally I feel quite comfortable describing him as evil, despite “evil” being a fuzzy folk term of the sort which tends to imperfectly/lossily describe any given referent.
I’m not quite sure what I make of this, I’ll take this opportunity to think aloud about it.
I often take a perspective where most people are born a kludgey mess, and then if they work hard they can become something principled and consistent and well-defined. But without that, they don’t have much in the way of persistent beliefs or morals such that they can be called ‘good’ or ‘evil’.
I think of an evil person as someone more like Voldemort in HPMOR, who has reflected on his principles and will be persistently a murdering sociopath, than someone who ended up making horrendous decisions but wouldn’t in a different time and place. I think if you put me under a lot of unexpected political forces and forced me to make high-stakes decisions, I could make bad decisions, but not because I’m a fundamentally bad person.
I do think it makes sense to write people off as bad people, in our civilization. There are people who have poor impulse control, who have poor empathy, who are pathological liars, and who aren’t save-able by any of our current means, and will always end up in jail or hurting people around them. I rarely interact with such people so it’s hard for me to keep this in mind, but I do believe such people exist.
But evil seems a bit stronger than that, it seems a bit more exceptional. Perhaps I would consider SBF an evil person; he seems to me someone who knew he was a sociopath from a young age, and didn’t care about people, and would lie and deceive, was hyper-competent, and I expect that if you release him into society he will robustly continue to do extreme amounts of damage.
Is that who Eichmann was? I haven’t read the classic book on him, but I thought the point of ‘the banality of evil’ was that he seemed quite boring and like many other people? Is it the case that you could replace Eichmann with like >10% of the population and get similar outcomes? 1%? I am not sure if it is accurate to think of that large a chunk of people as ‘evil’, as being the kind of robustly bad people who should probably be thrown in prison for the protection of civilization. My current (superficial) understanding is that Eichmann enacted an atrocity without being someone who would persistently do so in many societies. He had the capacity for great evil, but this was not something he would reliably seek out.
It is possible that somehow thousands of people like SBF and Voldemort have gotten together to work at AI companies; I don’t currently believe that. To be clear, I think that if we believe there are evil people, then it must surely describe some of the people working at big AI companies that are building doomsday machines, who are very resiliently doing so in the face of knowing that they’re hastening the end of humanity, but I don’t currently think it describes most of the people.
This concludes my thinking aloud; I would be quite interested to read more of how your perspective differs, and why.
(cf. Are Your Enemies Innately Evil? from the Sequences)
Ben—your subtext here seems to be that only lower-class violent criminals are truly ‘evil’, whereas very few middle/upper-class white-collar people are truly evil (with a few notable exceptions such as SBF or Voldemort) -- with the implications that the majority of ASI devs can’t possibly be evil in the ways I’ve argued.
I think that doesn’t fit the psychological and criminological research on the substantial overlap between psychopathy and sociopathy, and between violent and non-violent crime.
It also doesn’t fit the standard EA point that a lot of ‘non-evil’ people can get swept up in doing evil collective acts as parts of collectively evil industries, such as slave-trading, factory farming, Big Tobacco, the private prison system, etc. - but that often, the best way to fight such industries is to use moral stigmatization.
You mis-read me on the first point; I said that (something kind of like) ‘lower-class violent criminals’ are sometimes dysfunctional and bad people, but I was distinguishing that from someone more hyper competent and self-aware like SBF or Voldemort; I said that only the latter are evil. (For instance, they’ve hurt orders of magnitude more people.)
(I’m genuinely not sure what research you’re referring to – I expect you are 100x as familiar with the literature as I am, and FWIW I’d be happy to get a pointer or two of things to read.[1])
The standard EA point is to use moral stigmatization? Even if that’s accurate, I’m afraid I no longer have any trust in EAs to do ethics well. As an example that you will be sympathetic to, lots of them have endorsed working at AI companies over the past decade (but many many other examples have persuaded me of this point).
To be clear, I am supportive of moral stigma being associated with working at AI companies. I’ve shown up to multiple protests outside the companies (and I brought my mum!). If you have any particular actions in mind to encourage me to do (I’m probably not doing as much as I could) I’m interested to hear them. Perhaps you could write a guide to how to act when dealing with people in your social scene who work on building doomsday devices in a way that holds a firm moral line while not being socially self-destructive / not immediately blowing up all of your friendships. I do think more actionable advice would be helpful.
I expect it’s the case that crime rates correlate with impulsivity, low-IQ, and wealth (negatively). Perhaps you’re saying that psychopathy and sociopathy do not correlate with social class? That sounds plausible. (I’m also not sure what you’re referring to with the violent part, my guess is that violent crimes do correlate with social class.)
Eichmann was definitely evil. The popular conception of Eichmann as merely an ordinary guy who was “just doing his job” and was “boring” is partly mischaracterization of Arendt’s work, partly her own mistakes (i.e., her characterizations, which are no longer considered accurate by historians).
An example of the former:
(Hmm… that last line is rather reminiscent of something, no?)
Concerning the latter:
I am one of those people that are supposed to be stigmatized/deterred by this action. I doubt this tactic will be effective. This thread (including the disgusting comparison to Eichmann who directed the killing of millions in the real world—not in some hypothetical future one) does not motivate me to interact with the people holding such positions. Given that much of my extended family was wiped out by the Holocaust, I find these Nazi comparisons abhorrent, and would not look forward to interacting with people making them whether or not they decide to boycott me.
BTW, this is not some original tactic; PETA uses similar approaches for veganism. I don’t think they are very effective either.
To @So8res—I am surprised and disappointed that this Godwin’s law thread survived a moderation policy that is described as “Reign of Terror”
I’ve often appreciated your contributions here, but given the stakes of existential risk, I do think that if my beliefs about risk from AI are even remotely correct, then it’s hard to escape the conclusion that the people presently working at labs are committing the greatest atrocity that anyone in human history has or will ever commit.
The logic of this does not seem that complicated, and while I disagree with Geoffrey Miller on how he goes about doing things, I have even less sympathy for someone reacting to a bunch of people really thinking extremely seriously and carefully about whether what that person is doing might be extremely bad with “if people making such comparisons decide to ostracize me then I consider it a nice bonus”. You don’t have to agree, but man, I feel like you clearly have the logical pieces to understand why one could believe you are causing extremely great harm, without that implying the insanity of the person believing that.
I respect at least some of the people working at capability labs. One thing that unites all of the ones I do respect is that they treat their role at those labs with the understanding that they are in a position of momentous responsibility, and that them making mistakes could indeed cause historically unprecedented levels of harm. I wish you did the same here.
I edited the original post to make the same point with less sarcasm.
I take risk from AI very seriously which is precisely why I am working in alignment at OpenAI. I am also open to talking with people having different opinions, which is why I try to follow this forum (and also preordered the book). But I do draw the line at people making Nazi comparisons.
FWIW I think radicals often hurt the causes they espouse, whether it is animal rights, climate change, or Palestine. Even if after decades the radicals are perceived to have been on “the right side of history”, their impact was often negative and it caused that to have taken longer: David Shor was famously cancelled for making this point in the context of the civil rights movement.
Sorry to hear the conversation was on a difficult topic for you; I imagine that is true for many of the Jewish folks we have around these parts.
FWIW I think we were discussing Eichmann in order to analyze what ‘evil’ is or isn’t, and did not make any direct comparisons between him and anyone.
...oh, now I see that Said’s “Hmm… that last line is rather reminiscent of something, no?” is probably making such a comparison (I couldn’t tell what he meant by it when I read it initially). I can see why you’d respond negatively to that. While there’s a valid point to be made about how people who just try to gain status/power/career-capital without thinking about ethics can do horrendous things, I do not think that it is healthy for discourse to express that in the passive-aggressive way that Said did.
The comparisons invite themselves, frankly. “Careerism without moral evaluation of the consequences of one’s work” is a perfect encapsulation of the attitudes of many of the people who work in frontier AI labs, and I decline to pretend otherwise.
(And I must also say that I find the “Jewish people must not be compared to Nazis” stance to be rather absurd, especially in this sort of case. I’m Jewish myself, and I think that refusing to learn, from that particular historical example, any lessons whatsoever that could possibly ever apply to our own behavior, is morally irresponsible in the extreme.)
EDIT: Although the primary motivation of my comment about Eichmann was indeed to correct the perception of the historians’ consensus, so if you prefer, I can remove the comparison to a separate comment; the rest of the comment stands without that part.
I agree with your middle paragraph.
To be clear, I would approve more of a comment that made the comparison overtly[0], rather than one that made it in a subtle way that was harder to notice or that people missed (I did not realize what you were referring to until I tried to puzzle at why boaz had gotten so upset!). I think it is not healthy for people to only realize later that they were compared to Nazis. And I think it fair for them to consider that an underhanded way to cause them social punishment, to do it in a way that was hard to directly respond to. I believe it’s healthier for attacks[1] to be open and clear.
[0] To be clear, there may still be good reasons to not throw in such a jab at this point in the conversation, but my main point is that doing it with subtlety makes it worse, not better, because it also feels sneaky.
[1] “Attacks”, a word which here means “statements that declare someone has a deeply rotten character or whose behavior has violated an important norm, in a way that if widely believed will cause people to punish them”.
(I don’t mean to derail this thread with discussion of discussion norms. Perhaps if we build that “move discourse elsewhere button” that can later be applied back to this thread.)
Thank you Ben. I don’t think name calling and comparisons are helpful to a constructive debate, which I am happy to have. Happy 4th!
boazbarak—I don’t understand your implication that my position is ‘radical’.
I have exactly the same view on the magnitude of ASI extinction risk that every leader of a major AI company does—that it’s a significant risk.
The main difference between them and me is that they are willing to push ahead with ASI development despite the significant risk of human extinction, and I think they are utterly evil for doing so, because they’re endangering all of our kids.
In my view, risking extinction for some vague promise of an ASI utopia is the radical position. Protecting us from extinction is a mainstream, commonsense, utterly normal human position.
(From a moderation perspective:
I consider the following question-cluster to be squarely topical: “Suppose one believes it is evil to advance AI capabilities towards superintelligence, on the grounds that such a superintelligence would quite likely kill us all. Suppose further that one fails to unapologetically name this perceived evil as ‘evil’, e.g. out of a sense of social discomfort. Is that a failure of courage, in the sense of this post?”
I consider the following question-cluster to be a tangent: “Suppose person X is contributing to a project that I believe will, in the future, cause great harms. Does person X count as ‘evil’? Even if X agrees with me about which outcomes are good and disagrees about the consequences of the project? Even if the harms of the project have not yet occurred? Even if X would not be robustly harmful in other circumstances? What if X thinks they’re trying to nudge the project in a less-bad direction?”
I consider the following sort of question to be sliding into the controversy attractor: “Are people working at AI companies evil?”
The LW mods told me they’re considering implementing a tool to move discussions to the open thread (so that they may continue without derailing the topical discussions). FYI @habryka: if it existed, I might use it on the tangents, idk. I encourage people to pump against the controversy attractor.)
I agree with you on the categorization of 1 and 2. I think there is a reason why Godwin’s law was created; once threads follow the controversy attractor in this direction, they tend to be unproductive.
I completely agree this discussion should be moved outside your post. But the counterintuitive mechanics of LessWrong mean a derailing discussion may actually increase the visibility and upvotes of your original message (by bumping it in the “recent discussion”).
(It’s probably still bad if it’s high up in the comment section.)
It’s too bad you can only delete comment threads, you can’t move them to the bottom or make them collapsed by default.
The apparent aim of OpenAI (making AGI, even though we don’t know how to do so without killing everyone) is evil.
I agree that a comparison to Eichmann is not optimal.
Instead, if AI turns out to have consequences so bad that they outweigh the good, a better comparison for people working at AI labs would be Thomas Midgley, who insisted that his leaded gasoline couldn’t be dangerous even when presented with counter-evidence, and Edward Teller, who (as far as I can tell) was simply fascinated by the engineering challenges of scaling hydrogen bombs to levels that could incinerate entire continents.
These two people still embody two archetypes of what could reasonably be called “evil”, but arguably fit better with the psychology of people currently working at AI labs.
These two examples also avoid Godwin’s law type attractors.
That’s interesting to hear that many historians believe he was secretly more ideologically motivated than Arendt thought, and also believe that he portrayed a false face during all of the trials, thanks for the info.
Yes, it takes courage to call people out as evil, because you might be wrong, you might unjustly ruin their lives, you might have mistakenly turned them into a scapegoat, etc. Moral stigmatization carries these risks. Always has.
And people understand this. Which is why, if we’re not willing to call the AGI industry leaders and devs evil, then people will see us failing to have the courage of our convictions. They will rightly see that we’re not actually confident enough in our judgments about AI X-risk to take the bold step of pointing fingers and saying ‘WRONG!’.
So, we can hedge our social bets, and try to play nice with the AGI industry, and worry about making such mistakes. Or, we can save humanity.
To be clear, I think it would probably be reasonable for some external body like the UN to attempt to prosecute & imprison ~everyone working at big AI companies for their role in racing to build doomsday machines. (Most people in prison are not evil.) I’m a bit unsure if it makes sense to do things like this retroactively rather than to just outlaw it going forward, but I think it sometime makes sense to prosecute atrocities after the fact even if there wasn’t a law against it at the time. For instance my understanding is that the Nuremberg trials set precedents for prosecuting people for war crimes, crimes against humanity, and crimes against peace, even though legally they weren’t crimes at the time that they happened.
I just have genuine uncertainty about the character of many of the people in the big AI companies and I don’t believe they’re all fundamentally rotten people! And I think language is something that can easily get bent out of shape when the stakes are high, and I don’t want to lose my ability to speak and be understood. Consequently I find I care about not falsely calling people’s character/nature evil when what I think is happening is that they are committing an atrocity, which is similar but distinct.
My answer to this is “because framing things in terms of evil turns the situation more mindkilly, not really the right gears, and I think this domain needs clarity-of-thought more than it needs a social-conflict orientation”
(I’m not that confident about that, and don’t super object to other people calling them evil. But I think “they are most likely committing a great atrocity” is pretty non-euphemistic and more true)
Openly calling people evil has some element of “deciding who to be in social conflict with”, but insofar as it has some element of “this is simply an accurate description of the world” then FWIW I want to note that this consideration partially cuts in favor of just plainly stating what counts as evil, even whether specific people have met that bar.
If I thought evil was a more useful gear here I’d be more into it. Like, I think “they are probably committing an atrocity” carves reality at the joints.
I think there are maybe 3 people involved with AGI development who seem like they might be best described as “evil” (one of whom is Sam Altman, who I feel comfortable naming because I think we’ve seen evidence of him doing nearterm more mundane evil, rather than making guesses about their character and motivations)
I think it probably isn’t helpful to think of Eichmann as evil, though again fine to say “he committed atrocities” or even “he did evil.”
Yeah that makes sense.
As an aside, I notice that I currently feel much more reticent to name individuals who there is not some sort of legal/civilizational consensus about their character. I think I am a bit worried about contributing to the dehumanization of people who are active players, and a drop in basic standards of decency toward them, even if I were to believe they were evil.
I’m personally against this as matter of principle, and I also don’t think it’ll work.
Moral stigmatizing only works against a captive audience. It doesn’t work against people who can very easily ignore you.
You’re more likely to stop eating meat if a kind understanding vegetarian/vegan talks to you and makes you connect with her story of how she stopped eating meat. You’re more likely to simply ignore a militant one who calls you a murderer.
Moral stigmatizing failed to stop nuclear weapon developers, even though many of them were the same kind of “nerd” as AI researchers.
People see Robert Oppenheimer saying “Now, I am become Death, the destroyer of worlds” as some morally deep stuff. “The scientific community ostracized [Edward] Teller,” not because he was very eager to build bigger bombs (like the hydrogen bomb and his proposed Sundial), but because he made Oppenheimer lose his security clearance by saying bad stuff about him.
Which game do you choose to play? The game of dispassionate discussion, where the truth is on your side? Or the game of Twitter-like motivated reasoning, where your side looks much more low status than the AI lab people, and the status quo is certainly not on your side?
Imagine how badly we’ll lose the argument if people on our side are calling them evil and murderous and they’re talking like a sensible average Joe trying to have a conversation with us.
Moral stigmatization seems to backfire rather than help for militant vegans because signalling hostility is a bad strategy when you’re the underdog going against the mainstream. It’s an extremely big ask for ordinary people to show hostility towards other ordinary people who no one else is hostile towards. It’s even difficult for ordinary people to be associated with a movement which shows such hostility. Most people just want to move on with their lives.
I think you’re underestimating the power of backlashes to aggressive activism. And I say this, despite the fact just a few minutes ago I was arguing to others that they’re overestimating the power of backlashes.
The most promising path to slowing down AI is government regulation, not individuals ceasing to do AI research.
- Think about animal cruelty. Government regulation has succeeded on this many times. Trying to shame people who work in factory farms into stopping, has never worked, and wise activists don’t even consider doing this.
- Think about paying workers more. Raising the minimum wage works. Shaming companies into feeling guilty doesn’t. Even going on strike doesn’t work as well as minimum wage laws.
- Despite the fact half of the employees refusing to work is like 10 times more powerful than non-employees holding a sign saying “you’re evil.”
- Especially a tiny minority of society holding those signs
- Though then again, moral condemnation is a source of government regulation.
Disclaimer: not an expert, just a guy on the internet
Strong disagree, but strong upvote because it’s “big if true.” Thank you for proposing a big crazy idea that you believe will work. I’ve done that a number of times, and I’ve been downvoted into the ground without explanation, instead of given any encouraging “here’s why I don’t think this will work, but thank you.”
Hi Knight, thanks for the thoughtful reply.
I’m curious whether you read the longer piece about moral stigmatization that I linked to at EA Forum? It’s here, and it addresses several of your points.
I have a much more positive view about the effectiveness of moral stigmatization, which I think has been at the heart of almost every successful moral progress movement in history. The anti-slavery movement stigmatized slavery. The anti-vivisection movement stigmatized torturing animals for ‘experiments’. The women’s rights movement stigmatized misogyny. The gay rights movement stigmatized homophobia.
After the world wars, biological and chemical weapons were not just regulated, but morally stigmatized. The anti-landmine campaign stigmatized landmines.
Even in the case of nuclear weapons, the anti-nukes peace movement stigmatized the use and spread of nukes, and was important in nuclear non-proliferation, and IMHO played a role in the heroic individual decisions by Arkhipov and others not to use nukes when they could have.
Regulation and treaties aimed at reducing the development, spread, and use of Bad Thing X, without moral stigmatization of Bad Thing X, don’t usually work very well. Formal law and informal social norms must typically reinforce each other.
I see no prospect for effective, strongly enforced regulation of ASI development without moral stigmatization of ASI development. This is because, ultimately, ‘regulation’ relies on the coercive power of the state—which relies on agents of the state (e.g. police, military, SWAT teams, special ops teams) being willing to enforce regulations even against people with very strong incentives not to comply. And these agents of the state simply won’t be willing to use government force against ASI devs violating regulations unless these agents already believe that the regulations are righteous and morally compelling.
That’s a very good point, and these examples really change my intuition from “I can’t see this being a good idea,” to “this might make sense, this might not, it’s complicated.” And my earlier disagreement mostly came from my intuition.
I still have disagreements, but just to clarify, I now agree your idea deserves more attention than it’s getting.
My remaining disagreement is I think stigmatization only reaches the extreme level of “these people are literally evil and vile,” after the majority of people already agree.
In places in India where the majority of people are already vegetarians, and already feel that eating meat is wrong, the social punishment of meat eaters does seem to deter them.
But in places where most people don’t think eating meat is wrong, prematurely calling meat eaters evil may backfire. This is because you’ve created a “moral-duel” where you force outside observers to either think the meat-eater is the bad guy, or you’re the bad guy (or stupid guy). This “moral-duel” drains the moral standing of both sides.
If you’re near the endgame, and 90% of people already are vegetarians, then this moral-duel will first deplete the meat-eater’s moral standing, and may solidify vegetarianism.
But if you’re at the beginning, when only 1% of people support your movement, you desperately want to invest your support and credibility into further growing your support and credibility, rather than burning it in a moral-duel against the meat-eater majority the way militant vegans did.
Nurturing credibility is especially important for AI Notkilleveryoneism, where the main obstacle is a lack of credibility and “this sounds like science fiction.”
Finally, at least only go after the AI lab CEOs, as they have relatively less moral standing, compared to the rank and file researchers.
E.g. in this quicktake Mikhail Samin appealed to researchers as friends asking them to stop “deferring” to their CEO.
Even for nuclear weapons, biological weapons, chemical weapons, landmines, it was hard to punish scientists researching it. Even for the death penalty, it was hard to punish the firing squad soldiers. It’s easier to stick it to the leaders. In an influential book by early feminist Lady Constance Lytton, she repeatedly described the policemen (who fought the movement) and even prison guards as very good people and focused the blame on the leaders.
PS: I read your post, it was a fascinating read. I agree with the direction of it and I agree the factors you mention are significant, but it might not go quite as far as you describe?
Knight—thanks again for the constructive engagement.
I take your point that if a group is a tiny and obscure minority, and they’re calling the majority view ‘evil’, and trying to stigmatize their behavior, that can backfire.
However, the surveys and polls I’ve seen indicate that the majority of humans already have serious concerns about AI risks, and in some sense are already onboard with ‘AI Notkilleveryoneism’. Many people are under-informed or misinformed in various ways about AI, but convincing the majority of humanity that the AI industry is acting recklessly seems like it’s already pretty close to feasible—if not already accomplished.
I think the real problem here is raising public awareness about how many people are already on team ‘AI Notkilleveryoneism’ rather than team ‘AI accelerationist’. This is a ‘common knowledge’ problem from game theory—the majority needs to know that they’re in the majority, in order to do successful moral stigmatization of the minority (in this case, the AI developers).
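(A toy threshold model, purely for illustration of this common-knowledge point: the same private majority produces very different amounts of public stigmatization depending on what people believe about everyone else. Every parameter and name below is an assumption chosen for the sketch, not survey data.)

```python
import random

# Each person privately disapproves of reckless AGI development with probability
# `private_support`, and is willing to say so publicly only if they believe the
# share of people who agree is at least their own (randomly drawn) comfort threshold.

def share_willing_to_stigmatize(private_support: float,
                                believed_support: float,
                                n: int = 100_000,
                                seed: int = 0) -> float:
    rng = random.Random(seed)
    willing = 0
    for _ in range(n):
        privately_agrees = rng.random() < private_support
        personal_threshold = rng.random()  # heterogeneous willingness to stick their neck out
        if privately_agrees and believed_support >= personal_threshold:
            willing += 1
    return willing / n

# The same ~70% private majority, with different beliefs about everyone else:
print(share_willing_to_stigmatize(0.70, believed_support=0.2))  # ~0.14 (pluralistic ignorance)
print(share_willing_to_stigmatize(0.70, believed_support=0.9))  # ~0.63 (closer to common knowledge)
```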
Haha you’re right, in another comment I was saying
To be honest, I’m extremely confused. Somehow, AI Notkilleveryoneism… is both a tiny minority and a majority at the same time.
That makes sense, it seems to explain things. The median AI expert also puts the chance of extinction at 5% to 10%, which is huge.
I’m still not in favour of stigmatizing AI developers, especially right now. Whether AI Notkilleveryoneism is a real minority or an imagined minority, if it gets into a moral-duel with AI developers, it will lose status, and it will be harder for it to grow (by convincing people to agree with it, or by convincing people who privately agree to come out of the closet).
People tend to follow “the experts” instead of their very uncertain intuitions about whether something is dangerous. With global warming, the experts were climatologists. With cigarette toxicity, the experts were doctors. But with AI risk, you were saying that,
It sounds like the expertise people look to when deciding “whether AI risk is serious or sci-fi” comes from leading AI scientists, and even AI company CEOs. Very unfortunately, we may depend on our good relations with them… :(
Moral ostracisation of factory farmers is somewhat ineffective because the vast majority of people are implicated in factory farming. They fund it every day and view eating animals as a key part of their identity.
Calling factory farming murder/torture is calling nearly every member of the public a murderer/torturer. (Which may be true but is unlikely to get them to change their habits)
Calling the race to ASI murder is only calling AI researchers and funders murderers. The general public are not morally implicated and don’t view use of AI as a key part of their identity.
The polling shows that they’re not on board with the pace of AI development, that they think it poses a significant risk of human extinction, and that they don’t trust the CEOs of AI companies to act responsibly.
That’s a very good point, and I didn’t really analyze the comparison.
I guess maybe meat eating isn’t the best comparison.
The closest comparison might be researchers developing some other technology, which maybe 2/3 of people see as a net negative. E.g. nuclear weapons, autonomous weapons, methods for extracting fossil fuel, tobacco, etc.
But no campaign even really tried to stigmatize these researchers. Every single campaign against these technologies has targeted the companies, CEOs, or politicians leading them, without really any attack towards the researchers. Attacking them is sort of untested.
This is self-indulgent, impotent fantasy. Everyone agrees that people hurting children is bad. People are split on whether AGI/ASI is an existential threat.[1] There is no “we” beyond “people who agree with you”. “They” are not going to have anything like the reaction you’re imagining. Your strategy of screaming and screaming and screaming and screaming and screaming and screaming and screaming and screaming is not an effective way of changing anyone’s mind.
Anyone responding “but it IS an existential threat!!” is missing the point.
Richard—I think you’re just factually wrong that ‘people are split on whether AGI/ASI is an existential threat’.
Thousands of people signed the 2023 CAIS statement on AI risk, including almost every leading AI scientist, AI company CEO, AI researcher, AI safety expert, etc.
There are a few exceptions, such as Yann LeCun. And there are a few AI CEOs, such as Sam Altman, who had previously acknowledged the existential risks, but now downplay them.
But if all the leading figures in the industry—including Altman, Amodei, Hassabis, etc—have publicly and repeatedly acknowledged the existential risks, why would you claim ‘people are split’?
You just mentioned LeCun and “a few AI CEOs, such as Sam Altman” as exceptions, so it isn’t by any means “all the leading figures”. I would also name Mark Zuckerberg, who has started “Superintelligence Labs” with the aim of “personal superintelligence for everyone”, with nary a mention of how if anyone builds it, everyone dies. Presumably all the talent he’s bought are on board with that.
I also see various figures (no names to hand) pooh-poohing the very idea of ASI at all, or of ASI as existential threat. They may be driven by the bias of dismissing the possibility of anything so disastrous as to make them have to Do Something and miss lunch, but right or wrong, that’s what they say.
And however many there are on each side, I stand by my judgement of the futility of screaming shame at the other side, and of the self-gratifying fantasy about how “they” will react.
I think this is a pretty unhelpful frame. Most people working at an AI lab are somewhere between “person of unremarkable moral character who tells themselves a vague story about how they’re doing good things” and “deeply principled person trying their best to improve the world as best they can”. I think working at an AI lab requires less failure of moral character than, say, working at a tobacco company, for all that the former can have much worse effects on the world.
There are a few people I think it is fair to describe as actively morally bad, and willfully violating deontology—it seems likely to me that this is true of Sam Altman, for instance—but I think “evil” is just not a very helpful word here, will not usefully model the actions of AI lab employees, and will come across as obviously disingenuous to anyone who hears such rhetoric if they actually interact with any of the people you’re denigrating. If you had to be evil to end the world, the world would be a lot safer!
I think it’s fine and good to concentrate moral opprobrium at specific actions people take that are unprincipled or clear violations of deontology—companies going back on commitments, people taking on roles or supporting positions that violate principles they’ve previously expressed, people making cowardly statements that don’t accurately reflect their beliefs for the sake of currying favor. I think it’s also fine and good to try and convince people that what they’re doing is harmful, and that they should quit their jobs or turn whistleblower or otherwise change course. But the mere choice of job title is usually not a deontology violation for these people, because they don’t think it has the harms to the world you think it does! (I think at this point it is probably somewhat of a deontological violation to work in most roles at OpenAI or Meta AI even under typical x-risk-skeptical worldviews, but only one that indicates ethical mediocrity rather than ethical bankruptcy.)
(For context, I work on capabilities at Anthropic, because I think that reduces existential risk on net; I think there’s around a 25% chance that this is a horrible mistake and immensely harmful for the world. I think it’s probably quite bad for the world to work on capabilities at other AI labs.)
I don’t think this step is locally valid? Or at least, in many situations, I don’t think ignorance of the consequences of your actions absolves you of responsibility for them.
As an example, if you work hard to help elect a politician who you believe was principled and good, and then when they get into office they’re a craven sellout who causes thousands of people to die, you bear some responsibility for it and for cleaning up your mess. As another example, if you work hard at a company and then it turns out the company is a scam and you’ve stolen money from all your customers, you bear some responsibility to clean up the mess and help the people whose lives your work ruined.
Relatedly, it is often the case that the right point to apply liability is when someone takes an action with a lot of downside, regardless of intent. Here are some legal examples a shoggoth gave me of holding people accountable even if they didn’t know the harm they were causing.
A company can be liable for harm caused by its products even if it followed all safety procedures.
Employers are held responsible for harms caused by employees acting within the scope of their job.
Sellers may be liable for false statements that cause harm, even if they made them in good faith.
These examples are a bit different. Anyhow, I think that if you work at a company that builds a doomsday machine, you bear some responsibility for that even if you didn’t know.
Yeah, sorry—I agree that was a bit sloppy of me. I think it is very reasonable to accuse people working at major AI labs of something like negligence / willful ignorance, and I agree that can be a pretty serious moral failing (indeed I think it’s plausibly the primary moral failing of many AI lab employees). My objection is more to the way the parent comment is connoting “evil” just from one’s employer leading to bad outcomes as if those outcomes are the known intent of such employees.
Drake—this seems like special pleading from an AI industry insider.
You wrote ‘I think working at an AI lab requires less failure of moral character than, say, working at a tobacco company, for all that the former can have much worse effects on the world.’
That doesn’t make sense to me. Tobacco kills about 8 million people a year globally. ASI could kill about 8 billion. The main reason that AI lab workers think that their moral character is better than that of tobacco industry workers is that the tobacco industry has already been morally stigmatized over the last several decades—whereas the AI industry has not yet been morally stigmatized in proportion to its likely harms.
Of course, ordinary workers in any harm-imposing industry can always make the argument that they’re good (or at least ethically mediocre) people, that they’re just following orders, trying to feed their families, weren’t aware of the harms, etc.
But that argument does not apply to smart people working in the AI industry—who have mostly already been exposed to the many arguments that AGI/ASI is a uniquely dangerous technology. And their own CEOs have already acknowledged these risks. And yet people continue to work in this industry.
Maybe a few workers at a few AI companies might be having a net positive impact in reducing AI X-risk. Maybe you’re one of the lucky few. Maybe.
The future is much much bigger than 8 billion people. Causing the extinction of humanity is much worse than killing 8 billion people. This really matters a lot for arriving at the right moral conclusions here.
Could you say a bit more about why you view the “extinction >>> 8B” as so important?
I’d have assumed that at your P(extinction), even treating extinction as just 8B deaths still vastly outweighs the possible lives saved from AI medical progress?
I don’t think it’s remotely as obvious then! If you don’t care about future people, then your key priority is to achieve immortality for the current generation, for which I do think building AGI is probably your best bet.
If it were to take 50+ years to build AGI, that would imply most people on earth have died of aging, and so you should have probably just rushed towards AGI if you think that would have been less than 50% likely to cause extinction.
People who hold this position are arguing for things like “we should only slow down AI development if for each year of slowing down we would be reducing risk of human extinction by more than 1%”, which is a policy that if acted on consistently would more likely than not cause humanity’s extinction within 100 years (as you would be accepting a minimum of a 1% chance of death each year in exchange for faster AI development).
Here are ChatGPTs actuarial tables about how long the current population is expected to survive:
By the logic of the future not being bigger than 8 billion people, you should lock in a policy that has a 50% chance of causing human extinction, if it allows current people alive to extend their lifespan by more than ~35 years. I am more doomy than that about AI, in that I assign much more than 50% probability that deploying superintelligence would kill everyone, but it’s definitely a claim that requires a lot more thinking through than the usual “the risk is at least 10% or so”.
Thanks for explaining that, really appreciate it! One thing I notice I’d been assuming: that “8B-only” people would have a policy like “care about the 8B people who are living today, but also the people in say 20 years who’ve been born in the intervening time period.” But that’s basically just a policy of caring about future people! Because there’s not really a difference between “future people at the point that they’ve actually been born” and “future people generally”
I have different intuitions about “causing someone not to be born” versus “waiting for someone to be born, and then killing them”. So I do think that if someone sets in motion today events that reliably end in the human race dying out in 2035, the moral cost of this might be any of:
1. “the people alive in both 2025 and 2035”
2. “everyone alive in 2035”
3. “everyone alive in 2035, plus (perhaps with some discounting) all the kids they would have had, and the kids they would have had...”
according to different sets of intuitions. And actually I guess (1) would be rarest, so even though both (2) and (3) involve “caring about future people” in some sense, I do think they’re important to distinguish. (Caring about “future-present” versus “future-future” people?)
If your goal is to maximize the expected fraction of currently alive humans who live for over 1000 years, you shouldn’t in fact keep making gambles that make it more likely than not that everyone dies, unless it turns out that it’s really hard to achieve this without immense risk. Perhaps that is your view: the only (realistic) way to get risk below ~50% is to delay for over 30 years. But this is by no means a consensus perspective among those who are very worried about AI risk.
Separately, I don’t expect that we face many tradeoffs, other than AI itself, between eliminating human control over the future and the probability of currently alive people living much longer; so after we eat that one, there aren’t further tradeoffs to make. I think you agree with this, but your wording makes it sound as though you think there are ongoing hard tradeoffs that can’t be avoided.
I think that “we should only slow down AI development if for each year of slowing down we would be reducing risk of human extinction by more than 1%” is not, by itself, a crux for the (expensive) actions I most want at current margins, at least if you share my empirical views. I think it is very unlikely (~7%?) that in practice we come anywhere near the level of response (in terms of spending and delaying for misalignment risk reduction) that would be rational given this “1% / year” view and my empirical views, so my empirical views suffice to imply very different actions.
For instance, delaying for ~10 years prior to building wildly superhuman AI (while using controlled AIs at or somewhat below the level of top human experts) seems like it probably makes sense on my empirical views even under this moral perspective, especially if you can use the controlled AIs to substantially reduce or delay ongoing deaths, which seems plausible. Things like massively investing in safety/alignment work also easily make sense. There are policies that substantially reduce the risk which merely require massive effort (and which don’t particularly delay powerful AI) that we could be applying.
I do think that this policy wouldn’t be on board with the sort of long pause that (e.g.) MIRI often discusses, and it does materially alter what the best policies look like (though ultimately I don’t expect us to get close to these best policies anyway).
habryka—‘If you don’t care about future people’—but why would any sane person not care at all about future people?
You offer a bunch of speculative math about longevity vs extinction risk.
OK, why not run some actual analysis on which is more likely to deliver longevity gains: direct biomedical research on longevity, or indirect AI research on AGI in the hope that it somehow, speculatively, solves longevity?
The AI industry is currently spending something on the order of $200 billion a year on research. The biomedical research on longevity, by contrast, is currently far less than $10 billion a year.
If we spent the $200 billion a year on longevity, instead of on AI, do you seriously think that we’d do worse on solving longevity? That’s what I would advocate. And it would involve virtually no extinction risk.
You are reading things into my comments that I didn’t say. I of course don’t endorse, or consider it reasonable, to “not care about future people”; that’s the whole context of this subthread.
My guess is that if one did adopt the position that no future people matter (which, again, I do not think is a reasonable position), then the case for slowing down AI looks a lot worse. Not so much worse that it becomes an obvious slam dunk that slowing down is bad, and my guess is that even under that worldview it would be dumb to rush towards developing AGI the way we currently are, but it makes the case a lot weaker. There is much less to lose if you do not care about the future.
My guess is that, for the purpose of just solving longevity, AGI investment would indeed strongly outperform general biomedical investment. Humanity just isn’t very good at turning money into medical progress on demand like this.
It seems virtuous and good to be clear about which assumptions are load-bearing for my recommended actions. If I didn’t care about the future, I would definitely be advocating for a different mix of policies: it would likely still involve marginal AI slowdown, but my guess is I would push for it less forcefully, and a bunch of slowdown-related actions would become net bad.
I agree with you that a typical instance of working at an AI lab has worse consequences in expectation than working at a tobacco company, and I think that for a person who shares all your epistemic beliefs to work in a typical role at an AI lab would indeed be a worse failure of moral character than to work at a tobacco company.
I also agree that in many cases people at AI labs have been exposed at least once to arguments which, if they had better epistemics and dedicated more time to thinking about the consequences of their work, could have convinced them that it was bad for the world for them to do such work. And I do think the failure to engage with such arguments and seriously consider them, in situations like these, is a stain on someone’s character! But I think it’s the sort of ethical failure which a majority of humans will make by default, rather than something indicative of remarkably bad morality.
I just don’t think this sort of utilitarian calculus makes sense to apply when considering the actions of people who don’t share the object-level beliefs at hand! I think people who worked to promulgate communism in the late 19th century were not unusually evil, for instance.
This also seems locally invalid. Most people in fact don’t make this ethical failure because they don’t work at AI labs, nor do they dedicate their lives to work which has nearly as much power or influence on others as this.
It does seem consistent (and in agreement with commonsense morality) to say that if you are smart enough to locate the levers of power in the world, and you pursue them, then you have a moral responsibility to make sure you use them right if you do get your hands on them; otherwise we will call you evil and grossly irresponsible.
Oh cool, if we’re deciding it’s now virtuous to ostracize people we don’t like and declare them evil, I have a list of enemies I’d like to go after too. This is a great weapon, and fun to use! (Why did we ever stop using it?) Who else can we persecute? There are several much weaker and more-hated groups we could start with to warm up.
Criminal negligence leading to catastrophic consequences is already ostracized and persecuted, because, well, it’s a crime.
Sure. But if an AI company grows an ASI that extinguishes humanity, who is left to sue them? Who is left to prosecute them?
The threat of legal action for criminal negligence is not an effective deterrent if there is no criminal justice system left, because there is no human species left.