Plausibly the result is true for people who only have a superficial familiarity with (or investment in) the topics, but I’ve certainly seen people deeply invested in one camp act strongly dismissive of the other. E.g. Eliezer has on occasion complained about “modern small issues that obviously wouldn’t kill everyone” being something that “derailed AGI notkilleveryoneism”.
Importantly, the conclusion of the above paper is ‘x-risk concerns don’t diminish near-term risk concerns’, and not ‘near-term concerns don’t diminish x-risk concerns.’ There’s no strong reason to assume this relationship is symmetric, and I’d want to see research going the other way around before claiming the converse.
I want to talk a bit about how I receive the kind of thing from Eliezer that you linked to above. There’s something like a fallacy of composition that occurs when talking about ‘the problem with AI.’ Like, if we accidentally make an AI that is racist, that is very bad! If we make an AI that breaks markets, without some other mechanisms in place, that is also very bad! If we make an AI that enables authoritarianism — again — bad!
Fortunately, a sufficiently powerful aligned intelligence wouldn’t do any of those things, and for the ones it did do, it would put mechanisms in place to dissolve the downside. This is ~definitionally true of an aligned ASI. The current solutions to the imminent threat of doom (‘don’t fucking build it’) also address many of these other concerns inherently (or, at least, give you extra time to figure out what to do about them), making the positions truly synergistic, conditional on prioritizing x-risk.
However, the reverse is not true. If someone thinks the ‘real problem with AI’ is one of the short-term issues above, then they’re tempted to come up with Clever Solutions, and even to siphon funding/talent away from, e.g., technical governance, alignment research, and comms (setting aside for now that these directions are complicated and one might reasonably not place much credence in them working out), thus feeding into a different goal entirely (the goal of ‘no, just build it, as long as it doesn’t demonstrate this particular failure mode’). Alignment is (probably) not an elephant you can eat one bite at a time (or, at least, we don’t have reason to believe it is), and trying to eat it one bite at a time via the current paradigm largely does more acceleration than it does safety-ification.
But instead of saying this kind of complicated thing, Eliezer says the true shorthand version, which looks like he’s just calling people with short-term concerns stupid and, most ungenerously, like he’s actually just FINE with the world where we get the racist/economy-breaking/authoritarianism-enabling AI that manifests some awful world for all current and future humans (this is not true; I think I can go out on a limb and say ‘Eliezer wants good things, and racist AIs, mass unemployment with no fundamental structural change or mitigation of what it means to be unemployed, and God-King Sama are all bad things Eliezer doesn’t want’).
I think, historically, coalition-building has been less important (if the Good Story you’re trying to tell is ‘we’re just gonna align the damn thing’ or ‘we’re just gonna die while shouting the truth as loud as possible’), and so saying the short version of the point probably didn’t look especially costly in the near term at the time. Now it’s much more costly, since The Thing We [here meaning ‘the part of the AI safety ecosystem I identify with’, not ‘everyone on LW’ or ‘MIRI’] Are Trying To Do is get a wide range of people to understand that halting development is in their interests, not only because of x-risk (although this is the most important part), but also because it averts/mitigates near-term dystopias. I really hope people start saying the long version, or at least the short version of “Yes, that also matters, and the solution I have for you also goes a long way toward addressing that particular concern.”
(I’ll admit that this explanation is somewhat motivated; I put some probability on an AI winter, and think those worlds still look really bad if we’re not doing anything to mitigate short-term risks and societal upheaval; i.e., “Good luck enforcing your x-risk-mitigating governance regime through economic/political/social transformation.” Fortunately, halting / building the off switch seem to me like great first-line solutions to this kind of problem, and getting the issue into the Overton window, in the ways required for such policies to pass, would create opportunities to more robustly address these short-term risks as they come up.)