Wei Dai comments on Statement on AI Extinction—Signed by AGI Labs, Top Academics, and Many Other Notable Figures

Wei Dai 1 Jun 2023 16:12 UTC
LW: 41 AF: 21
33
Is it just me or is it nuts that a statement this obvious could have gone outside the Overton window, and is now worth celebrating when it finally (re?)enters?

How is it possible to build a superintelligence at acceptable risk while this kind of thing can happen? What if there are other truths important to safely building a superintelligence, that nobody (or very few) acknowledges because they are outside the Overton window?

Now that AI x-risk is finally in the Overton window, what’s your vote for the most important and obviously true statement that is still outside it (i.e., that almost nobody is willing to say or is interested in saying)? Here are my top candidates:
1. Dying of old age, as well as physical and mental deterioration from it, are bad and worth substantial coordinated effort to prevent.
2. It’s possible to make serious irreversible mistakes due to having incorrect answers to important philosophical questions. In fact, this is likely, considering how much confusion and disagreement there is on many philosophical questions that seem obviously important.
- Daniel Kokotajlo 4 Jun 2023 15:09 UTC
  LW: 7 AF: 3
  2
  Why is 1 important? It seems like something we can defer discussion of until after (if ever) alignment is solved, no?
  
  2 is arguably in that category also, though idk.
  - Wei Dai 5 Jun 2023 3:41 UTC
    LW: 29 AF: 13
    14
    > Why is 1 important? It seems like something we can defer discussion of until after (if ever) alignment is solved, no?
    
    If aging were solved, or looked like it would be solved within the next few decades, it would make efforts to stop or slow down AI development less problematic, both practically and ethically. I think some AI accelerationists might be motivated directly by the prospect of dying/deterioration from old age, and/or view lack of interest/progress on that front as a sign of human inadequacy/stagnation (contributing to their antipathy towards humans). At the same time, the fact that pausing AI development has a large cost in lives of current people means that you have to have a high p(doom) or credence in utilitarianism/longtermism to support it (and risk committing a kind of moral atrocity if you turn out to be wrong).
    
    > 2 is arguably in that category also, though idk.
    
    2 is important because as tech/AI capabilities increase, the possibilities to “make serious irreversible mistakes due to having incorrect answers to important philosophical questions” seem to open up exponentially. Some examples:
    
    - premature value lock-in
    - value drift
    - handing over too much control/resources to alien/unaligned agents due to negotiation mistakes
    - mistakes related to commitment races
    - the process of creating/aligning AI might be unethical or create a costly obligation
    - failure to prevent mindcrime inside AIs
    - intentionally doing horrible things at astronomical scale due to having wrong values/philosophies
    
    If your point is that we could delegate solving these problems to aligned AI once we have it, my worry is that AI, including aligned AI, will be much better at creating new philosophical problems (opportunities to make mistakes) than at solving them. The task of reducing this risk (e.g., by solving metaphilosophy or otherwise making sure AIs’ philosophical abilities keep up with or outpace their other intellectual abilities) seems super neglected, in part because very few people seem to acknowledge the importance of avoiding errors like the ones listed above.
    
    (BTW I was surprised to see your skepticism about 2, since it feels like I’ve been talking about it on LW like a broken record, and I don’t recall seeing any objections from you before. Would be curious to know if anything I said above is new to you, or you’ve seen me say similar things before but weren’t convinced.)
    - Daniel Kokotajlo 5 Jun 2023 4:02 UTC
      LW: 5 AF: 3
      0
      Something like 2% of people die every year right? So even if we ignore the value of future people and all sorts of other concerns and just focus on whether currently living people get to live or die, it would be worth delaying a year if we could thereby decrease p(doom) by 2 percentage points. My p(doom) is currently 70% so it is very easy to achieve that. Even at 10% p(doom), which I consider to be unreasonably low, it would probably be worth delaying a few years.
      
      Re: 2: Yeah I basically agree. I’m just not as confident as you are I guess. Like, maybe the answers to the problems you describe are fairly objective, fairly easy for smart AIs to see, and so all we need to do is make smart AIs that are honest and then proceed cautiously and ask them the right questions. I’m not confident in this skepticism and could imagine becoming much more convinced simply by thinking or hearing about the topic more.
      - Wei Dai 5 Jun 2023 22:03 UTC
        LW: 6 AF: 5
        2
        
        > Even at 10% p(doom), which I consider to be unreasonably low, it would probably be worth delaying a few years.
        
        Someone with 10% p(doom) may worry that if they got into a coalition with others to delay AI, they couldn’t control the delay precisely, and it could easily become more than a few years. Maybe it would be better not to take that risk, from their perspective.
        
        And lots of people have p(doom)<10%. Scott Aaronson just gave 2% for example, and he’s probably taken AI risk more seriously than most (currently working on AI safety at OpenAI), so probably the median p(doom) (or effective p(doom) for people who haven’t thought about it explicitly) among the whole population is even lower.
        
        > I’m just not as confident as you are I guess. Like, maybe the answers to the problems you describe are fairly objective, fairly easy for smart AIs to see, and so all we need to do is make smart AIs that are honest and then proceed cautiously and ask them the right questions.
        
        I think I’ve tried to take into account uncertainties like this. It seems that in order for my position (that the topic is important and too neglected) to be wrong, one has to reach high confidence that these kinds of problems will be easy for AIs (or humans or AI-human teams) to solve, and I don’t see how that kind of conclusion could be reached today. I do have some specific arguments for why the AIs we’ll build may be bad at philosophy, but I think those are not very strong arguments so I’m mostly relying on a prior that says we should be worried about and thinking about this until we see good reasons not to. (It seems hard to have strong arguments either way today, given our current state of knowledge about metaphilosophy and future AIs.)
        
        Another argument for my position is that humans have already created a bunch of opportunities for ourselves to make serious philosophical mistakes, such as around nuclear weapons, farmed animals, and AI, and we can’t solve those problems by just asking smart honest humans the right questions, as there is a lot of disagreement between philosophers on many important questions.
        
        > I’m not confident in this skepticism and could imagine becoming much more convinced simply by thinking or hearing about the topic more.
        
        What’s stopping you from doing this, if anything? (BTW, beyond the general societal level of neglect, I’m especially puzzled by the lack of interest/engagement on this topic from the many people in EA with formal philosophy backgrounds. If you’re already interested in AI and x-risks and philosophy, how is this not an obvious topic to work on or think about?)
        - Daniel Kokotajlo 5 Jun 2023 23:14 UTC
          LW: 2 AF: 2
          0
          I guess I just think it’s pretty unreasonable to have p(doom) of 10% or less at this point, if you are familiar with the field, timelines, etc.

          I totally agree the topic is important and neglected. I only said “arguably” deferrable; I have less than 50% credence that it is deferrable. As for why I’m not working on it myself, well, aaaah I’m busy idk what to do aaaaaaah! There’s a lot going on that seems important. I think I’ve gotten wrapped up in more OAI-specific things since coming to OpenAI, and maybe that’s bad & I should be stepping back and trying to go where I’m most needed even if that means leaving OpenAI. But yeah. I’m open to being convinced!
          - Wei Dai 6 Jun 2023 0:11 UTC
            LW: 4 AF: 3
            0
            I guess part of the problem is that the people who are currently most receptive to my message are already deeply enmeshed in other x-risk work, and I don’t know how to reach others for whom the message might be helpful (such as academic philosophers just starting to think about AI?). If on reflection you think it would be worth spending some of your time on this, one particularly useful thing might be to do some sort of outreach/field-building, like writing a post or paper describing the problem, presenting it at conferences, and otherwise attracting more attention to it.

            (One worry I have about this is, if someone is just starting to think about AI at this late stage, maybe their thinking process just isn’t very good, and I don’t want them to be working on this topic! But then again maybe there’s a bunch of philosophers who have been worried about AI for a while, but have stayed away due to the Overton window thing?)
            - Daniel Kokotajlo 6 Jun 2023 0:33 UTC
              LW: 2 AF: 2
              0
              Somehow there are 4 copies of this post
        [deleted]
        [deleted]
        [deleted]
- dr_s 1 Jun 2023 18:17 UTC
  3 points
  2
  1 is an obvious one that many would deny out of sheer copium. Though of course “not dying” has to go hand in hand with “not aging” or it would rightly be seen as torture.
  
  2 seems vague enough that I don’t think people would vehemently disagree. If you get more specific, for example by suggesting that there are absolutely correct or wrong answers to ethical questions, then you’ll get disagreement (including mine, for that matter, on that specific hypothetical claim).