Rohin Shah comments on [AN #80]: Why AI risk might be solved without additional intervention from longtermists

Rohin Shah 21 Feb 2020 23:26 UTC
LW: 4 AF: 3
It seems worth clarifying that you’re only optimistic about certain types of AI safety problems.
To be clear, I’m optimistic about all the types of AI safety problems that people have proposed, including the philosophical ones. When I said “all else equal those seem more likely to me”, I meant that if all the other facts about the matter are the same, but one risk affects only future people and not current people, that risk would seem more likely to me because people would care less about it. But I am optimistic about the actual risks that you and others argue for.
That said, over the last week I have become less optimistic specifically about overcoming race dynamics, mostly from talking to people at FHI / GovAI. I’m not sure how much to update though. (Still broadly optimistic.)
it seems that when you wrote the title of this newsletter “Why AI risk might be solved without additional intervention from longtermists” you must have meant “Why some forms of AI risk …”, or perhaps certain forms of AI risk just didn’t come to your mind at that time.
It’s notable that AI Impacts asked for people who were skeptical of AI risk (or something along those lines) and to my eye it looks like all four of the people in the newsletter independently interpreted that as accidental technical AI risk in which the AI is adversarially optimizing against you (or at least that’s what the four people argued against). This seems like pretty strong evidence that when people hear “AI risk” they now think of technical accidental AI risk, regardless of what the historical definition may have been. That is certainly my default assumption when someone (other than you) says “AI risk”.
I would certainly support having clearer definitions and terminology if we could all agree on them.
- Wei Dai 22 Feb 2020 0:17 UTC
  LW: 4 AF: 3
  
  But I am optimistic about the actual risks that you and others argue for.
  
  Why? I actually wrote a reply that was more questioning in tone, and then changed it because I found some comments you made where you seemed to be concerned about the additional AI risks. Good thing I saved a copy of the original reply, so I’ll just paste it below:
  
  I wonder if you would consider writing an overview of your perspective on AI risk strategy. (You do have a sequence but I’m looking for something that’s more comprehensive, that includes e.g. human safety and philosophical problems. Or let me know if there’s an existing post that I’ve missed.) I ask because you’re one of the most prolific participants here but don’t fall into one of the existing “camps” on AI risk for whom I already have good models. It’s happened several times that I see a comment from you that seems wrong or unclear, but I’m afraid to risk being annoying or repetitive with my questions/objections. (I sometimes worry that I’ve already brought up some issue with you and then forgot your answer.) It would help a lot to have a better model of you in my head and in writing, so I can refer to it to help me interpret the most likely intended meaning of a comment, or to predict how you would likely answer if I were to ask certain questions.
  
  It’s notable that AI Impacts asked for people who were skeptical of AI risk (or something along those lines) and to my eye it looks like all four of the people in the newsletter independently interpreted that as accidental technical AI risk in which the AI is adversarially optimizing against you (or at least that’s what the four people argued against).
  
  Maybe that’s because the question was asked in a way that indicated the questioner was mostly interested in technical accidental AI risk? And some of them may be fine with defining “AI risk” as “AI-caused x-risk” but just didn’t have the other risks on the top of their minds, because their personal focus is on the technical/accidental side. In other words I don’t think this is strong evidence that all 4 people would endorse defining “AI risk” as “technical accidental AI risk”. It also seems notable that I’ve been using “AI risk” in a broad sense for a while and no one has objected to that usage until now.
  
  I would certainly support having clearer definitions and terminology if we could all agree on them.
  
  The current situation seems to be that we have two good (relatively clear) terms “technical accidental AI risk” and “AI-caused x-risk” and the dispute is over what plain “AI risk” should be shorthand for. Does that seem fair?
  - Rohin Shah 22 Feb 2020 9:14 UTC
    LW: 5 AF: 3
    I ask because you’re one of the most prolific participants here but don’t fall into one of the existing “camps” on AI risk for whom I already have good models.
    Seems right, I think my opinions fall closest to Paul’s, though it’s also hard for me to tell what Paul’s opinions are. I think this older thread is a relatively good summary of the considerations I tend to think about, though I’d place different emphases now. (Sadly I don’t have the time to write a proper post about what I think about AI strategy—it’s a pretty big topic.)
    The current situation seems to be that we have two good (relatively clear) terms “technical accidental AI risk” and “AI-caused x-risk” and the dispute is over what plain “AI risk” should be shorthand for. Does that seem fair?
    Yes, though I would frame it as “the ~5 people reading these comments have two clear terms, while everyone else uses a confusing mishmash of terms”. The hard part is in getting everyone else to use the terms. I am generally skeptical of deciding on definitions and getting everyone else to use them, and usually try to use terms the way other people use terms.
    In other words I don’t think this is strong evidence that all 4 people would endorse defining “AI risk” as “technical accidental AI risk”. It also seems notable that I’ve been using “AI risk” in a broad sense for a while and no one has objected to that usage until now.
    Agreed with this, but see above about trying to conform with the way terms are used, rather than defining terms and trying to drag everyone else along.
    - Matthew Barnett 22 Feb 2020 18:08 UTC
      LW: 2 AF: 1
      see above about trying to conform with the way terms are used, rather than defining terms and trying to drag everyone else along.
      This seems odd given your objection to “soft/slow” takeoff usage and your advocacy of “continuous takeoff” ;)
      - Rohin Shah 23 Feb 2020 18:58 UTC
        LW: 3 AF: 2
        I don’t think “soft/slow takeoff” has a canonical meaning—some people (e.g. Paul) interpret it as not having discontinuities, while others interpret it as capabilities increasing slowly past human intelligence over (say) centuries (e.g. Superintelligence). If I say “slow takeoff” I don’t know which one the listener is going to hear it as. (And if I had to guess, I’d expect they think about the centuries-long version, which is usually not the one I mean.)
        In contrast, I think “AI risk” has a much more canonical meaning, in that if I say “AI risk” I expect most listeners to interpret it as accidental risk caused by the AI system optimizing for goals that are not our own.
        (Perhaps an important point is that I’m trying to communicate to a much wider audience than the people who read all the Alignment Forum posts and comments. I’d feel more okay about “slow takeoff” if I were just speaking to people who have already read many of the posts arguing about takeoff speeds.)