There can’t be too many things that reduce the expected value of the future by 10%; if there were, there would be no expected value left. So, the prior that any particular thing has such an impact should be quite low.
I don’t follow this argument; I also checked the transcript, and I still don’t see why I should buy it. Paul said:
A priori you might’ve been like, well, if you’re going to build some AI, you’re probably going to build the AI so it’s trying to do what you want it to do. Probably that’s that. Plus, most things can’t destroy the expected value of the future by 10%. You just can’t have that many things, otherwise there’s not going to be any value left in the end. In particular, if you had 100 such things, then you’d be down to like 1/1000th of your values. 1⁄10 hundred thousandth? I don’t know, I’m not good at arithmetic.
Anyway, that’s a priori, just aren’t that many things are that bad and it seems like people would try and make AI that’s trying to do what they want.
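(For reference, the arithmetic in the quoted passage works out roughly as follows; this is just an illustrative check, not part of the transcript.)

```python
# Check of the arithmetic in the quote: 100 independent "bad things",
# each cutting the remaining expected value by 10%.
survival_per_factor = 0.9
n_factors = 100

remaining_fraction = survival_per_factor ** n_factors
print(f"Fraction of value left after {n_factors} cuts of 10%: {remaining_fraction:.2e}")
# ~2.66e-05, i.e. roughly 1/37,600 of the original value;
# closer to a hundred-thousandth than to a thousandth.
```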
In my words, the argument is “we agree that the future has nontrivial EV, therefore big negative impacts are a priori unlikely”.
But why do we agree about this? Why are we assuming the future can’t be that bleak in expectation? I think there are good outside-view arguments to this effect, but that isn’t the reasoning here.
E.g. if you have a broad distribution over possible worlds, some of which are “fragile” and have 100 things that cut value down by 10%, and some of which are “robust” and don’t, then you get 10,000x more value from the robust worlds. So unless you are a priori pretty confident that you are in a fragile world (or they are 10,000x more valuable, or whatever), the robust worlds will tend to dominate.
Similar arguments work if we aggregate across possible paths to achieving value within a fixed, known world—if there are several ways things can go well, some of which are more robust, those will drive almost all of the EV. And similarly for moral uncertainty (if there are several plausible views, the ones that consider this world a lost cause will instead spend their influence on other worlds) and so forth. I think it’s a reasonably robust conclusion across many different frameworks: your decision shouldn’t end up being dominated by some hugely conjunctive event.
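To put rough numbers on the dominance claim (my own illustrative figures, not Paul's), here is a minimal sketch of how the expected value splits between the two kinds of world under different priors:

```python
# How expected value splits between "robust" and "fragile" worlds for
# various priors on being in a fragile world (illustrative numbers only).
fragile_value = 0.9 ** 100   # value left after 100 cuts of 10% (~2.7e-5)
robust_value = 1.0           # value of a robust world, normalised to 1

for p_fragile in (0.5, 0.9, 0.99):
    ev_fragile = p_fragile * fragile_value
    ev_robust = (1 - p_fragile) * robust_value
    share_robust = ev_robust / (ev_robust + ev_fragile)
    print(f"P(fragile) = {p_fragile:.2f}: robust worlds carry {share_robust:.4%} of the EV")
# Even at P(fragile) = 0.99, the robust worlds carry about 99.7% of the EV.
```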
I’m more uncertain about this one, but I believe that a separate problem with this answer is that it’s an argument about where value comes from, not an argument about what is probable. Let’s suppose 50% of all worlds are fragile and 50% are robust. If most of the things that destroy a world are due to emerging technology, then we still have similar amounts of both worlds around right now (or similar measure on both classes if there are infinitely many of them, or whatever). So it’s not a reason to suspect a non-fragile world right now.
Another illustration: if you’re currently falling from a 90-story building, most of the expected utility is in worlds where there coincidentally happens to be a net to safely catch you before you hit the ground, or interventionist simulators decide to rescue you—even if virtually all of the probability is in worlds where you go splat and die. The decision theory looks right, but this is a lot less comforting than the interview made it sound.
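(A toy version of this illustration, with made-up numbers: essentially all the probability is on dying, yet all of the expected utility comes from the rescue worlds.)

```python
# Toy version of the falling-from-a-building illustration (made-up numbers).
p_rescue = 1e-6            # probability that something catches you
utility_if_rescued = 1.0   # you survive; the future still has value for you
utility_if_splat = 0.0     # no utility in the worlds where you die

ev = p_rescue * utility_if_rescued + (1 - p_rescue) * utility_if_splat
print(f"Total expected utility: {ev:.1e}")
print(f"Probability mass on 'splat' worlds: {1 - p_rescue:.6f}")
# All of the (tiny) expected utility comes from the rescue worlds, even
# though ~99.9999% of the probability is on the worlds where you die.
```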
Yes, but the fact that the fragile worlds are much more likely to end in the future is a reason to condition your efforts on being in a robust world.
While I do buy Paul’s argument, I think it’d be very helpful if the various summaries of the interviews with him were edited to make it clear that he’s talking about value-conditioned probabilities rather than unconditional probabilities—since the claim as originally stated feels misleading. (Even if some decision theories only use the former, most people think in terms of the latter).
Is this a thing or something you just coined? “Probability” has a meaning; I’m totally against using it for things that aren’t that.
I get why the argument is valid for deciding what we should do – and you could argue that’s the only important thing. But it doesn’t make it more likely that our world is robust, which is what the post was claiming. It’s not about probability; it’s about EV.
This argument seems to point at some extremely important considerations in the vicinity of “we should act according to how we want civilizations similar to us to act” (rather than just focusing on causally influencing our future light cone), etc.
The details of the distribution over possible worlds that you use here seem to matter a lot. How robust are the “robust worlds”? If they are maximally robust (i.e. things turn out great with probability 1 no matter what the civilization does) then we should assign zero weight to the prospect of being in a “robust world”, and place all our chips on being in a “fragile world”.
Conversely, if the distribution over possible worlds assigns sufficient probability to worlds in which there is a single very risky thing that cuts EV down by either 10% or 90% depending on whether the civilization takes it seriously or not, then perhaps such worlds should dominate our decision making.
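A minimal sketch of that point, with made-up numbers: in a maximally robust world our effort changes nothing, so it contributes nothing to the decision, and all the decision weight falls on the worlds where taking the risk seriously actually moves the outcome.

```python
# Sketch: decision weight = prior * (value if we take the risk seriously
# minus value if we don't). Made-up numbers.
worlds = {
    # name: (prior, value_without_effort, value_with_effort)
    "maximally robust": (0.5, 1.0, 1.0),    # turns out great no matter what we do
    "single risky thing": (0.5, 0.1, 0.9),  # EV cut by 90% vs 10% depending on us
}

for name, (prior, v_without, v_with) in worlds.items():
    decision_weight = prior * (v_with - v_without)
    print(f"{name}: marginal EV of our effort = {decision_weight:.2f}")
# The maximally robust world contributes 0.00; the risky world contributes 0.40,
# so it dominates what we should do even at equal priors.
```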
E.g. if you have a broad distribution over possible worlds, some of which are “fragile” and have 100 things that cut value down by 10%, and some of which are “robust” and don’t, then you get 10,000x more value from the robust worlds. So unless you are a priori pretty confident that you are in a fragile world (or they are 10,000x more valuable, or whatever), the robust worlds will tend to dominate.
This is only true if you assume that there is an equal number of robust and fragile worlds out there, and your uncertainty is strictly random, i.e. you’re uncertain about which of those worlds you live in.
I’m not super confident that our world is fragile, but I suspect that most worlds look the same as each other: either roughly 99.99% of worlds are robust, or roughly 99.99% are fragile. If it’s the latter, then I probably live in a fragile world.
If it’s a 50% chance that 99.99% of worlds are robust and 50% chance that 99.99% are fragile, then the vast majority of EV comes from the first option where the vast majority of worlds are robust.
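Concretely (a quick sketch using the numbers above plus Paul's 100-cuts-of-10% figure for fragile worlds):

```python
# EV under the two hypotheses from the exchange above (illustrative sketch).
fragile_value = 0.9 ** 100   # value of a fragile world (100 cuts of 10%)
robust_value = 1.0

def avg_world_value(frac_robust):
    return frac_robust * robust_value + (1 - frac_robust) * fragile_value

ev_mostly_robust = 0.5 * avg_world_value(0.9999)   # 50%: 99.99% of worlds robust
ev_mostly_fragile = 0.5 * avg_world_value(0.0001)  # 50%: 99.99% of worlds fragile

share = ev_mostly_robust / (ev_mostly_robust + ev_mostly_fragile)
print(f"Share of total EV under the 'mostly robust' hypothesis: {share:.4f}")
# ~0.9999: almost all the EV sits under the hypothesis where most worlds are robust.
```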
You’re right, the nature of uncertainty doesn’t actually matter for the EV. My bad.
I think it does actually, although I’m not sure how exactly. See Logical vs physical risk aversion.