If “you and your civilization” is extremely privileged in your utility function, that means the AI is safer when it’s partitioned away from you, but it also means it doesn’t generate much utility in the first place. That’s not changing the problem, it’s just setting the coefficient of U(right) very low.
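To make that explicit (this decomposition is my own sketch, not something stated in the post): suppose your utility splits linearly across the two halves,

$$U = c_{\text{left}}\,u(\text{left}) + c_{\text{right}}\,u(\text{right}), \qquad c_{\text{right}} \ll c_{\text{left}}.$$

Partitioning the AI into the right half then bounds its possible impact on $U$ by $c_{\text{right}}$ times the range of $u(\text{right})$, which is small by construction. The safety is bought by making the AI nearly irrelevant to $U$, not by aligning it.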
I imagine you can reduce the stakes of any alignment problem by not caring about the outcome.
In theory, the main benefit is that if it works, the same thing could be set loose in your half of the universe. (Which brings up a new kind of treacherous turn.)
Less than half the universe (just the moon, or Pluto, or maybe an asteroid) is arguably a region which starts out with low achievable utility. So if the AI might be really good at optimizing for your utility, then in expectation it might still be a reasonably large gain in utility.
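A sketch of that expected-value argument, again assuming the linear decomposition above (my notation): let $c_{\text{region}}$ be the weight on the sacrificed region and $u_0$ its baseline utility,

$$\mathbb{E}[\Delta U] = c_{\text{region}}\big(\mathbb{E}[u_{\text{AI}}] - u_0\big).$$

For an asteroid, $u_0 \approx 0$ and $c_{\text{region}}$ is tiny, so the downside is bounded near zero, while a strong enough optimizer could still make $\mathbb{E}[u_{\text{AI}}]$ large enough for a nontrivial net gain.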
Still not on board with the value of this. Why would you expect an AGI that does no harm (as far as you know) in an unpopulated and unobserved portion of the universe to also do no harm on Earth, where you keep your stuff and get the vast majority of your utility, but which is also a radically different context (nearly unrelated terms in your utility function)?
The hypothetical included actually knowing, not just “as far as you know.” So the approach I proposed exploited that.
The main benefit is that the AI is not at risk of killing you. In the left half of the universe, it is at risk of killing you.
I don’t follow why you disagree. It’s higher-stakes to operate something which can easily kill me than to operate something which can’t.
Sure, but the stakes are embedded in U. The whole post would be simpler if it just said “an AGI that can’t impact my utility is low stakes”. The partitioning of the universe only “works” if you don’t attempt to claim that U(right) is significant. To the extent that you CARE about what happens in that part of the universe, your utility is impacted by the AGI’s alignment or lack thereof.
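One way to pin down “stakes” here (my formalization, not the post’s): measure the stakes of deploying the AGI as the spread in your expected utility across the policies it might end up with,

$$\text{stakes} = \sup_{\pi,\,\pi'} \big|\,\mathbb{E}[U \mid \pi] - \mathbb{E}[U \mid \pi']\,\big|.$$

Partitioning only makes this quantity small if the partitioned region contributes little to $U$ in the first place, i.e. $c_{\text{right}} \approx 0$, which is exactly the point above.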