I don’t have optimism about finding a core which is already highly competent at these tasks.
I’m a little confused about what this statement means. I thought that if you have an overseer that implements some reasoning core, and consider amplify(overseer) with infinite computation time and unlimited ability to query the world (i.e., for background information on what humans seem to want, how they behave, etc.), then amplify(overseer) should be able to solve any problem that an agent produced by iterating IDA could solve.
Did you mean to say that:

1. “already highly competent at these tasks” means the core should be able to solve these problems without querying the world at all, and this is not likely to be possible?
2. you don’t expect to find a core such that a single round of amplification, amplify(overseer), can solve practical tasks in any reasonable amount of time / number of queries?
3. there is some other way that the agent produced by IDA would be more competent than the original amplified overseer?
I mean that the core itself, as a policy, won’t be able to solve these problems. It also won’t solve them after a small number of amplification steps. And it will probably have to query the world.
What is the difference between “core after a small number of amplification steps” and “core after a large number of amplification steps” that isn’t captured in “larger effective computing power” or “larger set of information about the world”, and allows the highly amplified core to solve these problems?
I didn’t mean to suggest there is a difference other than giving it more computation and more data.
I was imagining Amplify(X) as a procedure that calls X a bounded number of times, so that you need to iterate Amplify in order to have arbitrarily large runtimes, while I think you were imagining a parameterized operation Amplify(X, n) that takes n time and so can be scaled up directly. Your usage also seems fine.
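The two readings of amplification above can be sketched in code. This is only a toy illustration; all names (`base_core`, `amplify`, `amplify_n`) and the trivial decomposition scheme are hypothetical, not from any actual IDA implementation:

```python
def base_core(question):
    """Stand-in for the reasoning core: gives a best-effort direct answer."""
    return f"best-effort answer to {question!r}"

def amplify(agent, budget=10):
    """Bounded amplification: the amplified agent may call `agent` at most
    `budget` times per question, so arbitrarily large runtimes require
    iterating, e.g. amplify(amplify(agent))."""
    def amplified(question):
        # Toy decomposition: split into at most `budget` subquestions.
        subanswers = [agent(f"sub-{i} of {question}") for i in range(budget)]
        return f"combined({subanswers!r})"
    return amplified

def amplify_n(agent, n):
    """Parameterized amplification Amplify(X, n): scale up the number of
    calls directly instead of iterating a bounded operator."""
    return amplify(agent, budget=n)

# Iterating the bounded operator twice (10 calls of 10 calls each)
# reaches the same 100 core invocations as one parameterized call:
iterated = amplify(amplify(base_core))
direct = amplify_n(base_core, n=100)
```

Under this sketch the two usages coincide up to bookkeeping: iterating the bounded `amplify` k times gives the core budget^k calls, which the parameterized form reaches in one step with n = budget^k.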
Even if that’s not the difference, I strongly expect we are on the same page here about everything other than words. I’ve definitely updated some about the difficulty of words.
Okay, I agree that we’re on the same page. Amplify(X,n) is what I had in mind.