Hmm, I feel like there’s some misunderstanding here maybe?
What you’re calling “strong alignment” seems more like what most folks I talk to mean by “alignment”. What you call “alignment” seems more like what we often call “corrigibility”.
You’re right that corrigibility is not enough to get alignment on its own (i.e., that your “alignment” is not enough to get “strong alignment”), but it is necessary.
I have the opposite impression. “Alignment” is usually interpreted as “do whatever the person who gave the order expected”, and what the author calls “strong alignment” is an aligned AGI ordered to implement CEV.
I think this is because there’s an active watering down of terms happening in some corners of AI capabilities research, a result of tackling only subproblems of alignment without being abundantly clear that these are subproblems rather than the whole thing.