It assumes that real or future AGIs will confidently have certain properties, such as deceptive alignment.
The post doesn’t claim AGIs will be deceptively aligned; it claims that AGIs will be capable of implementing deceptive alignment because they internally do large amounts of consequentialist-y reasoning. That seems like a very different claim. This claim might also be false (for reasons I discuss in the second bullet point of this comment), but it’s importantly different and IMO much more defensible.
I was just wrong here, apparently: I misread what Thane Ruthenis was saying, and I’m not sure what to do with my comment above.