Charlie Steiner comments on A Question about Corrigibility (2015)

Charlie Steiner 3 Dec 2023 4:18 UTC
4 points
Yup, this all seems basically right. Though in reality I’m not that worried about the “we might outlaw some good actions” half of the dilemma. In real-world settings, actions are so multi-faceted that being able to outlaw a class of actions based on any simple property would be a research triumph.
Also see https://www.lesswrong.com/posts/LR8yhJCBffky8X3Az/using-predictors-in-corrigible-systems or https://www.lesswrong.com/posts/qpZTWb2wvgSt5WQ4H/defining-myopia for successor lines of reasoning.
- A.H. 3 Dec 2023 14:47 UTC
  1 point
  Yes, I too am more concerned from a ‘maybe this framing isn’t super useful, as it fails to capture important distinctions between corrigible and non-corrigible’ point of view than from a ‘we might outlaw some good actions’ point of view.
  Thanks for the links, they look interesting!