There’s also an important disanalogy between generating/recognizing faces and learning ‘human values’, which is that humans are perfect human face recognizers but not perfect recognizers of worlds high in ‘human values’.
That means there might be world states or plans, whether in the training data or generated by adversarial training, that look awesome both to us and to ML trained to recognize these things the way we do, but that are actually awful.
As an empirical fact, humans are not perfect human face recognizers. It is something humans are very good at, but not perfect at. We are certainly much better recognizers of human faces than of worlds high in human values. (Perhaps more relevantly, consensus on what constitutes a human face is much, much higher than consensus on what constitutes a world high in human values.)
I am unsure whether this distinction matters for the substance of the argument, however.
Also from Ronny:
(And we aren’t perfect recognizers of ‘functional, safe-to-use nanofactory’ or other known-to-me things that might save the world.)