The problem of whether the goals and values of an artificially intelligent agent will align with human goals and values can be reduced to this problem: Will the goals and values of different human agents ever align with each other?
We can only program AI “in our own image”, so both the features and bugs of humanity will reappear in AI.
Why do you think these statements are true?
Yes, great question. Looking at programming in general, there seem to be many obvious counterexamples: computers have certain capabilities (‘features’) that humans don’t (e.g. doing millions of arithmetic operations extremely fast with zero clumsy errors), and likewise they have certain problems (‘bugs’) that we don’t (e.g. adversarial examples for image classifiers, which don’t trip humans up at all but entirely ruin the neural net’s classification).
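For anyone unfamiliar with adversarial examples, they take only a few lines of code to generate. Here is a minimal sketch of the classic fast gradient sign method in PyTorch; `model`, `image`, and `label` are assumed placeholders for a trained classifier, a normalized input tensor, and its true class:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Fast Gradient Sign Method: nudge every pixel slightly in the
    direction that most increases the classifier's loss. The change is
    imperceptible to a human, yet it can flip the model's prediction."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel by +/- epsilon along the sign of the gradient,
    # then clamp back into the valid [0, 1] pixel range.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

The point being: a human looking at `adversarial` sees the same picture, while the network's output can change completely. That failure mode has no obvious human analogue, which is exactly why it counts against the "in our own image" claim.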