Don’t use a neural net (or variants such as deep belief networks). The field has advanced considerably since the 1960s, and since the late 1980s there have been machine learning and knowledge representation structures that are comprehensible to humans and/or auditors, such as probabilistic graphical models. These would have to be first-class types of the virtual machine that implements the AGI if you are using auditing as a confinement mechanism. But that’s not much of a restriction: many AI techniques are already phrased in terms of these models (including Eliezer’s own TDT, for example), and others have simple adaptations.
How do you decide whether some interaction of a complex neural net is friendly or unfriendly?
It’s very hard to tell what a neural net or other complex algorithm is doing, even if you have logs.
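To make the comprehensibility point concrete, here is a minimal sketch (my own illustrative example, not from the original comment) of a two-node Bayesian network. Every parameter is a named, human-readable probability in an explicit conditional probability table, so an auditor can inspect each entry and check each inference step directly, in contrast with the opaque weight matrices of a neural net. The variable names and probabilities are assumptions chosen purely for illustration.

```python
# A tiny probabilistic graphical model: Rain -> WetGrass.
# Every parameter below is a labeled probability an auditor can read off.

# Prior: P(Rain)
P_rain = {True: 0.2, False: 0.8}

# Explicit conditional probability table: P(WetGrass | Rain)
P_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},   # if it rained, grass is likely wet
    False: {True: 0.1, False: 0.9},   # if not, grass is likely dry
}

def posterior_rain_given_wet():
    """P(Rain=True | WetGrass=True) by exact enumeration (Bayes' rule)."""
    # Joint P(Rain=r, Wet=True) for each value of Rain
    joint = {r: P_rain[r] * P_wet_given_rain[r][True] for r in (True, False)}
    # Normalize over all values of Rain
    return joint[True] / sum(joint.values())

print(round(posterior_rain_given_wet(), 3))  # 0.18 / 0.26 ≈ 0.692
```

The point of the sketch is that the model’s entire “knowledge” sits in those two small tables, and the inference is an auditable arithmetic derivation rather than a forward pass through millions of unlabeled weights.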