cousin_it comments on Learning from Human Preferences—from OpenAI (including Christiano, Amodei & Legg)

cousin_it 20 Jun 2017 17:54 UTC
2 points
Thank you, Dario! All good points. I didn’t wish to detract from your work; it’s the most hopeful thing I’ve seen about AI progress in years. Maybe one reason for my comment is that I’ve worked on “neat” decision theory math, and now you have this promising new idea using math that feels stubbornly alien to me, so I can’t jump into helping you guys save the world :-)
- DarioAmodei 20 Jun 2017 18:04 UTC
  2 points
  I have a hunch that semi-neat approaches to AI may come back as a layer on top of neural nets. Consider the work on using neural net heuristics to decide the next step in theorem-proving (https://arxiv.org/abs/1606.04442). In such a system, the decision process is opaque, but the result is fully verifiable, at least in the world of math (in a powerful system, the theorems may be proved for eventual use in some fuzzy interface with reality). The extent to which future systems might look like this, or what that means for safety, isn’t very clear yet (at least not to me), but it’s another paradigm to consider.
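
To make the "opaque decision process, verifiable result" pattern from the reply above concrete, here is a minimal Python sketch. It is not code from the thread or from the linked paper. The toy Horn-clause prover, the names `search`, `check`, and `toy_score`, and the example rules are all assumptions made for illustration. `toy_score` is a hand-written stand-in for a learned heuristic; in the setup the reply describes, a neural net would fill that role.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

# A rule is (premises, conclusion): if all premises are known, the
# conclusion may be added. This toy logic is an illustrative assumption.
Rule = Tuple[FrozenSet[str], str]
Score = Callable[[Set[str], Rule, str], float]


def search(axioms: Set[str], rules: List[Rule], goal: str,
           score: Score, max_steps: int = 100) -> Optional[List[Rule]]:
    """Forward-chain toward `goal`, letting `score` pick each next step.

    `score` is treated as a black box: the search never inspects why a
    step was preferred, only which step scored highest.
    """
    known = set(axioms)
    proof: List[Rule] = []
    for _ in range(max_steps):
        if goal in known:
            return proof
        # Rules whose premises all hold and whose conclusion is new.
        candidates = [r for r in rules if r[0] <= known and r[1] not in known]
        if not candidates:
            return None
        step = max(candidates, key=lambda r: score(known, r, goal))
        known.add(step[1])
        proof.append(step)
    return proof if goal in known else None


def check(axioms: Set[str], rules: List[Rule], goal: str,
          proof: List[Rule]) -> bool:
    """Verify a proof without trusting the heuristic that produced it."""
    known = set(axioms)
    for premises, conclusion in proof:
        # Each step must use a permitted rule with premises already known.
        if (premises, conclusion) not in rules or not premises <= known:
            return False
        known.add(conclusion)
    return goal in known


def toy_score(known: Set[str], rule: Rule, goal: str) -> float:
    """Hypothetical stand-in for a learned heuristic."""
    return 1.0 if rule[1] == goal else 0.0


AXIOMS = {"a", "b"}
RULES: List[Rule] = [
    (frozenset({"a"}), "c"),
    (frozenset({"a", "b"}), "d"),
    (frozenset({"c", "d"}), "goal"),
    (frozenset({"b"}), "e"),  # irrelevant to the goal
]

proof = search(AXIOMS, RULES, "goal", toy_score)
print(proof is not None and check(AXIOMS, RULES, "goal", proof))
```

The design point is that `check` never calls `score`. However the heuristic reaches its choices, a proof is accepted only if every step is licensed by the rules, so a bad heuristic can waste search effort but cannot make an invalid proof pass.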