Rafael Harth comments on Contest: $1,000 for good questions to ask to an Oracle AI

Rafael Harth 2 Jul 2019 19:20 UTC
2 points
Submission for LBO:
Input a corpus of text (could be multiple posts) describing technical approaches to aligning a powerful AI. Split this into a finite number of items that are relatively short (such as paragraphs). Ask the oracle to choose the item that is most worth spending more time on. (For example, there might be a paragraph with a dangerous hidden assumption in an otherwise promising approach, and thinking more about it might reveal that and lead to conceptual progress.)
Have a team of researchers look into it for an adequate amount of time, fixed in advance and told to the oracle (maybe three months?). After the time is over, have them rate the progress they made compared to some sensible baseline. Use this rating as the oracle’s reward.
Of course this has the problem of maximizing for apparent insight rather than actual insight.
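For concreteness, here is a minimal sketch of one episode of the procedure described above. Every name in it (`split_into_items`, `run_episode`, `oracle_choose`, `team_rating`) is hypothetical and only for illustration; splitting on blank lines and computing the reward as the team's rating minus a numeric baseline are my assumptions, not a fixed part of the proposal.

```python
from dataclasses import dataclass
from typing import Callable, List


def split_into_items(corpus: str) -> List[str]:
    """Split the corpus into short items (assumption: paragraphs separated by blank lines)."""
    return [p.strip() for p in corpus.split("\n\n") if p.strip()]


@dataclass
class Episode:
    items: List[str]
    chosen_index: int
    reward: float


def run_episode(
    corpus: str,
    oracle_choose: Callable[[List[str], int], int],
    team_rating: Callable[[str, int], float],
    months: int = 3,
    baseline: float = 0.0,
) -> Episode:
    """Run one episode: the oracle picks an item, the team studies it, the rating becomes the reward."""
    items = split_into_items(corpus)
    # The oracle is told the time budget in advance and may only answer
    # with an index into the finite list of items.
    idx = oracle_choose(items, months)
    if not 0 <= idx < len(items):
        raise ValueError("oracle must answer with a valid item index")
    # Stand-in for the human step: the team studies the item for `months`
    # and rates its progress; the reward is that rating relative to the baseline.
    reward = team_rating(items[idx], months) - baseline
    return Episode(items=items, chosen_index=idx, reward=reward)
```

In a real run `team_rating` would be months of human work rather than a function call, so the reward only measures the progress the team reports.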
- Stuart_Armstrong 3 Jul 2019 10:10 UTC
  3 points
  Parent
  
  Of course this has the problem of maximizing for apparent insight rather than actual insight.
  
  Until we can measure actual insight, this will remain a problem ^_^