Planned summary for the Alignment Newsletter:

This post defines the components of a <@debate@>(@AI safety via debate@) game, lists some of its applications, and defines truth-seeking as the property that we want. Assuming that the agent chooses uniformly at random from the possible Nash equilibria, the truth-promoting likelihood is the probability that the agent picks the actually correct answer. The post then presents the results of experiments on MNIST and Fashion-MNIST, obtaining results comparable to those in the original paper.
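The truth-promoting likelihood described above can be sketched as a short computation; this is a minimal sketch assuming "chooses randomly" means uniform sampling over equilibria, and the function name and representation (one answer per equilibrium) are my own illustration, not from the post:

```python
from fractions import Fraction

def truth_promoting_likelihood(equilibria, correct_answer):
    """Probability of picking the correct answer, assuming the agent
    samples uniformly from the game's Nash equilibria.

    `equilibria` is a list of answers, one per Nash equilibrium."""
    assert equilibria, "need at least one equilibrium"
    n_correct = sum(1 for answer in equilibria if answer == correct_answer)
    return Fraction(n_correct, len(equilibria))

# Three equilibria, two of which yield the true answer:
print(truth_promoting_likelihood(["true", "true", "false"], "true"))  # 2/3
```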
+1
(Just noticed your comment on the other debate post/paper. I will reply to it over the weekend.)