orthonormal comments on Decision Theories: A Semi-Formal Analysis, Part I

orthonormal 24 Mar 2012 20:40 UTC
0 points

Note that our agent will quickly prove “if output = ‘defect’ then utility >= $1”.

Your intuition that it gets deduced before any of the spurious claims like “if output = ‘defect’ then utility <= -$1” is taking advantage of an authoritative payoff matrix that X can’t safely calculate xerself. I’m not sure that this tweaked version is any safer from exploitation...
- AlephNeil 24 Mar 2012 21:15 UTC
  0 points
  
  an authoritative payoff matrix that X can’t safely calculate xerself.
  
  Why not? Can’t the payoff matrix be “read off” from the “world program” (assuming X isn’t just ‘given’ the payoff matrix as an argument)?
  - orthonormal 24 Mar 2012 23:29 UTC
    0 points
    The one-player game that I wrote out is an example of an NDT agent trying to read off the payoff matrix from the world program, and failing. There are ways to ensure you read off the matrix correctly, but that’s tantamount to what you do to implement CDT, so I’ll explain it in Part II.
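
To make the idea of “reading off” the matrix concrete, here is a minimal sketch of one safe way to do it, assuming the world program accepts the players’ outputs as ordinary arguments rather than calling the agents’ source code. It is an illustration only, not the construction from the post: the names `world`, `read_off_payoff_matrix`, and `ACTIONS` are hypothetical, and the payoff numbers are placeholders chosen to be consistent with the bound “if output = ‘defect’ then utility >= $1”.

```python
# Hypothetical toy setup: every name and payoff value here is an assumption.
ACTIONS = ("cooperate", "defect")


def world(x_output, y_output):
    """Toy world program: returns X's utility given both players' outputs."""
    payoffs = {
        ("cooperate", "cooperate"): 2,
        ("cooperate", "defect"): 0,
        ("defect", "cooperate"): 3,
        ("defect", "defect"): 1,
    }
    return payoffs[(x_output, y_output)]


def read_off_payoff_matrix(world_program):
    """Tabulate X's utility by plugging in each pair of outputs directly.

    Nothing is proved about the agent's own source code here; the outputs
    are treated as free inputs, so no spurious conditional can arise.
    """
    return {(x, y): world_program(x, y) for x in ACTIONS for y in ACTIONS}


matrix = read_off_payoff_matrix(world)

# Worst case for X when X defects, over everything Y might do.
worst_case_defect = min(matrix[("defect", y)] for y in ACTIONS)
```

The sketch only works because the outputs are substituted from outside. When the world program instead computes the outputs by running the agents, as in the one-player game above, the agent has to reason about its own code, and that is where the spurious deductions come from.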