Vladimir_Nesov comments on An example of self-fulfilling spurious proofs in UDT

Vladimir_Nesov 25 Mar 2012 12:12 UTC
8 points
More generally, what makes Q dangerous is that (1) it settles only for a spurious moral argument and rejects natural ones, and (2) the agent takes whatever Q finds seriously and acts on it. As a result, provability of a spurious moral argument provably implies its truth, which by Löb’s theorem makes it provable, and therefore true, so the agent is forced to be misled in exactly this way.
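
A rough sketch of the Löb step, in notation that is my own assumption and not taken from the post: write T for the formal theory the agent reasons in, S for the spurious moral argument that Q is searching for, and □S for "S is provable in T".

```latex
% Assumed notation (hypothetical, introduced here for illustration):
%   T      : the formal theory the agent reasons in
%   S      : the spurious moral argument Q is searching for
%   \Box S : "S is provable in T"
\begin{align*}
  &T \vdash \Box S \rightarrow S
    && \text{if Q finds a proof of } S \text{, the agent acts on it, which makes } S \text{ true} \\
  &T \vdash S
    && \text{by L\"ob's theorem applied to the line above}
\end{align*}
```

Since Q skips natural proofs, nothing preempts this one: the proof of S exists, Q eventually finds it, and the agent acts on it.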

The only difference between Q and normal proof search procedures is that the normal procedures are OK with any proof, while Q dislikes natural proofs and ignores them. And this bit of “motivated skepticism” is sufficient to make the preferred spurious proofs come true: Q doesn’t just loop without finding anything.

This is a whole new level of Oracle AI unsafety… Take what it says seriously, and it can argue you into doing anything at all. :-)
- orthonormal 25 Mar 2012 16:17 UTC
  2 points
  You mean Q instead of P, right? (Edit: Fixed.)
  - Vladimir_Nesov 25 Mar 2012 16:27 UTC
    0 points
    Right, cousin_it changed some terminology from the post on the list, and I didn’t notice. (Fixed.)