Wait, are you thinking I’m thinking I can determine the umpteenth digit of pi in my scenario? I see your point; that would be insane.
My point is simply this: if your existence (or any other observation of yours) allows you to infer the umpteenth digit of pi is odd, then the AI you build should be allowed to use that fact, instead of trying to maximize utility even in the logically impossible world where that digit is even.
Actually you were:
There are four possibilities:
The AI will press the button, the digit is even.
The AI will not press the button, the digit is even, you don't exist.
The AI will press the button, the digit is odd, the world will kaboom.
The AI will not press the button, the digit is odd.
Updating on the fact that the second possibility is not true is precisely equivalent to concluding that if the AI does not press the button the digit must be odd, and ensuring that the AI does not means choosing the digit to be odd.
If you already know that the digit is odd independent from the choice of the AI the whole thing reduces to a high stakes counterfactual mugging (provided that Omega's destruction of the world when the digit is even depends on what the AI, knowing the digit to be odd, would do; otherwise there is no dilemma in the first place).
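The update described in the quoted passage can be made concrete with a short sketch (the encoding is my own illustration, not part of the original setup): enumerate the four worlds, condition on the observer existing, and check which conditional survives.

```python
# Enumerate the four possible worlds from the quoted list and condition
# on the observer existing (possibility 2 is the only one where you don't).
from itertools import product

worlds = []
for presses, digit_odd in product([True, False], repeat=2):
    # "You don't exist" only when the AI doesn't press AND the digit is even.
    observer_exists = not (not presses and not digit_odd)
    worlds.append((presses, digit_odd, observer_exists))

# Updating on your own existence removes possibility 2...
surviving = [w for w in worlds if w[2]]

# ...and in every surviving world where the AI does not press the button,
# the digit is odd: exactly the conditional the quoted sentence states.
assert all(digit_odd for presses, digit_odd, _ in surviving if not presses)
```

Note that nothing in this sketch gives the agent causal control over the digit; the conditioning only narrows which worlds remain consistent with observation.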
There is nothing insane about this, provided that it is properly understood. The resolution is essentially the same as the resolution of the paradox of free will in a classically-deterministic universe.
In a classically-deterministic universe, all of your choices are mathematical consequences of the universe's state 1 million years ago. And people often confused themselves by thinking, "Suppose that my future actions are under my control. Well, I will choose to take a certain action if and only if certain mathematical propositions are true (namely, the propositions necessary to deduce my choice from the state of the universe 1 million years ago). Therefore, by choosing to take that action, I am getting to decide the truth-values of those propositions. But the truth-values of mathematical propositions are beyond my control, so my future actions must also be beyond my control."
I think that people here generally get that this kind of thinking is confused. Even if we lived in a classically-deterministic universe, we could still think of ourselves as choosing our actions without concluding that we get to determine mathematical truth on a whim.
Similarly, Benja’s AI can think of itself as getting to choose whether to push the button without thereby implying that it has the power to modify mathematical truth.
I think we’re all on the same page about being able to choose some mathematical truths, actually. What FAWS and I think is that in the setup I described, the human/AI does not get to determine the digit of pi, because the computation of the digits of pi does not involve a computation of the human’s choices in the thought experiment. [Unless of course by incredible mathematical coincidence, the calculation of digits of pi happens to be a universal computer, happens to simulate our universe, and by pure luck happens to depend on our choices just at the umpteenth digit. My math knowledge doesn’t suffice to rule that possibility out, but it’s not just astronomically but combinatorially unlikely, and not what any of us has in mind, I’m sure.]
I’ll grant you that my formulation had a serious bug, but--
There are four possibilities:
The AI will press the button, the digit is even.
The AI will not press the button, the digit is even, you don't exist.
The AI will press the button, the digit is odd, the world will kaboom.
The AI will not press the button, the digit is odd.
Updating on the fact that the second possibility is not true is precisely equivalent to concluding that if the AI does not press the button the digit must be odd
Yes, if by that sentence you mean the logical proposition (AI does not press button ⇒ digit is odd), also known as (digit odd ∨ AI presses button).
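That a material conditional is literally the same proposition as the corresponding disjunction can be checked mechanically (a throwaway illustration, not part of the argument):

```python
# Truth-table check that the material conditional (p => q) is the same
# proposition as the disjunction (q \/ ~p), for every assignment.
from itertools import product

for p, q in product([True, False], repeat=2):
    implies = (not p) or q       # p => q, encoded as a material conditional
    disjunct = q or (not p)      # q \/ ~p
    assert implies == disjunct
```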
and ensuring that the AI does not means choosing the digit to be odd.
I’ll only grant that if I actually end up building an AI that presses the button, and the digit is even, then Omega is a bad predictor, which would make the problem statement contradictory. Which is bad enough, but I don’t think I can be accused of minting causality from logical implication signs...
In any case,
If you already know that the digit is odd independent from the choice of the AI the whole thing reduces to a high stakes counterfactual mugging
That’s true. I think that’s also what Wei Dai had in mind in the great filter post (http://lesswrong.com/lw/214/late_great_filter_is_not_bad_news/), and not the ability to change Omega’s coin to tails by not pressing the button! My position is that you should not pay in counterfactual muggings whose counterfactuality was already known prior to your decision to become a timeless decision theorist, although you should program (yourself | your AI) to pay in counterfactual muggings you don’t yet know to be counterfactual.
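The asymmetry between committing in advance and paying after the fact can be illustrated with made-up stakes (the numbers below are mine, not from the thread): Omega flips a fair coin, asks you for a payment on tails, and rewards you on heads only if you are the kind of agent who would have paid on tails.

```python
# Hypothetical stakes for a counterfactual mugging (numbers are illustrative).
P_HEADS = 0.5
COST, PRIZE = 100, 10_000

# Expected value of each policy, evaluated BEFORE the coin is known:
ev_pay = P_HEADS * PRIZE + (1 - P_HEADS) * (-COST)   # commit to paying
ev_refuse = 0.0                                      # commit to refusing

# Expected value of paying AFTER you already know the coin came up tails:
ev_pay_knowing_tails = -COST

assert ev_pay > ev_refuse          # ex ante, program your AI to pay
assert ev_pay_knowing_tails < 0    # a known-counterfactual mugging only loses
```

This is just the arithmetic behind the stated position: pre-committing is positive in expectation over both coin outcomes, whereas paying in a mugging already known to be counterfactual forgoes nothing and only costs.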