Eliezer believes in TDT, which would disagree with several of your premises here (“practically uncorrelated”, for one).
The AI’s simulations are not copies of the Gatekeeper, just random people plucked out of “Platonic human-space”, so to speak. (This may have been unclear in my original comment; I was talking about a different formulation of the problem in which the AI doesn’t have enough information about the Gatekeeper to construct perfect copies.) TDT/UDT only applies when talking about copies of an agent (or at least, agents sufficiently similar that they will probably make the same decisions for the same reasons).
Your argument seems to map directly onto an argument for two-boxing.
No, because the “uncorrelated-ness” part doesn’t apply in Newcomb’s Problem (Omega’s decision on whether or not to fill the second box is directly correlated with its prediction of your decision).
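To make the correlation point concrete, here's a rough expected-value sketch (the accuracy figures are purely illustrative assumptions, not part of the problem statement): when Omega's prediction is uncorrelated with your choice (accuracy 0.5), two-boxing comes out ahead; when it's strongly correlated, one-boxing does.

```python
# Rough expected-value sketch of Newcomb's Problem.
# 'accuracy' is the (assumed, illustrative) probability that Omega's
# prediction matches your actual choice; box A always holds $1,000 and
# box B holds $1,000,000 only if Omega predicted you would one-box.

def expected_payoff(one_box: bool, accuracy: float) -> float:
    small, big = 1_000, 1_000_000
    if one_box:
        # Omega filled box B iff it correctly predicted one-boxing.
        return accuracy * big
    else:
        # You take both boxes; box B is filled only if Omega was wrong.
        return small + (1 - accuracy) * big

for accuracy in (0.5, 0.99):
    print(f"accuracy={accuracy}: "
          f"one-box={expected_payoff(True, accuracy):,.0f}, "
          f"two-box={expected_payoff(False, accuracy):,.0f}")
```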
What you call “perfectly rational” would be more accurately called “perfectly controlled”.
Meh, fair enough. I have to say, I’ve never heard of that term. Would this happen to have something to do with Vaniver’s series of posts on “control theory”?
Ah, I misunderstood your objection. Your talk about “pre-commitments” threw me off.
just random people plucked out of “Platonic human-space”
It seems to me that these wouldn’t quite be following the same general thought processes as an actual human; self-reflection should be able to convince someone that they aren’t that type of simulation. If the AI is able to simulate someone to the extent that they “think like a human”, it should be able to simulate someone who thinks “sufficiently” like the Gatekeeper as well.
I’ve never heard of that term.
I made it up just now; it’s not a formal term. What I mean by it is basically: imagine a robot that wants to press a button. However, its hardware is only sufficient to press it successfully 1% of the time. Is that a lack of rationality? No, it’s a lack of control. This seems analogous to a human being unable to precommit properly.
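A quick toy sketch of what I mean (the 1% figure, names, and trial count are made up for illustration): the robot’s decision policy is flawless, and only its actuator fails.

```python
import random

# Toy illustration of the rationality-vs-control distinction.
# The robot always *decides* to press the button (the "rational" part),
# but its hardware only carries that decision out 1% of the time
# (the "control" part).

def decide() -> bool:
    return True  # always choose to press

def actuate(decision: bool, reliability: float = 0.01) -> bool:
    # Hardware succeeds only with probability `reliability`.
    return decision and random.random() < reliability

trials = 100_000
successes = sum(actuate(decide()) for _ in range(trials))
print(f"chose to press in 100% of trials, "
      f"succeeded in ~{successes / trials:.1%} of them")
```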
Would this happen to have something to do with Vaniver’s series of posts on “control theory”?
No idea, haven’t read them. Probably not.