Difference between CDT and ADT/UDT as constant programs

gRR19 Mar 2012 19:41 UTC

4 points

After some thinking, I came upon an idea how to define the difference between CDT and UDT within the constant programs framework. I would post it as a comment, but it is rather long...

The idea is to separate the cognitive part of an agent into three separate modules:

1. Simulator: given the code for a parameterless function X(), the Simulator tries to evaluate it, spending L computation steps. The result is either proving that X()=x for some value x, or leaving X() unknown.

2. Correlator: given the code for two functions X(...) and Y(...), the Correlator checks for proofs (of length up to P) of structural similarity between the source codes of the functions, trying to prove correlations X(...)=Y(...).

[Note: the Simulator and the Correlator can use the results of each other, so that:
If simulator proves that A()=x, then correlator can prove that A()+B() = x+B()
If correlator proves that A()=B(), then simulator can skip simulation when proving that (A()==B() ? 1 : 2) = 1]

3. Executive: allocates tasks and resources to Simulator and Correlator in some systematic manner, trying to get them to prove the “moral arguments”
Self()=x ⇒ U()=u
or
( Self()=x ⇒ U()=ux AND Self()=y ⇒ U()=uy ) ⇒ ux > uy,
and returns the best found action.

Now, CDT can be defined as an agent with the Simulator, but without the Correlator. Then, no matter what L it is given, the Simulator won’t be able to prove that Self()=Self(), because of the infinite regress. So the agent will be opaque to itself, and will two-box on Newcomb’s problem and defect against itself in Prisoner’s Dilemma.

The UDT/ADT, on the other hand, have functioning Correlators.

If it is possible to explicitly (rather than conceptually) separate an agent into the three parts, then it appears to be possible to demonstrate the good behavior of an agent in the ASP problem. The world can be written as a NewComb’s-like function:

def U():
box2 = 1000
box1 = (P()==1 ? 1000000 : 0)
return (A()==1 ? box1 : box1 + box2)

where P is a predictor that has much less computational resources than the agent A. We can assume that the predictor has basically the same source code as the agent, except for the bound on L and a stipulation that P two-boxes if it cannot prove one-boxing using available resources.

Then, if the Executive uses a reasonable strategy (“start with low values of L and P, then increase them until all necessary moral arguments are found or all available resources are spent”), then the Correlator should be able to prove A()=P() quite early in the process.

gRR19 Mar 2012 19:41 UTC

4 points

10 comments1 min readLW link Archive

cousin_it 20 Mar 2012 0:09 UTC
1 point
0
If you have a formalization of CDT in the constant programs framework, could you spell it out in more detail?
- gRR 20 Mar 2012 5:11 UTC
  4 points
  0
  Parent
  For CDT, it is not useful to try simulating every single piece of code within U(), since there is no Correlator to use the results. So the simplest version of the CDT agent would be:
  
  Let L = available computation resources / number of functions in U().
  Evaluate U(), spending at most L steps in each function. If a function takes too long to evaluate, assume its result is a new variable x_i and continue to evaluate its parent.
  In the end, the final result should be: U() = some function of variables x_i, one of which is the variable corresponding to Self().
  Return the argmax of that function (assuming some distribution over non-self variables, if necessary).
  - cousin_it 20 Mar 2012 8:40 UTC
    0 points
    0
    Parent
    I don’t understand. In the symmetric PD the world program makes two identical calls to the agent. If the agent chooses defection, does that mean it randomly designates one of these calls as “myself” and the other as “not myself”?
    - gRR 20 Mar 2012 11:28 UTC
      2 points
      0
      Parent
      The cases where the two calls are to the same Self() function correspond to “anthropic uncertainty” in the Newcomb’s problem and “playing against mirror image” in PD. In these cases even CDT knows that it directly influences both parts, and so it one-boxes/cooperates. For properly modeling the symmetric PD, the second agent must be a function with a different name but with the same source code.
      - cousin_it 20 Mar 2012 13:34 UTC
        3 points
        0
        Parent
        Oh. You’re right.
        
        So a CDT agent assumes some probability distribution over the outputs of all computations except itself, then chooses a return value by maximizing expected utility under that distribution. In the PD and Newcomb’s problem, assuming any reasonable probability distribution will lead to defecting or two-boxing. That’s a nice way of looking at it, thanks.
        gRR 20 Mar 2012 14:19 UTC
        0 points
        0
        Parent
        
        So a CDT agent assumes some probability distribution over the outputs of all computations except itself
        
        Yes, all computations that it has no resources to directly simulate. Which necessarily include any copies of itself, no matter how much resources it has.
        
        What do you think of the other idea—that all ways of recognizing correlations can be reduced to series of evaluations and source code comparisons? I don’t immediately see any other way to prove that two source codes return the same thing… must check a textbook on lambda calculus.
        cousin_it 20 Mar 2012 15:00 UTC
        3 points
        0
        Parent
        No, that idea looks wrong, unless I misunderstand it. Imagine you have two functions f() and g() that you’re too weak to evaluate. You need to be able to prove that f()+g()==g()+f(), but I don’t see any way to do that using only evaluations and source code comparisons. The general way is to use a theorem prover that has axioms like the commutativity of addition.
        gRR 20 Mar 2012 19:47 UTC
        0 points
        0
        Parent
        I think I can prove commutativity of addition using just evaluations and source code comparisons:
        
        def P(m,n): P(m, 0) = m ; P(m, n+1) = P(m, n) + 1
        
        Lemma: P(m+1, n) = P(m, n+1)
        Proof:
        Let A(m,n) = P(m+1, n),
        Let B(m,n) = P(m, n+1)
        Then:
        A(m, 0) = P(m+1, 0) = m+1
        A(m, n+1) = P(m+1, n+1) = P(m+1, n) + 1 = A(m, n) + 1
        B(m, 0) = P(m, 0+1) = P(m) + 1 = m+1
        B(m, n+1) = P(m, n+1+1) = P(m, n+1) + 1 = B(m, n) + 1
        The source code for A and B are equivalent ⇒ A=B. QED
        
        Theorem: P(m, n) = P(n, m)
        Proof:
        P(0, 0) = 0
        P(m+1, 0) = P(m, 0) + 1
        P(0, n+1) = P(0, n) + 1
        P(m+1, n+1) = P(m+1, n) + 1 = P(m, n+1) + 1 = P(m, n) + 1 + 1
        The code of P is symmetrical now, so P(m,n)=P(n,m). QED
        cousin_it 20 Mar 2012 19:54 UTC
        0 points
        0
        Parent
        Don’t you need a theorem prover to go from commutativity of addition to f()+g()==g()+f()? I thought you were supposed to use only evaluations and source code comparisons on the code f()+g()==g()+f() directly...
        gRR 20 Mar 2012 20:04 UTC
        2 points
        0
        Parent
        Commutativity of addition is a correlation P1=P2, where P1(x,y)=x+y, and P2(x,y)=y+x.
        So, f()+g() = [using this correlation] = g()+f().
        
        But you’re right. In order to use this, I have to be able to calculate evaluations and comparisons on arbitrary pieces of code, not just the ones existing in U()...

Difference between CDT and ADT/​UDT as constant programs

Difference between CDT and ADT/UDT as constant programs