Correlated decision making: a complete theory

The title of this post most probably deserves a cautious question mark at the end, but I’ll go out on a limb and start sawing it off behind me: I think I’ve got a framework that consistently solves correlated decision problems. That is, those situations where different agents (a forgetful you at different times, your duplicates, or Omega’s prediction of you) will come to the same decision.

After my first post on the subject, Wei Dai asked whether my ideas could be formalised enough to be applied mechanically. There were further challenges: introducing positional information, and dealing with the difference between simulations and predictions. Since I claimed this sort of approach could apply to Newcomb’s problem, it is also useful to see it work in cases where the two decisions are only partially correlated: cases where Omega is good, but not perfect.

The theory

In standard decision making, it is easy to estimate your own contribution to your own utility; the contribution of others to your own utility is then estimated separately. In correlated decision-making, both steps are trickier; estimating your contribution is non-obvious, and the contribution from others is not independent. In fact, the question to ask is not “if I decide this, how much return will I make”, but rather “in a world in which I decide this, how much return will I make”.

You first estimate the contribution of each decision made to your own utility, using a simplified version of the correlated decision principle (CDP): if N correlated decisions are needed to gain some utility, then each decision maker is estimated to have contributed 1/N of the effort towards the gain of that utility.

Then the procedure under correlated decision making is:

1) Estimate the contribution of each correlated decision towards your utility, using CDP.

2) Estimate the probability that each decision actually happens (this is an implicit use of the self-indication assumption, SIA).

3) Use 1) and 2) to estimate the total utility that emerges from the decision.
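To make the procedure concrete, here is a minimal Python sketch of the bookkeeping; the function name and data layout are my own illustration, not part of the theory itself:

```python
def cdp_utility(decision_makers):
    """Sum utilities under the correlated decision procedure.

    decision_makers: list of (p_exists, contributions) pairs, where
    contributions is a list of (utility, n_decisions) pairs: a utility
    the decision maker helps produce, together with the number of
    correlated decisions needed to produce it.
    """
    total = 0.0
    for p_exists, contributions in decision_makers:
        for utility, n_decisions in contributions:
            # Step 1: CDP credits 1/n_decisions of the utility to this
            # decision maker; step 2: weight by existence probability.
            total += p_exists * utility / n_decisions
    return total  # step 3: total utility emerging from the decision
```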

To illustrate, apply it to the generalised absent-minded driver problem, where the returns for turning off at the first and second intersections are x and y respectively, while driving straight through grants a return of z. The expected return for going straight with probability p is R = (1-p)x + p(1-p)y + p²z.

Then the expected return for the driver at the first intersection is (1-p)x + [p(1-p)y + p²z]/2, since the y and z returns require two decisions before being claimed. The expected return for the second driver is [(1-p)y + pz]/2. The first driver exists with probability one, while the second driver exists with probability p, giving the correct return of R.
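As a sanity check, the sketch above reproduces R for arbitrary parameter values (the numbers below are purely illustrative):

```python
p, x, y, z = 0.6, 4.0, 1.0, 3.0  # arbitrary illustrative values

# The first driver exists with probability 1; the x return needs one
# decision, the y and z returns need two. The second driver exists
# with probability p.
first  = (1.0, [((1 - p) * x, 1), (p * (1 - p) * y, 2), (p**2 * z, 2)])
second = (p,   [((1 - p) * y, 2), (p * z, 2)])

R = (1 - p) * x + p * (1 - p) * y + p**2 * z
assert abs(cdp_utility([first, second]) - R) < 1e-12
```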

In the example given in Outlawing Anthropics, there are twenty correlated decision makers, all existing with probability 1/2. Two of them contribute towards a decision which has utility −52, hence each generates a utility of −52/2. Eighteen of them contribute towards a decision which has utility 12, hence each one generates a utility of 12/18. Summing this up, the total utility generated is [2*(-52)/2 + 18*(12)/18]/2 = −20, which is correct.
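The same bookkeeping in the terms of the sketch above (the grouping names are mine):

```python
loser  = (0.5, [(-52.0, 2)])   # each of the 2 contributes -52/2
winner = (0.5, [(12.0, 18)])   # each of the 18 contributes 12/18
total = cdp_utility(2 * [loser] + 18 * [winner])
assert abs(total - (-20.0)) < 1e-9
```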

Simulation versus prediction

In the Newcomb problem, there are two correlated decisions: your choice of one- or two-boxing, and Omega’s decision on whether to put the money in the box. The return to you for one-boxing in either case is X/2; for two-boxing, the return is 1000/2.

If Omega simulates you, you can be either decision maker, with probability 1/2; if he predicts without simulating, you are certainly the box-chooser. But it makes no difference: who you are is not an issue, you are simply looking at the probability of each decision maker existing, which is 1 in both cases. So adding up the two utilities gives you the correct estimate.

Consequently, predictions and simulations can be treated similarly in this setup.
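In the terms of the sketch above (X is set to an arbitrary illustrative value; the post itself only fixes the 1000):

```python
X = 1_000_000.0  # illustrative value for the opaque box's contents

# Two correlated decisions (yours and Omega's), each existing with
# probability 1, each credited half the resulting utility by CDP.
one_box = cdp_utility([(1.0, [(X, 2)])] * 2)        # X/2 + X/2 = X
two_box = cdp_utility([(1.0, [(1000.0, 2)])] * 2)   # 1000/2 + 1000/2
assert one_box == X and two_box == 1000.0
```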

Partial correlation

If two decisions are partially correlated—say, Newcomb’s problem where Omega has a probability p of correctly guessing your decision—then the way of modeling it is to split it into several perfectly correlated pieces.

For instance, the partially correlated Newcomb’s problem can be split into one model which is perfectly correlated (with probability p), and one model which is perfectly anti-correlated (with probability (1-p)). The return from two-boxing is 1000 in the first case, and X+1000 in the second case. One-boxing gives a return of X in the first case and 0 in the second case. Hence the expected return from one-boxing is pX, and from two-boxing is 1000 + (1-p)X, which are the correct values.
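A quick check of that split, with the weights made explicit (again, X = 1,000,000 is only an illustrative assumption):

```python
def partial_newcomb(p, X=1_000_000.0, prize=1000.0):
    # Weight p: perfectly correlated piece (Omega guesses right).
    # Weight 1 - p: perfectly anti-correlated piece (Omega guesses wrong).
    one_box = p * X + (1 - p) * 0.0
    two_box = p * prize + (1 - p) * (X + prize)
    return one_box, two_box

one_box, two_box = partial_newcomb(0.99)
assert one_box == 0.99 * 1_000_000.0
assert abs(two_box - (1000.0 + 0.01 * 1_000_000.0)) < 1e-6
```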

Adding positional information

SilasBarta asked whether my old model could deal with the absent-minded driver if there were some positional information. For instance, imagine there were a light at each crossing that could be red or green, and it was green 1/2 the time at the first crossing and 2/3 of the time at the second crossing. Then if your probability of continuing on a green light is g, and on a red light is r, your initial expected return is R = (2-r-g)x/2 + (r+g)D/2, where D = ((1-r)y + rz)/3 + 2((1-g)y + gz)/3.

Then if you are at the first intersection, your expected return must be (2-r-g)x/2 + (r+g)D/4 (CDP on y and z, which require two decisions), while if you are at the second intersection, your expected return is D/2. The first driver exists with certainty, while the second one exists with probability (r+g)/2, giving us the correct return R.
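A numerical spot-check of that decomposition over random parameter values (the parameter draws are purely illustrative):

```python
import random

random.seed(0)
for _ in range(1000):
    r, g = random.random(), random.random()
    x, y, z = random.random(), random.random(), random.random()
    D = ((1 - r) * y + r * z) / 3 + 2 * ((1 - g) * y + g * z) / 3
    R = (2 - r - g) * x / 2 + (r + g) * D / 2

    first = (2 - r - g) * x / 2 + (r + g) * D / 4  # exists w.p. 1
    second = ((r + g) / 2) * (D / 2)               # exists w.p. (r+g)/2
    assert abs(first + second - R) < 1e-12
```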