RickJS comments on Ingredients of Timeless Decision Theory

RickJS 10 Sep 2009 2:05 UTC
0 points
Inorite? What is that?

I suspect I’m not smart enough to play on this site. I’m quite unsure I can even parse your sentence correctly, and I can’t imagine a reason to adjust the external payoff matrices (they were given by Wei Dai; that is the original problem I’m discussing) so the internal payoff matrices match something. I’m baffled.
- Cyan 10 Sep 2009 2:41 UTC
  2 points
  “inorite”.
- thomblake 10 Sep 2009 13:05 UTC
  0 points
  See Cyan’s comment below. Do not be dispirited by lolspeak.
  
  Also, the reason to adjust the payoff matrices in the original problem is so that your ‘internal’ payoff matrices match those of Wei Dai’s problem, or to put it another way, consider the problem in the least convenient possible world. Basically, the prisoner’s dilemma is still there if you take the problem to be in utilons, which take into account things like your ‘compassion’ (in this case, valuing the reward given to the other person). I can’t quite figure out what your formula for discounting is above, so let me simplify...
  
  It would be remiss of me not to do the math, though it is not my forte:
  
  Suppose the matrix represents jelly beans for you or the opponent, each worth 1 utilon. Further suppose that you get .25 utilons for each jelly bean the opponent gets, due to your ‘compassion’. Now take this payoff matrix (in jelly beans, written as yours/opponent’s):
```
375/500 -150/600  
600/0     75/100  
```
  Which becomes in your ‘internal’ matrix (in utilons):
```
500/500   0/600  
600/0   100/100  
```
  Now cooperation is dominated by defection for the ‘compassionate’ person.
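  
  For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the rows are your move (cooperate, then defect), the columns are the opponent’s move, and each cell reads yours/opponent’s; that reading is my assumption, as are the names `COMPASSION`, `internal_utility` and `defection_dominates`. Only your own side of each cell is converted to utilons.
```python
# Assumption: 'compassion' is worth 0.25 utilons per jelly bean the opponent gets.
COMPASSION = 0.25

# Jelly-bean payoffs as (yours, opponent's).
# Keys are (your move, opponent's move); "C" = cooperate, "D" = defect.
# Assumption: rows of the matrix are your move, columns the opponent's.
jelly_beans = {
    ("C", "C"): (375, 500),
    ("C", "D"): (-150, 600),
    ("D", "C"): (600, 0),
    ("D", "D"): (75, 100),
}


def internal_utility(mine, theirs, compassion=COMPASSION):
    """Your utilons: 1 per jelly bean of yours, plus `compassion` per one of theirs."""
    return mine + compassion * theirs


# Your side of the 'internal' matrix, in utilons.
internal = {moves: internal_utility(*beans) for moves, beans in jelly_beans.items()}


def defection_dominates(utilities):
    """True if defecting beats cooperating against every opponent move."""
    return all(
        utilities[("D", other)] > utilities[("C", other)] for other in ("C", "D")
    )


for moves, utilons in internal.items():
    print(moves, utilons)
print("defection dominates:", defection_dominates(internal))
```
  Under those assumptions your side of the internal matrix comes out to 500, 0, 600 and 100, which matches the utilon matrix above, and defecting is better whichever move the opponent makes.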
  
  Someone please note if my numbers don’t work out—it’s early here.
  - RickJS 10 Sep 2009 23:23 UTC
    2 points
    Ah. Thanks! I think I get that.
    
    But maybe I just think I do. I thought I understood that narrow part of Wei Dai’s post on a problem that maybe defeats TDT. I had no idea that compassion had already been considered and compensated out of consideration. And that’s such common shared knowledge here in the LessWrong community that it need not be mentioned.
    
    I have a lot to learn. I now see I was very arrogant to think I could contribute here. I should read the archives & wiki before I post. I apologize.
    
    <<Begins to compute an estimated time to de-lurk. They collectively write several times faster than I can read, even if I don’t slow down to mull it over. Hmmm… >>