“I believe X to be like me” ⇒ “whatever I decide, X will decide also” seems tenuous without some proof of likeness that is beyond any guarantee possible in humans.
I can accept your analysis in the context of actors who have irrevocably committed to some mechanically predictable decision rule; with perfect information on all the causal inputs to the rule, that commitment gives me perfect predictions of their behavior. But I’m not sure such an actor could ever trust its understanding of an actual human.
Maybe you could aspire to such determinism in a proven-correct software system running on proven-robust hardware.
Well, yeah, this is primarily a theory for AIs dealing with other AIs.
You could possibly talk about human applications if you knew that the N of you had the same training as rationalists, or if you assigned probabilities to the others having such training.
For X to be able to model the decisions of Y with 100% accuracy, wouldn’t X require a model more sophisticated than Y itself?
If so, why would supposedly symmetrical models retain this symmetry?
Nope. http://arxiv.org/abs/1401.5577
Let’s play a little game; you and an opponent, 10 rounds of the prisoner’s dilemma. It will cost you each $5 to play, with the following payouts on each round:
(C,C) = $0.75 each
(C,D) = $1.00 for D, $0 for C
(D,D) = $0.25 each
Conventional game theory says both people walk away with $2.50 (a net loss of $2.50 on the $5 entry) and a grudge against each other, and I, running the game, pocket the difference.
Your opponent is Eliezer Yudkowsky.
How much money do you expect to have after the final round?
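To pin down the arithmetic, here is a minimal sketch of the proposed game. The payout table is taken from the comment above; the variable names and the convention that “walk away with” means the gross payout before the $5 entry fee are my framing:

```python
# Sketch of the proposed game: 10 rounds, $5 entry each,
# per-round payouts as listed above (my framing of the comment).
ROUNDS = 10
ENTRY = 5.00

# (my move, their move) -> my per-round payout
PAYOFF = {
    ("C", "C"): 0.75,
    ("C", "D"): 0.00,
    ("D", "C"): 1.00,
    ("D", "D"): 0.25,
}

def gross(my_moves, their_moves):
    """Total payout to 'me' over all rounds, before the entry fee."""
    return sum(PAYOFF[(m, t)] for m, t in zip(my_moves, their_moves))

all_c = ["C"] * ROUNDS
all_d = ["D"] * ROUNDS
print(gross(all_c, all_c))  # 7.5  (mutual cooperation: net +$2.50)
print(gross(all_d, all_d))  # 2.5  (mutual defection: net -$2.50)
```

This reproduces the two figures in the thread: $2.50 for the conventional all-defect outcome and $7.50 for ten rounds of mutual cooperation.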
But that’s not the true PD.
The statistical predictability of human behavior in less extreme circumstances is a much weaker constraint. I thought the (very gentle) PD presented sufficed to make the point that prediction is not impossible even in a real-world scenario.
I don’t know that I have confidence in even you to cooperate on the True PD—sorry. A hypothetical transhuman Bayesian intelligence with your value system? Quite possibly.
Well, let me put it this way—if my opponent is Eliezer Yudkowsky, I would be shocked to walk away with anything but $7.50.
Well, obviously. But the more interesting question is what if you suspect, but are not certain, that your opponent is Eliezer Yudkowsky? Assuming identity makes the problem too easy.
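The “suspect, but not certain” version can be made quantitative with a toy model (entirely my construction, not from the thread): suppose with probability p the opponent is a grim cooperator — someone “like you” who cooperates until first defected on — and otherwise an unconditional defector. Then the break-even credence for opening with cooperation is easy to compute:

```python
# Toy model for the uncertain-opponent case (assumptions are mine):
# with probability p the opponent is a grim cooperator, otherwise
# an unconditional defector. Per-round payouts as in the game above.
ROUNDS = 10
CC, CD, DC, DD = 0.75, 0.00, 1.00, 0.25  # my payout for (me, them)

def ev_open_c(p):
    """Open with C, defect forever after any betrayal."""
    vs_twin = ROUNDS * CC                 # 10 mutual cooperations: 7.50
    vs_defector = CD + (ROUNDS - 1) * DD  # burned once, then (D,D): 2.25
    return p * vs_twin + (1 - p) * vs_defector

def ev_open_d(p):
    """Defect from round one."""
    vs_twin = DC + (ROUNDS - 1) * DD      # exploit once, then (D,D): 3.25
    vs_defector = ROUNDS * DD             # 10 mutual defections: 2.50
    return p * vs_twin + (1 - p) * vs_defector

# The EV difference is linear in p; solve for the break-even credence.
diff0 = ev_open_c(0) - ev_open_d(0)   # -0.25
diff1 = ev_open_c(1) - ev_open_d(1)   # +4.25
threshold = -diff0 / (diff1 - diff0)
print(round(threshold, 3))  # 0.056
```

On this model, cooperating pays whenever you assign even a ~6% credence to the opponent being “like you” — consistent with the thread’s intuition that these gentle stakes make the problem too easy.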
My position is that I’d expect a reasonable chance that an arbitrary, frequent LW participant playing this game against you would also end with 10 (C,C)s. I’d suggest actually running this as an experiment if I didn’t think I’d lose money on the deal...
Harsher dilemmas (more meaningful stake, loss from an unreciprocated cooperation that may not be recoverable in the remaining iterations) would make me increasingly hesitant to assume “this person is probably like me”.
This makes me feel like I’m in “no true Scotsman” territory; nobody “like me” would fail to optimistically attempt cooperation. But if caring more about the difference in outcomes makes me less optimistic about other-similarity, then in a hypothetical where I am matched up against essentially myself (but I don’t know this), I defeat myself exactly when it matters—when the payoff is the highest.
And this is exactly the problem: if your behavior on the prisoner’s dilemma changes with the size of the outcome, then you aren’t really playing the prisoner’s dilemma. Your calculation in the low-payoff case was being confused by other terms in your utility function, terms for being someone who cooperates—terms that didn’t scale.
Yes, my point was that my variable skepticism is surely evidence of bias or rationalization, and that we can’t learn much from “mild” PD. I do also agree that warm fuzzies from being a cooperator don’t scale.
If we wanted to be clever we could include Eliezer playing against himself (just report his own move back to him as his opponent’s) as a possibility, though if it’s a high probability that he faces himself it seems pointless.
I’d be happy to front the (likely loss of) $10.
It might be possible to make it more like the true prisoner’s dilemma if we could come up with two players, each of whom wants the money donated to a cause that they consider worthy but the other player opposes or considers ineffective.
Though I have plenty of paperclips, sadly I lack the resources to successfully simulate Eliezer’s true PD...
Meaningful results would probably require several iterations of the game, though, with different players (also, the expected loss in my scenario was $5 per game).
I seem to recall Douglas Hofstadter did an experiment with several of his more rational friends, and was distressed by the globally rather suboptimal outcome. I do wonder if we on LW would do better, with or without Eliezer?