halcyon comments on Newcomb’s Problem and Regret of Rationality

halcyon 3 Jun 2012 10:03 UTC
1 point
Well, the more I think about this, the more it seems to me that we’re dealing with a classic case of the unspecified problem.
- You are standing on the observation deck of Starship Dog Poo orbiting a newly discovered planet. The captain inquires as to its color. What do you answer him?
- Uh, do I get to look at the planet?
- No.
- … Let me look up the most common color of planets across the universe.
In the given account, the ability attributed to our alien friend is not described in terms that are meaningful in any sense, but is instead ascribed to his “superintelligence”, which is totally irrelevant as far as our search for solutions is concerned. And yet, we’re getting distracted from the problem’s fundamentally unspecified nature by these yarn balls of superintelligence and paradoxical choice, which are automatically Clever Things to bring up in our futurist iconography.

If you think I’m mistaken, then I’d really appreciate criticism. Thanks!
- halcyon 3 Jun 2012 11:07 UTC
  1 point
  Parent
  (The above problem is actually a more sensible one since the relationship of the query to our cache of observed data is at least clear. Newcomb’s Problem, OTOH, leaves the domain of well-understood science completely behind. If, with our current scientific knowledge, we find the alien’s ability utterly baffling at the stage of understanding his methods which the problem has set out for us, then it would be sheer hubris to label either choice “rational”, because if the very basis for such a judgement exists, then I for one cannot see it. What if you pick B and it turns out to be empty? If that is impossible, then what are the details of the guarantee that that outcome could never occur? The problem’s wrappings, so to speak, makes this look like an incomprehensible matter of faith to me. If I have misunderstood something, could someone smart please explain it to me?)
  
  (At the very least, it must be admitted that in our current understanding of the universe, a world of chaotic systems and unidirectional causality, a perfect predictor’s algorithm is a near-impossibility, “superintelligence” or no. All this reminds me of what Eliezer said in his autobiographical sequence: If you want to treat a complete lack of understanding of a subject as an unknown variable and shift it around at your own convenience, then there are definitely limits to that kind of thing.)
  
  (Based on a recommendation, I am now reading Yudkowsky’s paper on Timeless Decision Theory. I’m 7 pages in, but before I come across Yudkowsky’s solution, I’d like to note that choosing evidential decision theory over causal decision theory or vice-versa, in itself, looks like a completely arbitrary decision to me. Based on what objective standards could either side possibly justify its subjective priorities as being more “rational”?)
  - CuSithBell 3 Jun 2012 16:16 UTC
    2 points
    Parent
    Well, it’s a thought experiment, involving the assumption of some unlikely conditions. I think the main point of the experiment is the ability to reason about what decisions to make when your decisions have “non-causal effects”—there are conditions that will arise depending on your decisions, but that are not caused in any way by the decisions themselves. It’s related to Kavka’s toxin and Parfit’s hitchhiker.
    - halcyon 3 Jun 2012 16:54 UTC
      −2 points
      Parent
      But even thought experiments ought to make sense, and I’m not yet convinced this one does, for the reasons I’ve been ranting about. If the problem does not make sense to begin with, what is its “answer” worth? For me, this is like seeing the smartest minds in the world divided over whether 5 + Goldfish = Sky or 0. I’m asking what the operator “+” signifies in this context, but the problem is carefully crafted to make that very question seem like an unfair imposition.
      
      Here, the power ascribed to the alien, without further clarification, appears incoherent to me. Which mental modules, or other aspects of reality, does it read to predict my intentions? Without that being specified, this remains a trick question. Because if it directly reads my future decision, and that decision does not yet exist, then causality runs backwards. And if causality runs backwards, then the money already being in box B or not makes no difference, because your actual decision NOW is going to determine whether it will have been placed there in the past. So if you’re defying causality, and then equating reason with causality, then obviously the “irrational”, ie. acausal, decision will be rewarded, because the acausal decision is the calculating one. God I wish I could draw a chart in here.
      - CuSithBell 4 Jun 2012 0:10 UTC
        2 points
        Parent
        The power, without further clarification, is not incoherent. People predict the behavior of other people all the time.
        
        Ultimately, in practical terms the point is that the best thing to do is “be the sort of person who picks one box, then pick both boxes,” but that the way to be the sort of person that picks one box is to pick one box, because your future decisions are entangled with your traits, which can leak information and thus become entangled with other peoples’ decisions.
        halcyon 4 Jun 2012 12:18 UTC
        0 points
        Parent
        
        People predict the behavior of other people all the time.
        
        And they’re proved wrong all the time. So what you’re saying is, the alien predicts my behavior using the same superficial heuristics that others use to guess at my reactions under ordinary circumstances, except he uses a more refined process? How well can that kind of thing handle indecision if my choice is a really close thing? If he’s going with a best guess informed by everyday psychological traits, the inaccuracies of his method would probably be revealed before long, and I’d be at the numbers immediately.
        
        “be the sort of person who picks one box, then pick both boxes”
        
        I agree, I would pick both boxes if that were the case, hoping I’d lived enough of a one box picking life before.
        
        but that the way to be the sort of person that picks one box is to pick one box, because your future decisions are entangled with your traits, which can leak information and thus become entangled with other peoples’ decisions.
        
        I beg to differ on this point. Whether or not I knew I would meet Dr. Superintelligence one day, an entire range of more or less likely behaviors is very much conceivable that violate this assertion, from “I had lived a one box picking life when comparatively little was at stake,” to “I just felt like picking differently that day.” You’re taking your reification of selfhood WAY too far if you think Being a One Box Picker by picking one box when the judgement is already over makes sense. I’m not even sure I understand what you’re saying here, so please clarify if I’ve misunderstood things. Unlike my (present) traits, my future decisions don’t yet exist, and hence cannot leak anything or become entangled with anyone.
        
        But what this disagreement boils down to is, I don’t believe that either quality is necessarily manifest in every personality with anything resembling steadfastness. For instance, I neither see myself as the kind of person who would pick one box, nor as the kind who would pick both boxes. If the test were administered to me a hundred times, I wouldn’t be surprised to see a 50-50 split. Surely I would be exaggerating if I said you claim that I already belong to one of these two types, and that I’m merely unaware of my true inner box-picking nature? If my traits haven’t specialized into either category, (and I have no rational motive to hasten the process) does the alien place a million dollars or not? I pity the good doctor. His dilemma is incomparably more black and white than mine.
        
        To summarize, even if I have mostly picked one box in similar situations in the past, how concrete is such a trait? This process comes nowhere near the alien’s implied infallibility, it seems to me. Therefore, either this process or the method’s imputed infallibility has got to go if his power is to be coherent.
        
        Not only that, if that’s all there is to the alien’s ability, what does this thought experiment say, except that it’s indeed possible for a rational agent to reward others for their past irrationality? (to grant the most meaningful conclusion I DO perceive) That doesn’t look like a particularly interesting result to me. Such figures are seen in authoritarian governments, religions, etc.
        CuSithBell 4 Jun 2012 14:35 UTC
        1 point
        Parent
        
        Unlike my (present) traits, my future decisions don’t yet exist, and hence cannot leak anything or become entangled with anyone.
        
        Your future decisions are entangled with your present traits, and thus can leak. If you picture a Bayesian network with the nodes “Current Brain”, “Future Decision”, and “Current Observation”, with arrows from Current Brain to the two other nodes, then knowing the value of Current Observation gives you information about Future Decision.
        
        Obviously the alien is better than a human at running this game (though, note that a human would only have to be right a little more than 50% of the time to make one-boxing have the higher expected value—in fact, that could be an interesting test to run!). Perhaps it can observe your neurochemistry in detail and in real time. Perhaps it simulates you in this precise situation, and just sees whether you pick one or both boxes. Perhaps land-ape psychology turns out to be really simple if you’re an omnipotent thought-experiment enthusiast.
        
        The reasoning wouldn’t be “this person is a one-boxer” but rather “this person will pick one box in this particular situation”. It’s very difficult to be the sort of person who would pick one box in the situation you are in without actually picking one box in the situation you are in.
        
        One use of the thought experiment, other than the “non-causal effects” thing, is getting at this notion that the “rational” thing to do (as you suggest two-boxing is) might not be the best thing. If it’s worse, just do the other thing—isn’t that more “rational”?
        halcyon 4 Jun 2012 17:43 UTC
        0 points
        Parent
        
        knowing the value of Current Observation gives you information about Future Decision.
        
        Here I’d just like to note that one must not assume all subsystems of Current Brain remain constant over time. And what if the brain is partly a chaotic system? (AND new information flows in all the time… Sorry, I cannot condone this model as presented.)
        
        Perhaps it can observe your neurochemistry in detail and in real time.
        
        I already mentioned this possibility. Fallible models make the situation gameable. I’d get together with my friends, try to figure out when the model predicts correctly, calculate its accuracy, work out a plan for who picks what, and split the profits between ourselves. How’s that for rationality? To get around this, the alien needs to predict our plan and—do what? Our plan treats his mission like total garbage. Should he try to make us collectively lose out? But that would hamper his initial design.
        
        (Whether it cares about such games or not, what input the alien takes, when, how, and what exactly it does with said input—everything counts in charting an optimal solution. You can’t just say it uses Method A and then replace it with Method B when convenient. THAT is the point: Predictive methods are NOT interchangeable in this context. (Reminder: Reading my brain AS I make the decision violates the original conditions.))
        
        Perhaps land-ape psychology turns out to be really simple if you’re an omnipotent thought-experiment enthusiast.
        
        We’re veering into uncertain territory again… (Which would be fine if it weren’t for the vagueness of mechanism inherent in magical algorithms.)
        
        The reasoning wouldn’t be “this person is a one-boxer” but rather “this person will pick one box in this particular situation”.
        
        Second note: An entity, alien or not, offering me a million dollars, or anything remotely analogous to this, would be a unique event in my life with no precedent whatever. My last post was written entirely under the assumption that the alien would be using simple heuristics based on similar decisions in the past. So yeah, if you’re tweaking the alien’s method, then disregard all that.
        
        It’s very difficult to be the sort of person who would pick one box in the situation you are in without actually picking one box in the situation you are in.
        
        From the alien’s point of view, this is epistemologically non-trivial if my box-picking nature is more complicated than a yes-no switch. Even if the final output must take the form of a yes or a no, the decision tree that generated that result can be as endlessly complex as I want, every step of which the alien must predict correctly (or be a Luck Elemental) to maintain its reputation of infallibility.
        
        If it’s worse, just do the other thing—isn’t that more “rational”?
        
        As long as I know nothing about the alien’s method, the choice is arbitrary. See my second note. This is why the alien’s ultimate goals, algorithms, etc, MATTER.
        
        (If the alien reads my brain chemistry five minutes before The Task, his past history is one of infallibility, and no especially cunning plan comes to mind, then my bet regarding the nature of brain chemistry would be that not going with one box is silly if I want the million dollars. I mean, he’ll read my intentions and place the money (or not) like five minutes before… (At least that’s what I’ll determine to do before the event. Who knows what I’ll end up doing once I actually get there. (Since even I am unsure as to the strength of my determination to keep to this course of action once I’ve been scanned, the conscious minds of me and the alien are freed from culpability. Whatever happens next, only the physical stance is appropriate for the emergent scenario. ((“At what point then, does decision theory apply here?” is what I was getting at.) Anyway, enough navel-gazing and back to Timeless Decision Theory.))))
        CuSithBell 5 Jun 2012 18:12 UTC
        0 points
        Parent
        
        knowing the value of Current Observation gives you information about Future Decision.
        
        Here I’d just like to note that one must not assume all subsystems of Current Brain remain constant over time. And what if the brain is partly a chaotic system? (AND new information flows in all the time… Sorry, I cannot condone this model as presented.)
        
        Well… okay, but the point I was making was milder and pretty uncontroversial. Are you familiar with bayesian networks?
        
        Perhaps it can observe your neurochemistry in detail and in real time.
        
        I already mentioned this possibility. Fallible models make the situation gameable. I’d get together with my friends, try to figure out when the model predicts correctly, calculate its accuracy, work out a plan for who picks what, and split the profits between ourselves. How’s that for rationality? To get around this, the alien needs to predict our plan and—do what? Our plan treats his mission like total garbage. Should he try to make us collectively lose out? But that would hamper his initial design.
        
        (Whether it cares about such games or not, what input the alien takes, when, how, and what exactly it does with said input—everything counts in charting an optimal solution. You can’t just say it uses Method A and then replace it with Method B when convenient. THAT is the point: Predictive methods are NOT interchangeable in this context. (Reminder: Reading my brain AS I make the decision violates the original conditions.))
        
        I never said it used method A? And what is all this about games? It predicts your choice.
        
        You’re not engaging with the thought experiment. How about this—how would you change the thought experiment to make it work properly, in your estimation?
        
        Perhaps land-ape psychology turns out to be really simple if you’re an omnipotent thought-experiment enthusiast.
        
        We’re veering into uncertain territory again… (Which would be fine if it weren’t for the vagueness of mechanism inherent in magical algorithms.)
        
        Well, yeah. We’re in uncertain territory as a premise.
        
        The reasoning wouldn’t be “this person is a one-boxer” but rather “this person will pick one box in this particular situation”.
        
        Second note: An entity, alien or not, offering me a million dollars, or anything remotely analogous to this, would be a unique event in my life with no precedent whatever. My last post was written entirely under the assumption that the alien would be using simple heuristics based on similar decisions in the past. So yeah, if you’re tweaking the alien’s method, then disregard all that.
        
        I’m not tweaking the method. There is no given method. The closest to a canonical method that I’m aware of is simulation, which you elided in your reply.
        
        It’s very difficult to be the sort of person who would pick one box in the situation you are in without actually picking one box in the situation you are in.
        
        From the alien’s point of view, this is epistemologically non-trivial if my box-picking nature is more complicated than a yes-no switch. Even if the final output must take the form of a yes or a no, the decision tree that generated that result can be as endlessly complex as I want, every step of which the alien must predict correctly (or be a Luck Elemental) to maintain its reputation of infallibility.
        
        What makes you think you’re so special—compared to the people who’ve been predicted ahead of you?
        
        If it’s worse, just do the other thing—isn’t that more “rational”?
        
        As long as I know nothing about the alien’s method, the choice is arbitrary. See my second note. This is why the alien’s ultimate goals, algorithms, etc, MATTER.
        
        If you know nothing about the alien’s methods, there still is a better choice. You do not have the same expected value for each choice.
        
        (If the alien reads my brain chemistry five minutes before The Task, his past history is one of infallibility, and no especially cunning plan comes to mind, then my bet regarding the nature of brain chemistry would be that not going with one box is silly if I want the million dollars. I mean, he’ll read my intentions and place the money (or not) like five minutes before… (At least that’s what I’ll determine to do before the event. Who knows what I’ll end up doing once I actually get there. (Since even I am unsure as to the strength of my determination to keep to this course of action once I’ve been scanned, the conscious minds of me and the alien are freed from culpability. Whatever happens next, only the physical stance is appropriate for the emergent scenario. ((“At what point then, does decision theory apply here?” is what I was getting at.) Anyway, enough navel-gazing and back to Timeless Decision Theory.))))
      - asparisi 3 Jun 2012 19:53 UTC
        2 points
        Parent
        If you assume that you are a physical system and that the alien is capable of modeling that system under a variety of circumstances, there is no contradiction. The alien simply has a device that creates an effective enough simulation of you that it is able to reliably predict what will happen when you are presented with the problem. Causality isn’t running backwards then, it’s just that the alien’s model is close enough to reality that it can reliably predict your behavior in advance. So it’s:
        
        (You[t0])>(Alien’s Model of You)>(Set Up Box)>(You[t1])
        
        If the alien’s model of you is accurate enough, then it will pick out the decision you will make in advance (or at least, is likely to with an extraordinarily high probability) but that doesn’t violate causality any more than me offering to take my girlfriend out for chinese does because I predict that she will say yes. If accurate models broke causality then causality would have snuffed out of existence somewhere around the time the first brain formed, maybe earlier.
        halcyon 4 Jun 2012 9:51 UTC
        1 point
        Parent
        You don’t seem to understand what I’m getting at. I’ve already addressed this ineptly, but at some length. If causality does not run backwards, then the actual set of rules involved in the alien’s predictive method, the mode of input it requires from reality, its accuracy, etc, become the focus of calculation. If nothing is known about this stuff, then the problem has not been specified in sufficient detail to propose customized solutions, and we can only make general guesses as to the optimal course of action. (lol The hubris of trying to outsmart unimaginably advanced technology as though it were a crude lie detector reminds me of Artemis Fowl. The third book was awesome.) I only mentioned one ungameable system to explain why I ruled it out as being a trivial consideration in the first place. (Sorry, it isn’t Sunday. No incomprehensible ranting today, only tangents involving childrens’ literature.)
      - drethelin 3 Jun 2012 20:04 UTC
        1 point
        Parent
        It’s more useful to view this as a problem involving source code. The alien is powerful enough to read your code, to know what you would do in any situation. This means that it’s in your own self interest to modify your source-code to one-box.
        halcyon 4 Jun 2012 9:52 UTC
        −2 points
        Parent
        That begs the question as to whether anything analogous to “code” exists, whether anything is modifiable simply by willing it, etc. What if my mind looks like it’s going to opt for B when the alien reads me, and I change my mind by the time it’s my turn to choose? If no such thing ever happens, the problem ought to specify why that is the case, because I don’t buy the premise as it stands.
      - halcyon 3 Jun 2012 17:25 UTC
        0 points
        Parent
        To whoever keeps downvoting my comments: The faster I get to negative infinity, the happier I’ll be, but care to explain why?
        asparisi 3 Jun 2012 19:40 UTC
        0 points
        Parent
        Now I am tempted to downvote your comments just to make you happy. :)
        wedrifid 3 Jun 2012 22:16 UTC
        −1 points
        Parent
        
        Now I am tempted to downvote your comments just to make you happy. :)
        
        I’m tempted to downvote his comments despite it making him happy. I have no wish to reward self-described anti-social behavior but the effect of making said behavior invisible seems to make the alleged ‘reward’ of the desired downvotes may make it worthwhile on net.