moridinamael comments on Simulate and Defer To More Rational Selves

moridinamael 9 Sep 2014 20:46 UTC
51 points
I have found that the more I use my simulation of HPMOR!Quirrell for advice, the harder it is to shut him up. As with any mental discipline, thinking in particular modes wears thought-grooves into your brain’s hardware, and before you know it you’ve performed an irreversible self-modification. Consequently, I would definitely recommend that anybody attempting to supplant their own personality (for lack of a better phrasing) with a model of some idealized reasoner try to make sure that the idealized reasoner shares your values as thoroughly as possible.
- Nomad 9 Sep 2014 22:55 UTC
  87 points
  Parent
  I’ve now got this horrifying idea that this has been Quirrell’s plan all along: to escape from HPMOR to the real world by tempting you to simulate him until he takes over your mind.
  - polymathwannabe 10 Sep 2014 22:26 UTC
    28 points
    Parent
    Hmm, so the Fanfiction.net website is his horcrux?
  - Vulture 11 Sep 2014 0:59 UTC
    12 points
    Parent
    In retrospect, I’m kind of glad that my plan to make a Quirrell-tulpa never got off the ground.
    - alicey 15 Jan 2016 1:47 UTC
      2 points
      Parent
      afaict the quirrell tulpa is one of the more common types of tulpas. if you have one, do not use it. it is secretly voldemort and will destroy your soul.
  - notsonewuser 11 Sep 2014 2:35 UTC
    9 points
    Parent
    But Quirrell didn’t cause Eliezer to write HPMOR...
    - therufs 11 Sep 2014 3:41 UTC
      28 points
      Parent
      It’s to Quirrell’s advantage that you believe that, of course.
    - DanArmak 14 Sep 2014 17:11 UTC
      19 points
      Parent
      Beware acausal trade! Once Eliezer imagined Quirrell, he had to write HPMOR to stop Quirrell from counterfactually simulating 3^^^3 dustspeckings.
      - Eliezer Yudkowsky 17 Sep 2014 18:11 UTC
        20 points
        Parent
        Rational agents cannot be successfully blackmailed by other agents that simulate them accurately, and especially not by figments of their own imagination.
        skeptical_lurker 17 Sep 2014 21:10 UTC
        3 points
        Parent
        Are you implying that rational agents can be successfully blackmailed by other agents that simulate them inaccurately? (This does seem plausible to me, and is an interesting rare example of accurate knowledge posing a hazard.)
        Armok_GoB 7 Oct 2014 22:20 UTC
        2 points
        Parent
        Well, that’s quite obvious. Just imagine the blackmailer is a really stupid human with a big gun who’d fall for blackmail in a variety of awful ways, has a bad case of typical mind fallacy, and, if anything goes against their expectations, gets angry and just shoots the victim before thinking through the consequences.
        skeptical_lurker 8 Oct 2014 2:24 UTC
        3 points
        Parent
        It’s kinda obvious, but deeply counter-intuitive. I mean, it’s a situation where stupidity is a decisive advantage!
        Lumifer 8 Oct 2014 14:57 UTC
        7 points
        Parent
        
        it’s a situation where stupidity is a decisive advantage!
        
        Not quite stupidity—irrationality. And it is well-known that (credible) irrationality can be a big advantage in negotiations and other game theory scenarios. Essentially, if I’m irrational then you cannot simulate me accurately and cannot predict what I will do which means that your risk aversion pushes you towards safe choices which limit your downside at the cost of your upside. And if it’s a zero-sum game, I get this upside.
        
        Of course, I need to be credible in showing my irrationality.
        
        The reason such a strategy is not used more often is that (a) often there is the option to walk away, which many people take when faced with an irrational counterparty; and (b) when two irrational counterparties meet, bad things happen :-)
        gjm 8 Oct 2014 23:41 UTC
        3 points
        Parent
        There are instances where (arguably) irrationality confers a big game-theoretic advantage even though you’re predictable.
        
        For instance, suppose you’re leading a nuclear superpower. If you can make it credibly clear that you really truly would be happy to launch World War Three if the other guys don’t back down, then they probably will. Not because they can’t predict your actions, but because they can.
        
        In this sort of case it’s either debatable whether it’s really irrationality, or debatable whether it’s really a game-theoretic advantage. If you can really be sure that the other guys will back down, then maybe it’s not irrationality because you never have to blow up the world. If you can’t, then maybe you don’t have a game-theoretic advantage after all because if you play this game often enough then the other guys call your bluff, you push the big red button, and everyone dies.
        
        [EDITED to add: I think this sort of case is nearer to the example discussed upthread than the sort where unpredictability is key.]
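
The argument above can be made concrete with a toy game of chicken. This is only a sketch: the payoff numbers, the move names, and the functions `best_response` and `expected_row_payoff` are illustrative assumptions, not anything stated in the thread. It shows that a predictable, committed "escalate" player gets the other side to back down, but that even a small chance of the bluff being called makes the commitment a losing bet.

```python
# Hypothetical payoffs for a two-player game of chicken (illustrative numbers only).
# PAYOFFS[(row_move, col_move)] = (row_payoff, col_payoff)
PAYOFFS = {
    ("back_down", "back_down"): (0, 0),
    ("back_down", "escalate"): (-1, 1),
    ("escalate", "back_down"): (1, -1),
    ("escalate", "escalate"): (-100, -100),  # everyone dies
}
MOVES = ("back_down", "escalate")


def best_response(committed_move):
    """Column player's best reply when the row player's move is known in advance."""
    return max(MOVES, key=lambda move: PAYOFFS[(committed_move, move)][1])


def expected_row_payoff(p_call_bluff):
    """Row player's expected payoff from always escalating, when the
    column player escalates anyway with probability p_call_bluff."""
    win = PAYOFFS[("escalate", "back_down")][0]
    disaster = PAYOFFS[("escalate", "escalate")][0]
    return (1 - p_call_bluff) * win + p_call_bluff * disaster


if __name__ == "__main__":
    print(best_response("escalate"))   # the predictable hardliner is conceded to
    print(expected_row_payoff(0.0))    # opponent always backs down
    print(expected_row_payoff(0.05))   # opponent sometimes calls the bluff
```

With these made-up numbers the committed player comes out ahead only if the other side calls the bluff less than about 1% of the time, which is the tension the comment describes.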
        Lumifer 9 Oct 2014 0:14 UTC
        0 points
        Parent
        
        For instance, suppose you’re leading a nuclear superpower. If you can make it credibly clear that you really truly would be happy to launch World War Three
        
        That’s more like sheer bloodymindedness X-) not irrationality.
        
        then the other guys call your bluff, you push the big red button, and everyone dies.
        
        Yeah, it’s called the game of chicken and that’s a slightly different thing.
        dankane 17 Sep 2014 19:14 UTC
        3 points
        Parent
        I think you mean that rational agents cannot be successfully blackmailed by other agents when it is common knowledge that those agents can simulate them accurately and will only use blackmail if they predict it to be successful. All of this, of course, in the absence of mitigating circumstances (including, for example, the theoretical likelihood of other agents that reward you for counterfactually giving in to blackmail under these circumstances).
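
A minimal sketch of the conditions stated above, under loudly simplified assumptions: the victim is a fixed policy, the blackmailer only makes a threat when its prediction says the victim will pay, and a threat that is made and refused gets carried out. The names `accurate`, `deluded`, `victim_outcome`, and the payoff values are hypothetical, chosen only for illustration.

```python
def never_pay():
    """Victim policy: refuse every demand."""
    return "pay" if False else "refuse"


def always_pay():
    """Victim policy: give in to every demand."""
    return "pay"


def accurate(policy):
    """Blackmailer that simulates the victim correctly by running the policy."""
    return policy()


def deluded(policy):
    """Blackmailer with a bad model: assumes everyone pays (typical mind fallacy)."""
    return "pay"


def victim_outcome(policy, predict=accurate, payment=10, damage=50):
    """Victim's payoff. Payment and damage sizes are made-up numbers."""
    if predict(policy) != "pay":
        return 0  # blackmail predicted to fail, so it is never attempted
    # Blackmail is attempted; a refused threat is assumed to be carried out.
    return -payment if policy() == "pay" else -damage


if __name__ == "__main__":
    print(victim_outcome(never_pay))                   # accurate simulator: left alone
    print(victim_outcome(always_pay))                  # accurate simulator: pays up
    print(victim_outcome(never_pay, predict=deluded))  # inaccurate simulator: threat executed
```

Against the accurate simulator the refusing policy is never targeted, while against the inaccurate one it does worse than paying, which is the asymmetry the thread is arguing about.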
        Philip_W 16 Jun 2015 5:51 UTC
        1 point
        Parent
        That doesn’t seem true. How can the victim know for sure that the blackmailer is simulating them accurately or being rational?
        
        Suppose you get mugged in an alley by random thugs. Which of these outcomes seems most likely:
        
        You give them the money, they leave.
        
        You lecture them about counterfactual reasoning, they leave.
        
        You lecture them about counterfactual reasoning, they stab you.
        
        Any agent capable of appearing irrational to a rational agent can blackmail that rational agent. This decreases the probability of agents which appear irrational being irrational, but not necessarily to the point that you can dismiss them.
        Decius 18 Sep 2014 1:26 UTC
        1 point
        Parent
        Why not? Are rational agents generally immune to blackmail, or is it not strictly advantageous to be able to simulate another agent accurately?
        Tintinnabulation 16 Dec 2014 20:42 UTC
        0 points
        Parent
        I think it basically comes down to this: if the rational agent recognizes that the rational thing to do is to NOT buckle under blackmail, regardless of what the rational agent simulating them threatens, then the blackmailer’s simulation of the blackmailee will also not respond to that pressure, and so it’s pointless to go to the effort of pressuring them in the first place. However, if the blackmailer is irrational, their simulation of the blackmailee will be irrational, and thus they will carry through with the threat. This means that the blackmailee’s simulation of the blackmailer as rational is itself inaccurate, as the simulation does not correspond to reality. If the blackmailee is irrational, their simulation of the blackmailer will be irrational, and thus they will concede to their demands. Yet each party acts as if their simulation of the other were correct, until actual, photon-transmitted information about the world can impress itself into their cognitive function. So, no one gets what they want. The best choice for a rational agent here is just to ignore the good professor. On the other hand, you can’t argue with results. And there’s a simulation of Quirrell s-quirreled away in your brain, whispering.
        Decius 25 Dec 2014 6:02 UTC
        0 points
        Parent
        It looks like you are saying that both rational and irrational agents model competitors as behaving in the same way they do.
        
        Is that why you think that an irrational simulation of a rational agent must be wrong, and why a rational simulation of an irrational agent must be wrong? I suggest that an irrational agent can correctly model even a perfectly rational one.
        johnlawrenceaspden 23 Mar 2016 17:43 UTC
        0 points
        Parent
        sorry
  - Algernoq 21 Sep 2014 2:42 UTC
    3 points
    Parent
    Worryingly, this sounds like a good deal—getting skills for faster power/control increase, keeping continuity of consciousness, and increasing the odds of escaping from this reality into the next higher one...
- MichaelVassar 18 Sep 2014 18:33 UTC
  9 points
  Parent
  Possibly valuable to talk with Robin Hanson and me for revision to HPMOR!Quirrell decision procedures from the source?
  - moridinamael 18 Sep 2014 18:40 UTC
    1 point
    Parent
    I would give a finger from my wand hand for such an opportunity.
  - johnlawrenceaspden 23 Mar 2016 17:39 UTC
    0 points
    Parent
    I bid two.
- LoganStrohl 17 Sep 2014 13:26 UTC
  6 points
  Parent
  This whole comment thread is utterly delightful.