On a personal level, getting yourself to where you are using a functional decision theory is very much worth it, as is helping others to get there with you – it’s good even on your own, but the more people use one, the better it does.
I think this is far too sanguine with regard to our understanding of decision theory. See The Commitment Races problem for one example of a serious problem that, AFAIK, isn’t solved by any of the currently proposed decision theories, including FDT, and advocating that more people adopt FDT before solving the problem might even make it worse (if people think that FDT implies making earlier, more hasty commitments, in a wider range of circumstances).
Or at a minimum, we need to give the proper disdain to those who are advocating policies that would result in handing the world to men like Putin.
I’m very unsure what the rational response (for the West) to Putin’s actions and threats is. Among other considerations, what if there’s, say, a 10% chance that Putin actually assigns less disutility to destroying the world than to failing to restore Russia’s historical territories (or he can be reasonably modeled as such)? In that case his nuclear threats would not be bluffs, and letting him “win” is perhaps the rational thing to do.
Yes, such a policy would eventually hand the world to “men like Putin”, but the alternative policy would potentially incur an immediate 10% chance of destroying the world, perhaps increasing to 100% over time as we keep calling the bluff of other “men like Putin”. You seem to be a lot more certain than I am about what the right thing to do is, so I’m curious what reasoning led to your conclusions.
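(To spell out the “increasing to 100%” arithmetic, under the simplifying assumption that each standoff is an independent 10% draw: the chance the world survives $n$ such standoffs is $0.9^n$, which falls below 50% by $n = 7$ and below 10% by $n = 22$.)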
On the personal level, to me this seems like a potential failure mode worth worrying about in an AGI context, because that’s Impossible Mode for everything, but not in practical human mode, and most definitely not on the margin. I’m not claiming I have the best possible answer here, I’m claiming that what humans are currently doing is some mix of absurdly stupid things, and that the non-FDT proposals that exist seem way worse than the FDT proposals that exist—also that actual human attempts to be FDT-style agents will be incomplete and not lead to degenerate outcomes, the same way mostly-basically-CDT-style human agents often don’t actually do the fully crazy things CDT implies, when it implies fully crazy things.
On the global level, again I’m not saying I know the details of how we should respond, only that we shouldn’t lie down and let him get whatever he wants.
I think there is essentially epsilon chance that Putin would choose nuclear firestorm over a world where Russia doesn’t recreate the USSR / Russian Empire (and even where he dies tomorrow as well) if those were 100% the choices (I do think he might well be willing to risk a non-trivial chance of nuclear war to get it, but that is different), but let’s say that it is 10% (and notice that in those 10%, if he knew for a fact we’d respond with our nukes, he’d just say ‘oh that’s too bad’ and destroy the world in a fit of pique, which very much doesn’t seem right). In the other 90%, he backs down after some combination of escalations (e.g. in some he tries the escalate-to-deescalate single nuke, in others he tries leveling Kyiv, in others he folds tomorrow and leaves), then folds once he sees we won’t fold, but the losses here are ‘acceptable’.
In those 10% of worlds, what are we hoping for? I don’t think there is much of an ‘eventually’ here.
He takes Ukraine, we let him. He sees we let him do what he wants. Everyone else sees too. Every state that can afford one starts a nuclear weapons program, so Putin knows he’s on limited time. Ukraine starts an insurgency and Putin starts killing A LOT of people in response. We do nothing. Moldova is next pretty much right away. That falls in days. He then goes for Kazakhstan. Within a year he has all the non-NATO former USSR republics in hand.
Meanwhile, Xi launches an invasion of Taiwan. Putin makes it clear that if the USA interferes he’ll back China up. We fold. China takes it. TSMC is destroyed. We lose the majority of our chip capacity, either to China or the void. Everyone knows our commitments mean nothing. Every country with a score to settle acts now.
Now Putin takes part of Estonia, and fortifies it. This is, let’s say, December 2022, and Trump-backed Republicans just swept the midterms during an extreme recession. What do we do? Again, clearly, nothing.
NATO is dead. No one believes us at all, anywhere. Putin invades and takes the Baltics. Six months later, he’s in Warsaw and Bucharest, perhaps without firing a shot. North Korea marches south and points its ICBMs.
And then it gets worse.
Or, more likely, at some point in that story we DO confront him, and the nuclear war happens anyway—and there’s a lot of worlds where he didn’t want that, but by folding so often, we let him think he could do it, so he doesn’t know where to stop, and then we get into a nuclear war over Estonia or whatever and someone miscalculates.
And that scenario happens 50%+ of the time rather than 10%, because there’s a ton of worlds where Putin/Xi/etc will take advantage like that, but where they very much would have folded if challenged. The chances of nukes flying in the medium (10 year) term go up, not down.
Even the best-case scenarios I can imagine in such worlds are ones I very much do not like.
(Also, I think that if Putin tried to start a nuclear war without any attacks on Russia itself, there is a >50% (although of course not terribly reassuring) chance that the answer from his own chain of command would be ‘no’, and a substantial chance Russia’s nuclear weapons mostly no longer work and are a bluff, whether or not Putin knows they don’t work—I very much would not launch ours until post-impact in the hopes this was the case, given our second-strike capabilities. And there are worlds in which Putin tries to launch one nuke, and it’s a dud, or his people turn on him, and that’s the end of all of it.)
If we had lived by the rules you’re suggesting in the past, we wouldn’t have gotten this far—we would have folded to the USSR as nation after nation turned red, and at best we’d be in a much poorer, less free world (assuming the rule is nuke-specific, and we still fight WW2).
I’m not claiming I have the best possible answer here, I’m claiming that what humans are currently doing is some mix of absurdly stupid things
What are some examples of these?
I think there is essentially epsilon chance that Putin would choose nuclear firestorm over a world where Russia doesn’t recreate the USSR / Russian Empire (and even where he dies tomorrow as well) if those were 100% the choices (I do think he might well be willing to risk a non-trivial chance of nuclear war to get it, but that is different), but let’s say that it is 10% (and notice that in those 10%, if he knew for a fact we’d respond with our nukes, he’d just say ‘oh that’s too bad’ and destroy the world in a fit of pique, which very much doesn’t seem right).
Yeah, I should have said 10% chance of escalating all the way to destroying the world (if the West doesn’t let Putin have his way), through all causes, not just Putin having those kinds of values.
If we had lived by the rules you’re suggesting in the past, we wouldn’t have gotten this far—we would have folded to the USSR as nation after nation turned red, and at best we’d be in a much poorer, less free world (assuming the rule is nuke-specific, and we still fight WW2).
But if the USSR became the world government, at least we wouldn’t be repeatedly facing a 10% chance of destroying the world. Is that an obvious tradeoff to you?
After reading Solzhenitsyn’s The Gulag Archipelago, I am not really sure which outcome is worse.
I’m not sure why this got voted down, but I don’t disagree. In parts of my comments that you didn’t quote, I indicated my own uncertainty about the tradeoff.
If there were a bunch of Putin-style people in charge of the world, that doesn’t seem like a ‘safe’ world either. It seems like a world where these states engage in continuous brinksmanship that, if this kind of mindset is common, leads to a 10% chance of Armageddon as often or more often than the current one.
We may have very different models of what happens if we let the USSR take over, but yeah I think that world has destroyed most of its value assuming it didn’t go negative. And I don’t think you eliminate the risks—you have a bunch of repressive communist governments everywhere in a world where conditions are getting worse (because communism doesn’t work) and they start fighting over resources slash have nuclear civil wars.
If the model is ‘Putin escalates to nuclear war sometimes and maybe he miscalculates’, then ‘fold to him’ is letting him conquer the world, literally, because no, he wouldn’t stop with Russia’s old borders if we let him get Warsaw and Helsinki. Why would he? Otherwise, folding more makes him escalate until the nukes fly.
And I don’t think you eliminate the risks—you have a bunch of repressive communist governments everywhere in a world where conditions are getting worse (because communism doesn’t work) and they start fighting over resources slash have nuclear civil wars.
I’m assuming that the USSR would not have let other communist governments develop their own nuclear weapons.
It seems like the only world which doesn’t face repeated 10% chances of Armageddon is one in which some state has a nuclear monopoly and enforces it by threatening to attack any other state that tries to develop nuclear weapons (escalating to nuclear attack if necessary). Ideally, this would have been the US, but failing that, maybe the USSR having a nuclear monopoly would be preferable to the current situation.
Also, communist governments don’t always get worse over time, monotonically. Sometimes they get better instead. It’s pretty unclear to me what would have happened to the USSR in the long run, in this alternate world in which they achieved a nuclear monopoly.
Zvi: It’s interesting that your argument above is phrased entirely in the framework of causal decision theory. Might there be a good reason for that?
Looking at the Commitment Races Problem more (although not in full detail), it looks like this is if anything a worse problem for existing systems used in practice (by e.g. Biden), or at a minimum a neutral consideration. It seems more like an “I notice all existing options have this issue” problem than anything else, and like it’s pointing to a flaw in consequentialism more broadly?
it looks like this is if anything a worse problem for existing systems used in practice (by e.g. Biden)
Why do you say this? I’m pretty worried about people adopting any kind of formal decision theory, and then making commitments earlier than they otherwise would, because that’s what the decision theory says is “rational”. If you have a good argument to the contrary, then I’d be less concerned about this.
It seems more like an “I notice all existing options have this issue” problem than anything else, and like it’s pointing to a flaw in consequentialism more broadly?
The additional issue with UDT/FDT is that they extend the Commitment Races Problem into logical time instead of just physical time (a toy sketch of the physical-time version follows below):
physical time: physically throwing away the wheel in a game of chicken before the other player does
logical time: think as little as possible before making a commitment in your mind, because if you think more, you might conclude (via simulation or abstract reasoning) that the other player already made their commitment, so now your own decision has to condition on that commitment (i.e., take it as a given), and by thinking more you also make it harder for the other player to conclude this about yourself
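To make the physical-time race concrete, here is a minimal toy sketch in Python; the Chicken payoffs are my own illustrative numbers, not anything from the Commitment Races post.

```python
# Toy Chicken game: "committing" means visibly locking yourself into Dare,
# e.g. throwing away the wheel. Payoffs are (row player, column player).
PAYOFFS = {
    ("Swerve", "Swerve"): (0, 0),
    ("Swerve", "Dare"):   (-1, 1),
    ("Dare",   "Swerve"): (1, -1),
    ("Dare",   "Dare"):   (-10, -10),
}

def best_response(opponent_action):
    """My payoff-maximizing action, given the opponent's fixed action."""
    return max(["Swerve", "Dare"],
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Once you are visibly committed to Dare, my best response is to Swerve,
# so whoever commits first pockets the +1...
print(best_response("Dare"))      # -> Swerve

# ...but if we both race to commit "first", we lock in the catastrophe.
print(PAYOFFS[("Dare", "Dare")])  # -> (-10, -10)
```

The logical-time version is the same race, run inside each player’s deliberation before any physical move is made.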
BTW you didn’t answer my request for examples of “what humans are currently doing is some mix of absurdly stupid things”. I’m still curious about that.
I think you’re taking the formal adoption of FDT too literally here, or treating it as if it were the AGI case, as if humans were able to self-modify into machines fully capable of honoring commitments and then making arbitrary ones, or something? Whereas actual implementations here are pretty messy, and also they’re embedded in the larger context of the social world.
I also don’t understand the logical time argument here as it applies to humans?
I can see in a situation where you’re starting out in fully symmetrical conditions with known source codes, or something, why you’d need to think super quick and make faster commitments. But I’m confused why that would apply to ordinary humans in ordinary spots?
Or to bring it back to the thing I actually said in more detail, Biden seems like he’s using something close to pure CDT. So someone using commitments can get Biden to do quite a lot, and thus they make lots of crazy commitments.
Whereas in a socially complex multi-polar situation, someone who was visibly making lots of crazy strong commitments super fast or something would face some combination of (1) running into previous commitments made by others to treat such people poorly, (2) being seen as a loose cannon and crazy actor to be put down, and (3) not being seen as credible, because they’re still a human and sufficiently strong/fast/stupid commitments don’t work, etc.
I think the core is—you are worried about people ‘formally adopting a decision theory’ and I think that’s not what actual people ever actually do. As in, you and I both have perhaps informally adopted such policies, but that’s importantly different and does not lead to these problems in these ways. On the margin such movements are simply helpful.
(On your BTW, I literally meant that to refer to the central case of ‘what people do in general when they have non-trivial decisions’: that those without a formal policy don’t do anything coherent, and often change their answers dramatically based on social context or to avoid mild awkwardness, and so on. If I have time I’ll think about what the best examples of this would be, but e.g. I’ve been writing about crazy decisions surrounding Covid for 2+ years now.)
I think you’re taking the formal adoption of FDT too literally here, or treating it as if it were the AGI case, as if humans were able to self-modify into machines fully capable of honoring commitments and then making arbitrary ones, or something?
Actually, my worry is kind of in the opposite direction, namely that we don’t really know how FDT can or should be applied in humans, but someone with a vague understanding of FDT might “adopt FDT” and then use it to handwavingly justify some behavior or policy. For example someone might think, “FDT says that we should think as little as possible before mentally making commitments, so that’s what I’ll do.”
Or take the example of your OP, in which you invoke FDT, but don’t explain in any mathematical detail how FDT implies the conclusions you’re seemingly drawing from it.
Or to bring it back to the thing I actually said in more detail, Biden seems like he’s using something close to pure CDT. So someone using commitments can get Biden to do quite a lot, and thus they make lots of crazy commitments.
Here too, I suspect you may have only a vague understanding of the difference between CDT and FDT. Resisting threats (“crazy commitments”) is often rational even under CDT, if you’re in a repeated game (i.e., being observed by players you may face in the future). I would guess your disagreement with Biden is probably better explained by something else besides FDT vs CDT.
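A minimal sketch of that repeated-game point, with made-up numbers and the strong simplifying assumption that resisting once deters all future threats:

```python
def discounted_cost(per_round_costs, discount=0.9):
    """Sum a stream of per-round costs, discounted geometrically."""
    return sum(c * discount**t for t, c in enumerate(per_round_costs))

rounds = 20
cost_of_conceding = 1.0  # small loss every round you fold to a threat
cost_of_resisting = 5.0  # one large up-front loss from resisting once

# Assumption (mine): a single resistance establishes a reputation that
# deters every later threat, so all subsequent rounds cost nothing.
always_fold = discounted_cost([cost_of_conceding] * rounds)
resist_once = discounted_cost([cost_of_resisting] + [0.0] * (rounds - 1))

print(f"{always_fold:.2f} vs {resist_once:.2f}")  # 8.78 vs 5.00
```

Under those assumptions a purely causal reasoner already prefers to resist; nothing FDT-specific is doing the work.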
ETA: I also get a feeling that you have a biased perspective on the object level. If “someone using commitments can get Biden to do quite a lot”, why couldn’t Putin get Biden to promise not to admit Ukraine into NATO?
I admit to not being super interested in the larger geopolitical context in which this discussion is embedded… but I do want to get into this bit a little more:
think as little as possible before making a commitment in your mind, because if you think more, you might conclude (via simulation or abstract reasoning) that the other player already made their commitment, so now your own decision has to condition on that commitment
It’s not obvious to me why the quoted assertion follows; isn’t the point of “updatelessness” precisely that you ignore / refrain from conditioning your decision on (negative-sum) actions taken by your opponent in a way that would, if your conditioning on those actions was known in advance, predictably incentivize your opponent to take those actions? Isn’t that the whole point of having a decision theory that doesn’t give in to blackmail?
Like, yes, one way to refuse to condition on that kind of thing is to refuse to even compute it, but it seems very odd to me to assert that this is the best way to do things. At the very least, you can compute everything first, and then decide to retroactively ignore all the stuff you “shouldn’t have” computed, right? In terms of behavior this ought not provide any additional incentives to your opponent to take stupid (read: negative-sum) actions, while still providing the rest of the advantages that come with “thinking things through”… right?
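A toy version of this, with illustrative numbers of my own; the deterrence comes from which whole policy you evaluate and commit to ex ante, not from refusing to compute anything:

```python
# Payoffs as (me, blackmailer).
PAYOFFS = {
    ("threat", "pay"):    (-1, 2),
    ("threat", "refuse"): (-3, -3),
    "no_threat":          (0, 0),
}

# Updateful: conditioning on an observed threat, paying (-1) beats
# refusing (-3), so a blackmailer who anticipates this will threaten.
updateful_response = max(["pay", "refuse"],
                         key=lambda r: PAYOFFS[("threat", r)][0])

# Updateless: rank whole policies before observing anything. "Refuse if
# threatened" makes threatening worth -3 < 0 to the blackmailer, so under
# that policy no threat is made and I keep 0 instead of losing 1.
def my_expected_payoff(policy):
    blackmailer_gain = PAYOFFS[("threat", policy)][1]
    threatens = blackmailer_gain > PAYOFFS["no_threat"][1]
    return PAYOFFS[("threat", policy)][0] if threatens else PAYOFFS["no_threat"][0]

best_policy = max(["pay", "refuse"], key=my_expected_payoff)
print(updateful_response, best_policy)  # pay refuse
```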
and by thinking more you also make it harder for the other player to conclude this about yourself
This part is more compelling in my view, but also it kind of seems… outside of decision theory’s wheelhouse? Like, yes, once you start introducing computational constraints and other real-world weirdness, things can and do start getting messy… but also, the messiness that results isn’t a reason to abandon the underlying decision theory?
For example, I could say “Imagine a crazy person really, really wants to kill you, and the reason they want to do this is that their brain is in some sense bugged; what does your decision theory say you should do in this situation?” And the answer is that your decision theory doesn’t say anything (well, anything except “this opponent is behaviorally identical to a DefectBot, so defect against them with all you have”), but that isn’t the decision theory’s fault, it’s just that you gave it an unfair scenario to start with.
What, if anything, am I missing here?
It’s not obvious to me why the quoted assertion follows; isn’t the point of “updatelessness” precisely that you ignore / refrain from conditioning your decision on (negative-sum) actions taken by your opponent in a way that would, if your conditioning on those actions was known in advance, predictably incentivize your opponent to take those actions? Isn’t that the whole point of having a decision theory that doesn’t give in to blackmail?
By “has to” I didn’t mean that’s normatively the right thing to do, but rather that’s what UDT (as currently formulated) says to do. UDT is (currently) updateless with regard to physical observations (inputs from your sensors) but not logical observations (things that you compute in your mind), and nobody seems to know how to formulate a decision theory that is logically updateless (and not broken in other ways). It seems to be a hard problem, as progress has been bogged down for more than 10 years.
Conceptual Problems with UDT and Policy Selection is probably the best article to read to get up to date on this issue, if you want a longer answer.
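For concreteness, the policy-selection formulation discussed there can be stated roughly as $\pi^* = \arg\max_\pi \sum_w P(w)\,U(\pi, w)$, where the policy $\pi$ maps input histories to actions and the prior $P$ over worlds $w$ is never updated on sensory inputs; but any logical facts the agent has already computed while evaluating this argmax are implicitly taken as given, which is exactly the missing logical updatelessness described above.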