Because we have a “basic counterfactual” proposition for what would happen if we 1-box and what would happen if we 2-box, and both of those propositions stick around, LCH’s bets about what happens in either case both matter. This is unlike conditional bets, where if we 1-box, then bets conditional on 2-boxing disappear, refunded, as if they were never made in the first place.
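To spell out the contrast: a conditional bet on $B$ given $A$ at price $q$ (the standard de Finetti setup, not notation from either post) pays

$$\text{payout} = \begin{cases} 1 & \text{if } A \wedge B\\ 0 & \text{if } A \wedge \lnot B\\ q & \text{if } \lnot A \text{ (stake refunded)} \end{cases}$$

so if the agent 1-boxes, every bet conditional on 2-boxing lands in the $\lnot A$ case and nets to zero, while a bet on a basic counterfactual proposition still resolves one way or the other.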
I don’t understand this part. Your explanation of PCDT at least didn’t prepare me for it; it doesn’t mention betting. And why is the payoff for the counterfactual-2-boxing determined by the beliefs of the agent after 1-boxing?
And a thought that is mostly independent of that confusion: I don’t think things are as settled.
I’m more worried about the embedding problems with the trader in Dutch book arguments, so the one against CDT isn’t as decisive for me.
In the Troll Bridge hypothetical, we prove that [cross]->[U=-10]. This will make the conditional expectations poor. But this doesn’t have to change the counterfactuals.
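For reference, the proof in question is the usual Löbian one. A rough sketch, assuming the standard setup (the agent crosses iff it proves crossing yields +10, and the troll blows up the bridge iff the agent crosses while PA is inconsistent):

1. Reason within PA: suppose $\Box(\mathrm{cross} \to U{=}{-}10)$, and suppose the agent crosses.
2. By the agent’s code, it only crosses if PA proves $\mathrm{cross} \to U{=}{+}10$; PA can also verify that finding this proof makes the agent cross, so PA proves $\mathrm{cross}$, hence both $U{=}{+}10$ and $U{=}{-}10$, so PA is inconsistent.
3. Then the troll blows up the bridge, so $U = -10$.
4. This establishes $\Box(\mathrm{cross} \to U{=}{-}10) \to (\mathrm{cross} \to U{=}{-}10)$ within PA, so Löb’s theorem gives a PA proof of $\mathrm{cross} \to U{=}{-}10$, and the agent accordingly doesn’t cross.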
But how is the counterfactual reasoning actually supposed to work? I don’t think just having the agent unrevisably believe that crossing is counterfactually +10 is a reasonable answer, even if it doesn’t have any instrumental problems in this case. I think it ought to be possible to get something like “whether to cross in Troll Bridge depends only on what you otherwise think about PA’s consistency” with some logical method. But even short of that, there needs to be some method to adjust your counterfactuals if they fail to really match your conditionals. And if we had an actual concrete model of counterfactual reasoning instead of a list of desiderata, it might be possible to make a troll based on the consistency of whatever is inside this model, as opposed to PA.
I also think there is a good chance the answer to the Cartesian boundary problem won’t be “here’s how to calculate where your boundary is”, but something else of which boundaries are an approximation, and then something similar would go for counterfactuals, and then there won’t be a counterfactual theory which respects embedding.
These latter two considerations suggest the leftover work isn’t just formalisation.
are the two players physically precisely the same (including environment), at least insofar as the players can tell?
In the examples I gave, yes, because that’s the case where we have a guarantee of equal policy, from which people try to generalize. If we say players can see their number, then the twins in the prisoner’s dilemma needn’t play the same way either.
But this is one reason why correlated equilibria are, usually, a better abstraction than Nash equilibria.
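(For concreteness, the textbook illustration, Aumann’s Chicken example rather than anything from this thread: with payoffs

$$\begin{array}{c|cc} & C & D\\ \hline C & 6,6 & 2,7\\ D & 7,2 & 0,0 \end{array}$$

a mediator draws one of $(C,C)$, $(C,D)$, $(D,C)$ uniformly and privately tells each player only their own recommended move. Following the recommendation is optimal either way, and the expected payoff of $5$ each beats the $14/3$ of the symmetric mixed Nash equilibrium.)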
The “signals” players receive for correlated equilibria are already semantic. So I’m suspicious that they only seem better because they call on our intuition more when used, with the risks that implies. For example, I remember reading about a result to the effect that correlated equilibria are easier to learn. This is not something we would expect from your explanation of the differences: if we explicitly added something (like the signals) into the game, it would generally get more complicated.
Hmm, then I’m not sure I understand in what way classical game theory is neater here?
Changing the labels doesn’t make a difference classically.
As long as the probabilistic coin flips are independent on both sides.
Do you have examples of problems with copies that I could look at and that you think would be useful to study?
No. I think you should take the problems of distributed computing and translate them into decision problems, which you then have a solution to.
Well, if I understand the post correctly, you’re saying that these two problems are fundamentally the same problem
No. I think:
...the reasoning presented is correct in both cases, and the lesson here is for our expectations of rationality...
As outlined in the last paragraph of the post. I want to convince people that TDT-like decision theories won’t give a “neat” game theory, by giving an example where they’re even less neat than classical game theory.
Actually it could.
I think you’re thinking about a realistic case (same algorithm, similar environment) rather than the perfect symmetry used in the argument. A communication channel is of no use there: if you had one, you could just ask yourself what you would send, and then you know you would have gotten exactly that message from the copy as well.
I can use my knowledge of distributed computing to look at the sort of decision problems where you play with copies
I’d be interested. I think even just more solved examples of the reasoning we want would be useful at this point.
The link would have been to better illustrate how the proposed system works, not about motivation. So it seems that you understood the proposal and wouldn’t have needed it.
I don’t exactly want to learn the Cartesian boundary. A Cartesian agent believes that its input set fully screens off any other influence on its thinking, and that the outputs screen off any influence of the thinking on the world. It’s very hard to find things that actually fulfill this. I explain how PDT can learn Cartesian boundaries, if there are any, as a sanity/conservative-extension check. But it can also learn that it controls copies or predictions of itself, for example.
The apparent difference is based on the incoherent counterfactual “what if I say heads and my copy says tails”
I don’t need counterfactuals like that to describe the game, only implications. If you say heads and your copy says tails, you will get one util, just like how if 1+1=3, the circle can be squared.
The interesting thing here is that superrationality breaks up an equivalence class relative to classical game theory, and people’s intuitions don’t seem to have incorporated this.
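(The vacuity is mechanical; a minimal Lean sketch, where `CircleSquarable` is just a placeholder proposition, not anything from the game:)

```lean
-- From a false antecedent, any consequent follows: the implication
-- "if 1 + 1 = 3 then the circle can be squared" is vacuously provable.
example (CircleSquarable : Prop) (h : 1 + 1 = 3) : CircleSquarable :=
  absurd h (by decide)
```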
“The same” in what sense? Are you saying that what I described in the context of game theory is not surprising, or outlining a way to explain it in retrospect?
Communication won’t make a difference if you’re playing with a copy.
What is and isn’t an isomorphism depends on what you want to be preserved under isomorphism. If you want everything that’s game-theoretically relevant to be preserved, then of course those games won’t turn out equivalent. But that doesn’t explain anything. If my argument had been that the correct action in the prisoner’s dilemma depends on sunspot activity, you could have written your comment just as well.
...the one outlined in the post?
Right, but then, are all other variables unchanged? Or are they influenced somehow? The obvious proposal is EDT—assume influence goes with correlation.
I’m not sure why you think there would be a decision theory in there as well. Obviously, when BDT decides its output, it will have some theory about how its output nodes propagate. But the hypothesis as a whole doesn’t think about influence. It’s just a total probability distribution, and it includes that some things inside it are distributed according to BDT. It doesn’t have beliefs about “if the output of BDT were different”. If BDT implements a mixed strategy, the hypothesis will have beliefs about what each option being enacted correlates with, but I don’t see a problem if this doesn’t track “real influence” (indeed, in the situations where this stuff is relevant, it almost certainly won’t): it’s not used in that role.
Adding other hypotheses doesn’t fix the problem. For every hypothesis you can think of, there’s a version of it with “but I survive for sure” tacked on. This version can never lose evidence relative to the base hypothesis, but it can gain evidence anthropically. Eventually, these will get you. Yes, there are all sorts of considerations that are more relevant in a realistic scenario; that’s not the point.
The problem, as I understand it, is that there seem to be magical hypotheses you can’t update against from ordinary observation, because by construction the only time they make a difference is in your odds of survival. So observation can’t touch them, anthropics can only update in their favour, and eventually you end up believing one, and then you die.
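To make “eventually these will get you” concrete, here is a minimal numeric sketch (the prior odds and survival probability are made-up illustration numbers): every survived dangerous round shifts the odds toward the tacked-on version by a factor of $1/p$, and nothing else ever moves them back.

```python
# Minimal sketch: posterior odds of H' (= H with "but I survive for sure"
# tacked on) against the base hypothesis H, updating only on survival.
# Each survived dangerous round multiplies the odds by
# P(survive | H') / P(survive | H) = 1 / p; no ordinary observation
# distinguishes the two hypotheses, so the odds only ever grow.
prior_odds = 1e-9   # H' starts heavily disfavoured (made-up number)
p_survive = 0.5     # per-round survival probability under H (made-up number)

odds = prior_odds
rounds = 0
while odds <= 1.0:
    odds /= p_survive
    rounds += 1
print(f"H' overtakes H after {rounds} survived rounds (odds = {odds:.3g})")
```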
Maybe the disagreement is about what we take the alternative hypotheses to be? I’m not imagining a broken gun: you could examine your gun and notice it isn’t broken, or just shoot into the air a few times and see it fire. But even after you eliminate all of those, there’s still the hypothesis “I’m special for no discernible reason” (or is there?) that can only be tested anthropically, if at all. And this seems worrying.
Maybe here’s a stronger way to formulate it: consider all the copies of yourself across the multiverse. They will sometimes face situations where they could die. And they will always remember having survived all the previous ones. So eventually, all the ones still alive will believe they’re protected by fate or something, and then do something suicidal. Now you can bring the same argument about how there are a few actual immortals, but still… “A rational agent that survives long enough will kill itself unless it’s literally impossible for it to do so” doesn’t inspire confidence, does it? And it happens even in very “easy” worlds. There is no world where you have a limited chance of dying before you “learn the ropes” and are safe: it’s impossible to have a chance of eventual death other than 0 or 1, without the laws of nature changing over time.
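That last claim can be made precise; a sketch, assuming independent rounds with per-round death probability $p_n < 1$:

$$P(\text{survive forever}) = \prod_{n=1}^{\infty} (1 - p_n), \qquad \prod_{n=1}^{\infty}(1-p_n) > 0 \iff \sum_{n=1}^{\infty} p_n < \infty,$$

so a constant hazard $p_n = p > 0$ gives eventual death with probability 1, and any probability strictly between 0 and 1 requires hazards that shrink over time.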
just have a probabilistic model of the world and then condition on the existence of yourself (with all your memories, faculties, etc.).
I interpret that as conditioning on the existence of at least one thing with the “inner” properties of yourself.
To clarify, do you think I was wrong to say UDT would play the game? I’ve read the two posts you linked. I think I understand Wei’s, and I think the UDT described there would play. I don’t quite understand yours.
Another problem with this is that it isn’t clear how to form the hypothesis “I have control over X”.
You don’t. I sometimes use talk about control to describe what the agent is doing from the outside, but the hypotheses it believes all have a form like “The variables such-and-such will be as if they were set by BDT given such-and-such inputs”.
One problem with this is that it doesn’t actually rank hypotheses by which is best (in expected utility terms), just how much control is implied.
For the first setup, where it’s trying to learn what it has control over, that’s true. But you can use any ordering of hypotheses for the descent, so we can just take “how good that world is” as our ordering. This is very fragile, of course. If there are uncountably many great but unachievable worlds, we fail, and in any case we are paying for all this with performance on “ordinary learning”. If this were running in a non-episodic environment, we would have to find a balance between having the probability of hypotheses decline according to goodness, and avoiding the “optimistic Humean troll” hypothesis by considering complexity as well. It really seems like I ought to take “the active ingredient” of this method out, if I knew how.
From my perspective, Radical Probabilism is a gateway drug.
This post seemed to be praising the virtue of returning to the lower-assumption state. So I argued that in the example given, it took more than knocking out assumptions to get the benefit.
So, while I agree, I really don’t think it’s cruxy.
It wasn’t meant to be. I agree that logical inductors seem to de facto implement a Virtuous Epistemic Process, with attendant properties, whether or not they understand that. I just tend to bring up any interesting-seeming thoughts that get triggered during conversation, and I could perhaps do better at indicating that. Whether it’s fine to set it aside provisionally depends on where you want to go from here.
Either way, we’ve made assumptions which tell us which Dutch Books are valid. We can then check what follows.
Ok. I suppose my point could then be made as “#2-type approaches aren’t very useful, because they assume something that’s no easier than what they provide”.
I think this understates the importance of the Dutch-book idea to the actual construction of the logical induction algorithm.
Well, you certainly know more about that than me. Where did the criterion come from in your view?
This part seems entirely addressed by logical induction, to me.
Quite possibly. I wanted to separate what work is done by radicalizing probabilism in general versus logical induction specifically. That said, I’m not sure logical inductors properly have beliefs about their own future beliefs (in the de dicto sense). A logical inductor doesn’t know “its” source code (though it knows that such code is a possible program), or even that it is being run, with the full intuitive meaning of that, so it has no way of doing that. Rather, it would at some point think about the source code that we know is its own, and come to believe that that program gives reliable results—but only in the same way in which it comes to trust other logical inductors. It seems like a version of this in the logical setting.
By “knowing where they are”, I mean strategies that avoid getting Dutch-booked without doing anything that looks like “looking for Dutch books against me”. One example of that would be The Process That Believes Everything Is Independent And Therefore Never Updates, but that’s a trivial stupidity.
One of the most important things I learned, being very into nutrition research, is that most people can’t recognize malnutrition when they see it, and so there’s a widespread narrative that it doesn’t exist. But if you actually know what you’re looking for, and you walk down an urban downtown and look at the beggars, you will see the damage it has wrought… and it is extensive.
Can someone recommend a way of learning to recognize this without having to spend effort on nutrition in general?
I think giving reasons made this post less effective. Reasons make naive!rationalist more likely to yield on this particular topic, but that’s no longer a live concern, and it probably inhibits learning the general lesson.