Note that Bayesian updating is not done explicitly in this decision theory. When the decision algorithm receives input X, it may determine that a subset of programs it has preferences about never calls it with X
That’s odd, I remember reading through the whole post, but my eyes must have skipped that part. Probably lack of sleep.
I was recently talking over a notion similar but not identical to this with Nick Bostrom. It shares with this idea the property of completely ruling out all epistemic anthropic reasoning even to the extent of concluding that you’re probably not a Boltzmann brain. I may post on it now that you’ve let the cat loose on “decide for all correlated copies of yourself”.
The four main things to be verified are (a) whether this works with reasoning about impossible possible worlds, say if the coinflip is a digit of pi, (b) that the obvious way of extending it to probabilistic hypotheses (namely separating the causal mechanism into deterministic and uncorrelated probabilistic parts à la Pearl) actually works, (c) that there are no even more startling consequences not yet observed, and (d) that you can actually formally say when and how to make a decision that correlates to a copy of yourself in a world that a classical Bayesian would call “ruled out” (with the obvious idea being to assume similarity only with possible computations that have received the same inputs you do, and then being similar in your own branch to the computation depended on by Omega in the Counterfactual Mugging; I have to think about this further and maybe write it out formally to check if it works, though).
Further reflecting, it looks to me like there may be an argument which forces Wei Dai’s “updateless” decision theory, very much akin to the argument that I originally used to pin down my timeless decision theory—if you expect to face Counterfactual Muggings, this is the reflectively consistent behavior; a simple-seeming algorithm has been presented which generates it, so unless an even simpler algorithm can be found, we may have to accept it.
The face-value interpretation of this algorithm is a huge bullet to bite even by my standards—it amounts to (depending on your viewpoint) accepting the Self-Indication Assumption or rejecting anthropic reasoning entirely. If a coin is flipped, and on tails you will see a red room, and on heads a googolplex copies of you will be created in green rooms and one copy in a red room, and you wake up and find yourself in a red room, you would assign (behave as if you assigned) 50% posterior probability that the coin had come up tails. In fact it’s not yet clear to me how to interpret the behavior of this algorithm in any epistemic terms.
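One way to see where the 50% figure comes from is a minimal sketch in which the agent picks a policy for the input “red” and is scored by prior-weighted payoff summed over every copy that receives that input. The summing of payoffs over copies, the bet itself, and all names below are assumptions made for illustration; `N_GREEN` is only a stand-in for the googolplex.

```python
from fractions import Fraction

N_GREEN = 10**6  # stand-in for the googolplex of green-room copies (assumption)

# world -> (prior probability, copies that see red, copies that see green)
WORLDS = {
    "tails": (Fraction(1, 2), 1, 0),
    "heads": (Fraction(1, 2), 1, N_GREEN),
}


def policy_value(win_if_tails, lose_if_heads):
    """Prior-weighted total payoff of 'accept the bet whenever the input is red'.

    The hypothetical bet pays `win_if_tails` if the coin was tails and costs
    `lose_if_heads` if it was heads. Green-room copies never receive the
    input, so the policy does not touch them.
    """
    total = Fraction(0)
    for world, (prior, red_copies, _green_copies) in WORLDS.items():
        per_copy = win_if_tails if world == "tails" else -lose_if_heads
        total += prior * red_copies * per_copy
    return total


def implied_weight(world):
    """Weight the red-room policy effectively puts on `world`."""
    weights = {w: prior * red for w, (prior, red, _g) in WORLDS.items()}
    return weights[world] / sum(weights.values())
```

Under these assumptions the number of green-room copies never enters the calculation, so the policy breaks even at even odds, which is the sense of “behave as if you assigned 50%”. A different way of aggregating utility over copies could give a different answer.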
To give credit where it’s due, I’d only been talking with Nick Bostrom about this dilemma arising from altruistic timeless decision theorists caring about copies of themselves; the idea of applying the same line of reasoning to all probability updates including over impossible worlds, and using this to solve Drescher’s(?) Counterfactual Mugging, had not occurred to me at all.
Wei Dai, you may have solved one of the open problems I named, with consequences that currently seem highly startling. Congratulations again.
Credit for the no-update solution to Counterfactual Mugging really belongs to Nesov, and he came up with the problem in the first place as well, not Drescher. (Unless you can find a mention of it in Drescher’s book, I’m going to assume you misremembered.)
I will take credit for understanding what he was talking about and reformulating the solution in a way that’s easier to understand. :)
Nesov, you might want to reconsider your writing style, or something… maybe put your ideas into longer posts instead of scattered comments and try to leave smaller inferential gaps. You obviously have really good ideas, but often a person almost has to have the same idea already before they can understand you.
My book discusses a similar scenario: the dual-simulation version of Newcomb’s Problem (section 6.3), in the case where the large box is empty (no $1M) and (I argue) it’s still rational to forfeit the $1K. Nesov’s version nicely streamlines the scenario.
Just to elaborate a bit, Nesov’s scenario and mine share the following features:
In both cases, we argue that an agent should forfeit a smaller sum for the sake of a larger reward that would have been obtained (counterfactually contingently on that forfeiture) if a random event had turned out differently than in fact it did (and than the agent knows it did).
We both argue for using the original coin-flip probability distribution (i.e., not-updating, if I’ve understood that idea correctly) for purposes of this decision, and indeed in general, even in mundane scenarios.
We both note that the forfeiture decision is easier to justify if the coin-toss was quantum under MWI, because then the original probability distribution corresponds to a real physical distribution of amplitude in configuration-space.
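The shared argument can be written as a short calculation. The notation is assumed here and does not appear in the thread: $p$ is the original probability of the favorable outcome, $R$ the larger reward, and $c$ the smaller sum forfeited.

```latex
% Evaluated with the original (non-updated) distribution:
\mathrm{EU}(\text{forfeit}) = p\,R - (1-p)\,c, \qquad \mathrm{EU}(\text{keep}) = 0,
\quad\text{so forfeiting is preferred iff}\quad p\,R > (1-p)\,c .

% Evaluated after updating on the unfavorable outcome:
\mathrm{EU}(\text{forfeit} \mid \text{unfavorable}) = -c \;<\; 0 = \mathrm{EU}(\text{keep} \mid \text{unfavorable}) .
```

The two evaluations disagree exactly when $p\,R > (1-p)\,c$, which is the case both scenarios are built to exhibit.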
Nesov’s scenario improves on mine in several ways. He eliminates some unnecessary complications (he uses one simulation instead of two, and just tells the agent what the coin-toss was, whereas my scenario requires the agent to deduce that). So he makes the point more clearly, succinctly and dramatically. Even more importantly, his analysis (along with that of Yudkowsky, Dai, and others here) is more formal than my ad hoc argument (if you’ve looked at Good and Real, you can tell that formalism is not my forte. :)).
I too have been striving for a more formal foundation, but it’s been elusive. So I’m quite pleased and encouraged to find a community here that’s making good progress focusing on a similar set of problems from a compatible vantage point.
And I think I speak for everyone when I say we’re glad you’ve started posting here! Your book was suggested as required rationalist reading. It certainly opened my eyes, and I was planning to write a review and summary so people could more quickly understand its insights.
(And not to be a suck-up, but I was actually at a group meeting the other day where the ice-breaker question was, “If you could spend a day with any living person, who would it be?” I said Gary Drescher. Sadly, no one had heard the name.)
I won’t be able to contribute much to these discussions for a while, unfortunately. I don’t have a firm enough grasp of Pearlian causality and need to read up more on that and on Newcomb-like problems (I’m halfway through your book’s handling of them).
I think you’d find me anticlimactic. :) But I do appreciate the kind words.
Being in a transitionary period from sputtering nonsense to thinking in math, I don’t feel right to write anything up (publicly) until I understand it well enough. But I can’t help making occasional comments. Well, maybe that’s a wrong mode as well.
I guess there’s a tradeoff between writing too early, wasting your and other people’s time, and writing too late and wasting opportunities to clear other people’s confusion earlier and have them work in the same direction.
And on the same note: was my comment about state networks understandable? What do you think about that? I’d appreciate it if people who have sufficient background to understand a given comment in principle, but who cannot because the explanation is unclear or incomplete, spoke up about that fact.
Another point that may help: if you’re presenting a complex idea, you need to provide some motivation for the reader to try to understand it. In your mind, that idea is linked to many others and forms a somewhat coherent whole. But if you just describe the idea in isolation as math, either in equations or in words, the reader has no idea why they should try to understand it, except that you think it might be important for them to understand it. Perhaps because you’re so good at thinking in math, you seriously underestimate the amount of effort involved when others try it.
I think that’s the main reason to write in longer form. If you try to describe ideas individually, you have to either waste a lot of time motivating each one separately and explaining how it fits in with other ideas, or risk having nobody trying seriously to understand you. If you describe the system as a whole, you can skip a lot of that and achieve an economy of scale.
Yeah, and math is very helpful as an explanation tool, because people can reconstruct the abstract concepts written in formulas correctly on the first try, even if math seems unnecessary for a particular point. Illusion of transparency of informal explanation, which is even worse where you know that formal explanation can’t fail.
I didn’t understand it on my first try. I’ll have another go at it later and let you know.
Hmm… I’ve been talking about the no-updating approach to decision-making for months, and Counterfactual Mugging was constructed specifically to show where it applies well, in a way that sounds on the surface opposite to “play to win”.
The idea itself doesn’t seem like anything new, just a way of applying standard expectation maximization, not to individual decisions, but to a choice of strategy as a whole, or of the agent’s source code.
From the point of view of the agent, everything it can ever come to know results from computations it runs with its own source code, which take into account interaction with the environment. If the choice of strategy doesn’t depend on particular observations, on context-specific knowledge about the environment, then the only uncertainty that remains is the uncertainty about what the agent itself is going to do (compute) according to the selected strategy. In simple situations, uncertainty disappears altogether. In more real-world situations, uncertainty results from there being a huge number of possible contexts in which the agent could operate, so that when the agent has to calculate its action in each such context, it can’t know for sure what it’s going to calculate in other contexts, while that information is required for the expected utility calculation. That’s logical uncertainty.
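A minimal sketch of maximizing over whole strategies rather than individual decisions might look like the following. The two contexts, the two actions, the prior, and the payoff table are all hypothetical; the only point is that the payoff in one context can depend on what the strategy does in another.

```python
from itertools import product

CONTEXTS = ("x", "y")          # hypothetical observation contexts
ACTIONS = ("a", "b")           # hypothetical actions
PRIOR = {"x": 0.5, "y": 0.5}   # fixed in advance, never updated


def utility(context, strategy):
    """Payoff in `context`; note that it may read the action chosen elsewhere."""
    if context == "x":
        bonus = 5 if strategy["y"] == "b" else 0   # depends on the *other* context
        return (2 if strategy["x"] == "a" else 0) + bonus
    return -1 if strategy["y"] == "b" else 0       # "b" is locally costly in y


def all_strategies():
    """Every map from contexts to actions."""
    for choice in product(ACTIONS, repeat=len(CONTEXTS)):
        yield dict(zip(CONTEXTS, choice))


def expected_utility(strategy):
    return sum(PRIOR[c] * utility(c, strategy) for c in CONTEXTS)


best = max(all_strategies(), key=expected_utility)

# For contrast: the action that looks best in context "y" taken in isolation.
locally_best_in_y = max(ACTIONS, key=lambda act: utility("y", {"x": "a", "y": act}))
```

With this table the strategy-level maximum takes the locally costly action in `y`, while the context-by-context choice does not. In this toy the enumeration is exhaustive, so no logical uncertainty remains; with a huge number of contexts that enumeration is exactly what becomes infeasible.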
Re: The idea itself doesn’t seem like anything new [...]
That was my overwhelming impression.
Wei Dai’s theory does seem to imply this, and the conclusions don’t startle me much, but I’d really like a longer post with a clearer explanation.
That reminds me, I actually had a similar idea back in 2001, and posted it on everything-list. I recall thinking at the time something like “This is a really alien way of reasoning and making decisions, and probably nobody will be able to practice it even if it works.”
Notice that which instances of the agent (making the choice) are possible in general depends on what choice it makes.
Consider what is accessible if you trace the history of the agent along counterfactuals. Let’s say that time is discrete, and at each moment the agent is in a certain state. Going forwards in time, you include both options for the agent’s state after receiving a binary observation from the environment, and conversely, going backwards, you include both options for the agent’s state before each option for a binary action that the agent could make to arrive at the current state (action and observation are dual under time-reversal in a reversible deterministic world dynamic). Iterating these operations, you construct a “state network” of accessible agent states. (You include the states arrived at by “zig-zag” as well: first a step to the past, then a step to the future along an observation other than the one that led to the original state from which the tracing began, which brings you to a counterfactual state in the usual sense. But these time-forward and time-backward steps can be repeated an infinite number of times.)
Now, the set of all possible states of the agent becomes divided into equivalence classes of states belonging to the same state networks. If the agent belongs to one of the state networks, it couldn’t be in any other state network (in the generalized sense of “couldn’t”). But which states belong to which network depends on the agent’s algorithm. In fact, the choice of the algorithm is equivalent to the choice of networks that cover the state set. I’m not really sure what to do with this construction, and whether the structure of the networks other than the network that contains the current state should matter. From the principle that observations shouldn’t influence the choice of strategy, the other state networks should matter just as well, but then again they are not even counterfactual...
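The partition into state networks can be sketched as connected components of a graph. This is a simplification under stated assumptions: the agent’s algorithm is modelled as a hypothetical function `step(state, observation)`, and a backward step is treated simply as following a forward edge in reverse, which collapses the action/observation distinction drawn above. The state set and the two example algorithms are invented for illustration.

```python
from collections import defaultdict


def state_networks(states, step, observations=(0, 1)):
    """Partition `states` into classes reachable by forward and backward steps."""
    neighbours = defaultdict(set)
    for s in states:
        for o in observations:
            t = step(s, o)
            neighbours[s].add(t)   # forward step along an observation
            neighbours[t].add(s)   # the same edge followed backwards in time

    networks, seen = [], set()
    for start in states:
        if start in seen:
            continue
        # Flood-fill: repeat forward/backward steps until nothing new appears.
        component, frontier = {start}, [start]
        while frontier:
            s = frontier.pop()
            for t in neighbours[s]:
                if t not in component:
                    component.add(t)
                    frontier.append(t)
        seen |= component
        networks.append(frozenset(component))
    return networks


STATES = range(8)


def algo_a(s, o):
    """Hypothetical algorithm: shifts the observation into the state."""
    return (2 * s + o) % 8


def algo_b(s, o):
    """Hypothetical algorithm: moves by 2 or 4, so parity never changes."""
    return (s + 2 * (o + 1)) % 8
```

Running `state_networks(STATES, algo_a)` and `state_networks(STATES, algo_b)` on the same state set gives different partitions, which illustrates the remark that the choice of algorithm fixes which networks cover the states.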
Action and observation are not “intuitively” dual; at first thought they seem invariant under time reversal. Action is a state-transition of the environment, and observation is a state-transition of the agent. I can see how the duality can be suggested by viewing action as a move of the agent-player and observation as a move of the environment-player. But here the duality is that a node which in one direction was a move by A (associated with arrows to the right) is, in the other direction, a move by E (associated with arrows to the left).
Ok, I understood this on my second reading, but I don’t know what to make of it either. Why did you decide to think about agents like this, or did the idea just pop into your head and you wanted to see if it has any applications?
It’s more or less a direct rendition of the idea of UDT: actions (with state transitions) depend on state of knowledge, so what does it say about the geometry of state transitions?
More relevant to the recent discussion: where does logical dependence come from, and how can it be tracked in a sufficiently detailed representation? The source of logical dependence, besides what comes from the common algorithm, is actions and observations. In forward-time, all states following a given observation become dependent on that observation, and in backward-time, states preceding an action become dependent on that action. A single observation can make multiple actions depend on it, and thus make them dependent on each other.
Connection with logic: states of knowledge in the state network are programs/proofs, and actions/observations are variables parameterizing more general programs that resolve into specific states of knowledge given these actions/observations. Also related to game semantics. This is one dimension along which to compress the knowledge representation and seek further understanding.