[edit: Thanks to Jessica for pointing out that the literal thing I suggested here does not work unless one of the nodes (my decision) is deterministic. After discussing the CTC-like phenomena below we summarized it as finding timeloops to then remove them by being updateless in the right places]
Philosophically, this is kind of weird. It is hard to imagine what would happen if we were trying to discover the causal structure of the universe, and we discover that there is some hidden variable that always turns out to be a certain way, in spite of appearing to be causally downstream from the other variables.
This actually feels very motivated to me: I think about the new epiphenomenal variables as logical facts. It feels intuitive to think about logical facts as being caused by all their concrete instances out in the world rather than as causing the concrete instances. The operation of post-selection then functions as a post-facto condition on logic being consistent (or computations being the same way everywhere). This feels very close to how humans intuit these situations which I think is a good sign.
On the other hand, maybe the effect is to be like CDT when you are the only copy of yourself, but then becomes like EDT when other copies are involved, thus allowing it to be like an EDT that smokes in the smoking legion problem.
Smoking legion is an apt name, as the behavior, counterintuitively, depends on the number of copies you consider. Still, I don’t think it’s an illuminating problem for computational decision theories. Problems akin to Agent Simulates Predictor (I looked at Transparent Newcomb’s Problem and Parfit’s Hitchhiker) feel clearer when reasoning about algorithms, and there postCDT fails to be optimal, but approaches the right behavior as you add more logically entangled copies.
These problems seem to come from what’s almost a directed cycle in the causal structure: if you consider yourself as causing logical facts but everyone else to be influenced by them you get DAGs in Newcomb’s Problem and Prisoner’s Dilemma, but directed cycles in the above problems that postCDT struggles with. I think this is actually fundamental to how we should think about the problems: not too different from time-travel (intelligence/prediction as time-travel is an interesting metaphor). Being updateless about some inputs looks a lot like finding the most desirable CTC.
Concretely, I propose that we determine the set of parents P that become pseudo-descendants (according to the above point of view) and then, instead of conditioning on the value of each x∈P and choosing an action from the set A, we choose a look-up table from A^I (where I = ∏_{x∈P} X_x is the space of possible inputs, X_x being the set of values x can take; we still condition normally on the parents not in P). Now when we sample we treat all the nodes in this pseudo-directed cycle as undetermined, other than our decision, which is given by the look-up table. This makes us updateless, but only about a bounded, decision-relevant set of things. I’m not sure to what extent this looks like something different from UDT to you, but it strikes me as more modularizable.
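To make the proposal concrete, here is a minimal sketch of choosing a look-up table from A^I rather than a single conditioned action. Everything here is an assumption for illustration: `sample_world` stands in for sampling the causal model with the loop nodes undetermined, and `utility` scores an outcome; neither is specified in the text.

```python
import itertools

def best_policy(actions, input_space, sample_world, utility, n_samples=1000):
    """Enumerate every lookup table from inputs to actions (the set A^I)
    and return the one with the highest estimated expected utility.
    sample_world(policy) samples the causal model with our decision node
    replaced by the fixed policy; both it and utility are hypothetical."""
    policies = [dict(zip(input_space, choice))
                for choice in itertools.product(actions, repeat=len(input_space))]

    def score(policy):
        total = 0.0
        for _ in range(n_samples):
            # The loop nodes (including what we will observe) are resolved
            # inside sample_world; the policy answers for every possible input.
            total += utility(sample_world(policy))
        return total / n_samples

    return max(policies, key=score)
```

Because the policy is chosen before any x∈P is observed, a predictor node that depends on "what I would do upon seeing input x" can be evaluated consistently, which is exactly the updatelessness the paragraph above asks for.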
In terms of implementation I see two options that both feel unsatisfactory at my current level of understanding:
Firstly, if we’re only doing the standard postCDT (without the loop-based fix), we can sample the causal structure using some kind of Monte Carlo method, creating a sizable table of the outcomes of all the nodes; we then use a GI (with suitable background information) to estimate how likely each of the outcomes is and discard overrepresented outcomes at random until the samples match that distribution. In Prisoner’s Dilemma it will correctly know facts like my twin and me making the same decision, and therefore we’ll discard most of the samples where this is not the case.
I haven’t thought about this approach much and I’m not sure whether it has problems with spurious counterfactuals (is this clearly avoidable with ε-exploration and normalizing each action separately?) and/or weird linkage between the causal structure and the inductor. Because this method can’t be used to find the CTC-like loops, it feels less promising overall.
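The discard step of this first approach can be sketched as a small resampling routine. The `target` function here stands in for the GI’s probability estimates, which the text leaves unspecified; this is an illustration of the thinning mechanics only, not of the inductor itself.

```python
import random
from collections import Counter

def resample(samples, target, rng=random.random):
    """Thin Monte Carlo samples from the logic-blind causal model so the
    survivors match the distribution `target` assigned by the inductor.
    Each outcome o is kept with probability proportional to
    target(o) / empirical(o), scaled so the most under-represented
    outcome is always kept."""
    counts = Counter(samples)
    n = len(samples)
    ratio = {o: target(o) / (c / n) for o, c in counts.items()}
    m = max(ratio.values())
    return [o for o in samples if rng() < ratio[o] / m]
```

In the Prisoner’s Dilemma example, `target` would put almost no mass on outcomes where my twin and I decide differently, so nearly all such samples get discarded.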
Secondly, we can identify facts about larger computations implementing smaller ones and use this for our logical links, so that the physical world causing the logical facts instantiated in it becomes a special case of larger computations causing facts about the smaller computations they implement. Here the very large computation of physics contains the two smaller computations of me and of whoever is computing my decision (the latter of which also contains a simulation of the former). Adding these two nodes and three links gives the necessary structure; post-selection follows simply from logic being consistent (i.e. strictly speaking we add one node for every incoming edge to the new nodes and, if there is more than one, a common child that checks whether they’re all the same, and we post-select on it saying “same”).
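The graph surgery just described can be sketched on a toy parents-list representation of the causal graph. All names here (`add_logical_fact`, the node-naming scheme) are hypothetical; the point is only the shape of the construction: one logical node per instantiating edge, plus one common consistency child to post-select on.

```python
def add_logical_fact(graph, instances, fact_name):
    """graph: dict mapping each node to its list of parents.
    instances: nodes claimed to implement the same smaller computation.
    Adds one logical-fact node per instance (caused by that instance) and,
    if there is more than one, a common child checking they all agree;
    downstream we would condition on that child reading 'same'."""
    logical = [f"{fact_name}@{inst}" for inst in instances]
    for node, inst in zip(logical, instances):
        graph[node] = [inst]  # the concrete instance causes the logical fact
    if len(logical) > 1:
        # The post-selection node: a common child of all the logical copies.
        graph[f"{fact_name}=consistent"] = logical
    return graph
```

In the Newcomb-like setup above, the instances would be me and the predictor’s simulation of me, and conditioning on the consistency child enforces that the two run the same way.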
This approach feels closer to my intuitions about how to make decisions, but also feels very incomplete. Finding all sub-computations of a computation seems somewhat tractable, but it’ll be tricky both to locate all the relevant computations (a robot moving around means that its memory locations don’t correspond to fixed physical locations, so its computations can’t be obtained by merely forgetting details from our simulation of physics) and not to be overwhelmed by irrelevant translations (cf. David Chalmers’s paper Does a Rock Implement Every Finite-State Automaton? for an example of what we don’t want).
It now looks like we’ll have to deal with lots and lots of different sub-computations, but there are two helpful facts to note here: (i) some of these sub-computations factorize (e.g. if I use a sorting algorithm in my computation and someone else simulates me, then we can find entanglement through the fact that the sorting algorithm works the same way in both places, but this fact is completely captured by me and the simulation of me working the same way, and is thus redundant); (ii) we don’t care about getting all sub-computations (which we do want when thinking about ontologies in general, because they could contain facts relevant to our utility function/morality), only those that give rise to decision-relevant entanglement between different parts of our causal graph.
Finally, there’s the problem of reasoning about programs without simulating them: abstract reasoning isn’t picked up as a sub-computation under any reasonable notion of sub-computation, but the difference between simulating and abstractly reasoning should just be an implementation detail and not decision-relevant.
It looks like there are also problems common to all approaches along these lines: we have to derive the high-level causal structure of the world (which would require multi-layer world models for all but the simplest problems), and once we’ve done this there’s the artificial boundary between knowledge that’s encoded as the causal structure and knowledge about the distribution of the nodes. The latter doesn’t seem very troubling to me: it actually seems to echo the phenomenon seen in PSRL/Thompson sampling of alternating stages of inference and optimization. In fact, this duality between epistemic and instrumental rationality is starting to feel as important as the duality between proof and counter-example in my thinking about thinking.
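The PSRL/Thompson-sampling alternation mentioned above can be seen in its simplest form on a Bernoulli bandit: each round we infer (sample a world model from the posterior) and then optimize (act greedily in that sampled model). This is a standard textbook sketch, not part of the proposal itself.

```python
import random

def thompson_bandit(pull, n_arms, rounds, rng=random):
    """Thompson sampling with Beta(1,1) priors on each arm's win rate.
    pull(a) plays arm a and returns True on a win."""
    wins = [1] * n_arms    # Beta posterior parameters per arm
    losses = [1] * n_arms
    for _ in range(rounds):
        # Inference step: draw one posterior sample per arm.
        theta = [rng.betavariate(wins[a], losses[a]) for a in range(n_arms)]
        # Optimization step: act optimally in the sampled model.
        a = max(range(n_arms), key=theta.__getitem__)
        if pull(a):
            wins[a] += 1
        else:
            losses[a] += 1
    return wins, losses
```

The epistemic step (posterior sampling) and the instrumental step (greedy choice) are cleanly separated, which is the duality the paragraph points at.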