Time’s arrow ⇒ decision theory
Link post
Debates on which decision theory (EDT/CDT/UDT/FDT/etc.) is “rational” seem to revolve around how one should model “free will”. Do we optimize individual actions or entire policies? Do we model our choice as an evidential update or a causal intervention?
Physics tells us that the Universe evolves deterministically and reversibly, so that the microscopic laws do not distinguish past from future. From the present state of the Universe, Laplace’s demon can accurately predict both the past and the future. It makes no sense to consider alternative actions or policies that you could have taken, because no such actions or policies exist. And yet, free will is an empirical fact: that is to say, our world is full of agents whose actions and policies optimize for some notion of future well-being.
Our cover article presents, I believe for the first time, a mathematically rigorous class of dynamical systems that resolves the century-old paradox between microscopic reversibility and macroscopic causality. Once the article establishes that the system’s macroscopic statistics are described by Bayes nets with timelike directed edges, Pearlean causal interventions arise as an effective way to model exogenous influences on a subsystem (illustrated in miniature after the sketch below). From there, a sketch of why Darwinian natural selection may create CDT agents is as follows:
1. For simplicity, we model ourselves as following a deterministic algorithm encoded in our genes. There is no magic force of “free will” from outside the Universe; instead, the algorithm determines our action.
2. Among algorithms of the form “Do action A” and “Do action B”, Darwinian selection will favor the one whose action reaps the greater reward.
3. Now consider a more complex environment, in which it would be infeasible to maintain a genetic lookup table of optimal actions. There, a successful algorithm might reason as follows: “Let me imagine that I were an agent that does action A, and simulate the results. Now let me imagine instead that I were an agent that does action B, and simulate the results. I cannot choose which kind of agent I am, but I see that I prefer the result of action B. Therefore, my action is B.” (A toy version of this reasoning is sketched below.)
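To make steps 2 and 3 concrete, here is a minimal sketch in Python. Everything in it is my own illustrative assumption: the two-state environment, the payoff numbers, and the names `payoff`, `select`, and `deliberate` are invented, not taken from the cover article.

```python
import random

random.seed(1)

def payoff(action, env):
    """Hypothetical fitness: the rewarding action is the one matching the environment."""
    return 1.0 if action == env else 0.1

# Step 2: hard-coded policies ("Do action A" vs. "Do action B") under selection.
def select(population, env, generations=50):
    for _ in range(generations):
        # Each survivor is drawn in proportion to its reward; population size is fixed.
        weights = [payoff(a, env) for a in population]
        population = random.choices(population, weights=weights, k=len(population))
    return population

survivors = select(["A"] * 50 + ["B"] * 50, env="B")
print(survivors.count("B"), "of", len(survivors), "survivors do B")  # B takes over

# Step 3: a deliberator that asks "what if I were an agent that does X?"
def deliberate(env, options=("A", "B")):
    # Deterministic: same inputs, same output. The "choice" is nothing more than
    # simulating each counterfactual agent and returning the action whose
    # simulated outcome scores best.
    return max(options, key=lambda a: payoff(a, env))

print("deliberator picks:", deliberate(env="B"))
```

Note that the deliberator evaluates its options with the same payoff simulation that selection scores populations with; that shared structure is the sense in which selection can be expected to produce agents that “choose”.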
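Separately, the contrast from the opening paragraph, evidential update versus causal intervention, can also be run in miniature. Below is a hedged sketch of Pearl’s do-operator on a hypothetical two-node net Rain → Sprinkler; the net and its probabilities are mine for illustration, not the dynamical systems of the cover article.

```python
import random

random.seed(0)

def sample(do_sprinkler=None):
    """One draw from a toy net Rain -> Sprinkler.
    Passing do_sprinkler overrides the sprinkler's mechanism (Pearl's do())."""
    rain = random.random() < 0.3
    if do_sprinkler is None:
        # Observational mechanism: rain makes the sprinkler less likely to run.
        sprinkler = random.random() < (0.1 if rain else 0.6)
    else:
        # Intervention: sever the edge from Rain and force the value.
        sprinkler = do_sprinkler
    return rain, sprinkler

N = 200_000
obs = [sample() for _ in range(N)]
# Evidential update: condition on having *observed* the sprinkler on.
p_rain_given_s = sum(r for r, s in obs if s) / sum(s for _, s in obs)
# Causal intervention: *force* the sprinkler on, then look at rain.
intv = [sample(do_sprinkler=True) for _ in range(N)]
p_rain_do_s = sum(r for r, _ in intv) / N

print(f"P(rain | sprinkler on)     = {p_rain_given_s:.2f}")  # ~0.07: evidence about rain
print(f"P(rain | do(sprinkler on)) = {p_rain_do_s:.2f}")     # ~0.30: prior unchanged
```

Observing the sprinkler on is evidence against rain, but switching it on yourself says nothing about the weather. That asymmetry is exactly what separates EDT-style conditioning from the CDT-style intervention the sketch above relies on.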
I argue that “free will” is precisely this process of considering multiple alternatives and outputting the action whose outcome best meets some internal criteria. Note that the algorithm deterministically chooses its preferred action B. Thus, there is no contradiction between determinism and choice.
I am not claiming that CDT is always optimal; it’s just the most straightforward to analyze. In settings containing Newcomb problems, CDT agents may well be selected against. Nonetheless, I hope the concepts presented here are helpful in furthering the development of decision theory. Indeed, I think that debates involving decision theory often get stuck on preconceived notions of physicality or rationality.
As a more grounded alternative, I propose that a theory of rational agency should be based on what’s naturally selected for, perhaps in some kind of idealized setting. Some of the comments on my previous post suggest this might not work in a post-ASI world, if the ASI is too powerful to face selective pressures. So now I ask: is it meaningful to define rationality in such a world, where the above arguments fall apart? Can the role of selection instead be played by the training methodology that produces the ASI?