One part of it is that I want to scrap classical (“static”) decision theory and move to a more learning-theoretic (“dynamic”) view.
Can you explain more what you mean by this, especially “learning-theoretic”? I’ve looked at learning theory a bit and the typical setup seems to involve a loss or reward that is immediately observable to the learner, whereas in decision theory, utility can be over parts of the universe that you can’t see and therefore can’t get feedback from, so it seems hard to apply typical learning theory results to decision theory. I wonder if I’m missing the whole point though… What do you think are the core insights or ideas of learning theory that might be applicable to decision theory?
(I don’t speak for Abram but I wanted to explain my own opinion.) Decision theory asks, given certain beliefs an agent has, what is the rational action for em to take. But, what are these “beliefs”? Different frameworks have different answers for that. For example, in CDT a belief is a causal diagram. In EDT a belief is a joint distribution over actions and outcomes. In UDT a belief might be something like a Turing machine (inside the execution of which the agent is supposed to look for copies of emself). Learning theory allows us to gain insight through the observation that beliefs must be learnable: otherwise, how would the agent come up with these beliefs in the first place? There might be parts of the beliefs that come from the prior and cannot be learned, but still, at least the type signature of beliefs should be compatible with learning.
Moreover, decision problems are often implicitly described from the point of view of a third party. For example, in Newcomb’s paradox we postulate that Omega can predict the agent, which makes perfect sense for an observer looking from the side, but might be difficult to formulate from the point of view of the agent itself. Therefore, understanding decision theory requires the translation of beliefs from the point of view of one observer to the point of view of another. Here, too, learning theory can help us: we can ask, what are the beliefs Alice should expect Bob to learn given particular beliefs of Alice about the world? From a slightly different angle, the central source of difficulty in decision theory is the notion of counterfactuals, and the attempt to ascribe a particular meaning to them, which different decision theories do differently. Instead, we can just postulate that, from the subjective point of view of the agent, counterfactuals are ontologically basic. The agent believes emself to have free will, so to speak. Then, the interesting question is, what kind of counterfactuals are produced by the translation of beliefs from the perspective of a third party to the perspective of the given agent.
Indeed, thinking about learning theory led me to the notion of quasi-Bayesian agents (agents that use incomplete/fuzzy models), and quasi-Bayesian agents automatically solve all Newcomb-like decision problems. In other words, quasi-Bayesian agents are effectively a rigorous version of UDT.
Incidentally, to align AI we literally need to translate beliefs from the user’s point of view to the AI’s point of view. This is also solved via the same quasi-Bayesian approach. In particular, this translation process preserves the “point of updatelessness”, which, in my opinion, is the desired result (the choice of this point is subjective).
My thinking is somewhat similar to Vanessa’s. I think a full explanation would require a long post in itself. It’s related to my recent thinking about UDT and commitment races. But, here’s one way of arguing for the approach in the abstract.
You once asked:
Assuming that we do want to be pre-rational, how do we move from our current non-pre-rational state to a pre-rational one? This is somewhat similar to the question of how do we move from our current non-rational (according to ordinary rationality) state to a rational one. Expected utility theory says that we should act as if we are maximizing expected utility, but it doesn’t say what we should do if we find ourselves lacking a prior and a utility function (i.e., if our actual preferences cannot be represented as maximizing expected utility).
The fact that we don’t have good answers for these questions perhaps shouldn’t be considered fatal to pre-rationality and rationality, but it’s troubling that little attention has been paid to them, relative to defining pre-rationality and rationality. (Why are rationality researchers more interested in knowing what rationality is, and less interested in knowing how to be rational? Also, BTW, why are there so few rationality researchers? Why aren’t there hordes of people interested in these issues?)
My contention is that rationality should be about the update process. It should be about how you adjust your position. We can have abstract rationality notions as a sort of guiding star, but we also need to know how to steer based on those.
Some examples:
Logical induction can be thought of as the result of performing this transform on Bayesianism; it describes belief states which are not coherent, and gives a rationality principle about how to approach coherence—rather than just insisting that one must somehow approach coherence.
Evolutionary game theory is more dynamic than the Nash story. It concerns itself more directly with the question of how we get to equilibrium. Strategies which work better get copied. We can think about the equilibria, as we do in the Nash picture; but, the evolutionary story also lets us think about non-equilibrium situations. We can think about attractors (equilibria being point-attractors, vs orbits and strange attractors), and attractor basins; the probability of ending up in one basin or another; and other such things.
However, although the model seems good for studying the behavior of evolved creatures, something does seem to be missing for artificial agents learning to play games; we don’t necessarily want to posit a population that is selected on in that way.
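To make the attractor-and-basin picture a bit more concrete, here is a minimal sketch using the replicator dynamics, one standard model from evolutionary game theory in which strategies that do better than average grow in population share. The game (a Stag Hunt with the payoffs below), the step size, and all function names are assumptions chosen purely for illustration; nothing here is specific to the argument above.

```python
# Illustrative sketch only: the payoff matrix, step size and names are assumptions.

def replicator_step(shares, payoff, dt=0.01):
    """One Euler step of the replicator dynamics on a list of population shares."""
    n = len(shares)
    # Expected payoff of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(shares[i] * fitness[i] for i in range(n))
    # Strategies that beat the average grow; the rest shrink.
    updated = [shares[i] + dt * shares[i] * (fitness[i] - mean_fitness) for i in range(n)]
    total = sum(updated)  # renormalise to guard against floating-point drift
    return [s / total for s in updated]


def simulate(start, payoff, steps=5000, dt=0.01):
    """Run the dynamics from a starting mix and return the final mix."""
    shares = list(start)
    for _ in range(steps):
        shares = replicator_step(shares, payoff, dt)
    return shares


# Stag Hunt: row 0 = hunt stag, row 1 = hunt hare (hypothetical payoffs).
STAG_HUNT = [[4.0, 0.0],
             [3.0, 3.0]]

if __name__ == "__main__":
    for stag_share in (0.8, 0.7):
        final = simulate([stag_share, 1.0 - stag_share], STAG_HUNT)
        print(stag_share, "->", [round(s, 3) for s in final])
```

With these payoffs there are two point-attractors (everyone hunts stag, everyone hunts hare) separated by an unstable mixed equilibrium at a stag share of 3/4. Which attractor the population reaches depends only on which side of that boundary it starts, which is the kind of non-equilibrium question the static Nash picture does not address.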
The complete class theorem describes utility-theoretic rationality as the end point of taking Pareto improvements. But, we could instead think about rationality as the process of taking Pareto improvements. This lets us think about (semi-)rational agents whose behavior isn’t described by maximizing a fixed expected utility function, but who develop one over time. (This model in itself isn’t so interesting, but we can think about generalizing it; for example, by considering the difficulty of the bargaining process—subagents shouldn’t just accept any Pareto improvement offered.)
Again, this model has drawbacks. I’m definitely not saying that by doing this you arrive at the ultimate learning-theoretic decision theory I’d want.
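As a toy illustration of "rationality as the process of taking Pareto improvements", the sketch below models an agent as a set of subagents, each with its own utility for every option, and repeatedly moves to any option that no subagent likes less and at least one likes more. The options, utilities, and function names are all hypothetical. It only shows the process ending at a Pareto-optimal option; it does not reproduce the complete class theorem's step from Pareto-optimality to expected utility maximization, which relies on further assumptions such as mixed options.

```python
# Illustrative sketch only: options, utilities and names are hypothetical.

def is_pareto_improvement(old, new):
    """True if no subagent is worse off under `new` and at least one is better off."""
    pairs = list(zip(old, new))
    return all(n >= o for o, n in pairs) and any(n > o for o, n in pairs)


def improve_until_stuck(start, options):
    """Follow Pareto improvements from `start`; return the list of options visited.

    `options` maps an option name to a tuple of utilities, one per subagent.
    The first available improvement (in dict order) is always taken.
    """
    path = [start]
    current = start
    while True:
        better = [name for name, utils in options.items()
                  if is_pareto_improvement(options[current], utils)]
        if not better:
            return path  # no option improves on the current one for everybody
        current = better[0]
        path.append(current)


# Two subagents; each tuple is (utility for subagent 1, utility for subagent 2).
OPTIONS = {
    "status_quo": (1, 1),
    "a": (2, 1),
    "b": (1, 3),
    "c": (3, 2),
    "d": (2, 4),
}

if __name__ == "__main__":
    print(improve_until_stuck("status_quo", OPTIONS))
    print(improve_until_stuck("b", OPTIONS))
```

In this toy setup the end point depends on where the agent starts and on which improvement it accepts first, so the trade-off between subagents that the agent ends up with is something it develops along the way rather than something fixed in advance.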