From Comment on decision theory:

We aren’t working on decision theory in order to make sure that AGI systems are decision-theoretic, whatever that would involve. We’re working on decision theory because there’s a cluster of confusing issues here (e.g., counterfactuals, updatelessness, coordination) that represent a lot of holes or anomalies in our current best understanding of what high-quality reasoning is and how it works.

From On the purposes of decision theory research:
Gaining information about the nature of rationality (e.g., is “realism about rationality” true?) and the nature of philosophy (e.g., is it possible to make real progress in decision theory, and if so what cognitive processes are we using to do that?), and helping to solve the problems of normativity, meta-ethics, and metaphilosophy.
Better understanding potential AI safety failure modes that are due to flawed decision procedures implemented in or by AI.
Making progress on various seemingly important intellectual puzzles that seem directly related to decision theory, such as free will, anthropic reasoning, logical uncertainty, Rob’s examples of counterfactuals, updatelessness, and coordination, and more.
Thanks for the answer. It clarifies a little bit, but I still feel like I don’t fully grasp its relevance to alignment. I have the impression that there’s more to the story than just that?