cubefox comments on Reflectively stable consequentialists are expected utility maximisers

cubefox 9 May 2026 2:06 UTC
3 points
I have a comment on this definition of consequentialism here:
The behaviour of an agent is called consequentialist if it is a function from the list of lotteries of whichever scenario they face to a distribution over actions. Phrased another way, given a consequentialist agent and a scenario, the probability distribution over actions is completely determined by the (ordered) list of lotteries of that scenario.
For example, an agent that always picks the same action is a consequentialist. So is one that always picks an action uniformly at random. So is one that assigns numbers to outcomes and picks the action with the expected value closest to 12. Any function to valid probability distributions over actions is allowed, no matter how nonsensical.
If the action distributions depend on anything else, the behaviour is not consequentialist.
It seems to me that “consequentialism” here is either pretty trivially correct, under an interpretation where consequentialism and utilitarianism essentially refer to the same thing, or pretty clearly false, under an interpretation where the two are quite distinct concepts.
The way I see it, the distinction depends on what we mean by an “outcome”.
Let me begin with the “trivial” case. In Richard Jeffrey’s utility theory (also often identified with evidential decision theory), there is a utility function over a Boolean algebra of propositions, in addition to a probability function familiar from probability theory. So all the basic objects of the theory are propositions, and things like actions, outcomes, and states of the world are derived notions.
Jeffrey adds the following utility axiom to the standard Kolmogorov probability axioms:
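If $P(A \land B) = 0$ and $P(A \lor B) \neq 0$ then $U(A \lor B) = \frac{P(A)\,U(A) + P(B)\,U(B)}{P(A) + P(B)}$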
This says that the utility of a mutually exclusive disjunction is the probability weighted average of its disjuncts, normalized by the probability of the disjunction (in the denominator) itself. I think this is relatively uncontroversial.
Then there is a theorem (I think Jeffrey proves it in his book) which states:
$U(A) = P(B \mid A)\,U(A \land B) + P(\neg B \mid A)\,U(A \land \neg B)$
Here A and B can be any propositions. But since any action can also be described as a proposition (“I do X”), we can assume here that A is an action. We can also assume that B describes an arbitrary “state of the world” (in Savage’s terminology). Then the expressions $A \land B$ and $A \land \neg B$ in the two terms of the sum can be described as two “outcomes”.
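As a rough sketch of why such a theorem would follow from the utility axiom: the derivation below assumes the axiom has the usual Jeffrey form (the utility of a mutually exclusive disjunction is the probability-weighted average of its disjuncts), that logically equivalent propositions receive the same utility, and that $P(A) > 0$. It is a reconstruction for illustration, not a quotation from Jeffrey's book.

```latex
% A is logically equivalent to (A and B) or (A and not B), and the two
% disjuncts are mutually exclusive, so the utility axiom applies to them.
\begin{align*}
U(A) &= U\big((A \land B) \lor (A \land \neg B)\big) \\
     &= \frac{P(A \land B)\,U(A \land B) + P(A \land \neg B)\,U(A \land \neg B)}
             {P(A \land B) + P(A \land \neg B)} \\
     &= \frac{P(A \land B)}{P(A)}\,U(A \land B)
      + \frac{P(A \land \neg B)}{P(A)}\,U(A \land \neg B) \\
     &= P(B \mid A)\,U(A \land B) + P(\neg B \mid A)\,U(A \land \neg B)
\end{align*}
```

The third line uses $P(A \land B) + P(A \land \neg B) = P(A)$, and the last line is just the definition of conditional probability.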
In this sense, consequentialism, as defined by you, is pretty trivially satisfied. The utility of the action A trivially depends only on the utility of the outcomes $A \land B$ and $A \land \neg B$, because the action is defined to be part of the outcomes.
For example, say action A is “I climb the nearby mountain” and state of the world B is “The weather is sunny”. Then the two possible outcomes are “I climb the nearby mountain and the weather is sunny” and “I climb the nearby mountain and the weather is not sunny”. Then even if I have an intrinsic preference for climbing mountains which isn’t fully determined by the consequences, your definition of consequentialism would be satisfied, because your definition talks about “outcomes”, and in this case outcomes and actions are not separate, since the outcomes include the action. So the utility of the action can be calculated from the (probability weighted) utilities of the outcomes alone. But that doesn’t mean that we can’t assign a utility to A that isn’t fully determined by the expected utilities of the causal consequences of A. Outcomes here aren’t the same as literal consequences.
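The mountain example can be checked numerically with a small toy model. The sketch below is only an illustration under stated assumptions: the world probabilities, the `CLIMB_BONUS` value, and all function names are hypothetical, and the utility of a proposition is modelled as a conditional expectation over worlds (one way of satisfying Jeffrey's averaging axiom). It shows that an intrinsic liking for climbing is absorbed into the utilities of the "outcomes", so the utility of the action is still recovered from them.

```python
from itertools import product

# Hypothetical toy model: a world is a pair (climb, sunny).
WORLDS = list(product([True, False], repeat=2))

# Assumed probabilities over the four worlds (they sum to 1).
P = {
    (True, True): 0.3,
    (True, False): 0.2,
    (False, True): 0.3,
    (False, False): 0.2,
}

CLIMB_BONUS = 5.0  # assumed intrinsic liking for the act of climbing


def world_utility(climb, sunny):
    """Utility of a single world: intrinsic climbing bonus plus weather value."""
    return (CLIMB_BONUS if climb else 0.0) + (2.0 if sunny else -1.0)


def prob(prop):
    """Probability of a proposition, given as a predicate on worlds."""
    return sum(P[w] for w in WORLDS if prop(*w))


def utility(prop):
    """Utility of a proposition as the conditional expectation of world utility."""
    p = prob(prop)
    if p == 0:
        raise ValueError("utility is undefined for probability-zero propositions")
    return sum(P[w] * world_utility(*w) for w in WORLDS if prop(*w)) / p


def A(climb, sunny):            # "I climb the nearby mountain"
    return climb


def A_and_B(climb, sunny):      # "I climb and the weather is sunny"
    return climb and sunny


def A_and_not_B(climb, sunny):  # "I climb and the weather is not sunny"
    return climb and not sunny


# Left-hand side: utility of the action itself.
lhs = utility(A)

# Right-hand side: probability-weighted utilities of the two "outcomes".
rhs = (prob(A_and_B) / prob(A)) * utility(A_and_B) \
    + (prob(A_and_not_B) / prob(A)) * utility(A_and_not_B)

print(lhs, rhs)
```

Both sides agree even though `CLIMB_BONUS` is not a causal consequence of anything: it enters through the outcomes only because each outcome already contains the action.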
Now, for the non-trivial interpretation: assume that outcomes are just consequences (in contrast to Jeffrey’s theory), without including the action itself. Then consequentialism seems clearly false in general, because I might want to climb a mountain (an action) without doing it to achieve some consequence. More details in this post here.
So the utility of the action can no longer be derived from the (probability weighted) utility of the outcomes (consequences). Then consequentialism, according to your definition, is false.
In the first, trivial, case (Jeffrey’s theory) utilitarianism (picking the action with the highest utility) would be the same as “consequentialism” (picking the action with the highest utility, and the utility of the action is only determined by the utilities/probabilities of the outcomes).
In the non-trivial case, where outcomes are actual consequences without including actions, this would no longer be plausible, as explained in the post linked above, and in its comment section. (If need be, I can also add some additional examples where we clearly assign utilities to actions that are not fully determined by their expected consequences.)
So it seems your axiom 1 is either trivially true or pretty clearly false, depending on how we define “outcome”.
Well, at least I think so. I’m interested in what you think about this argument.
- Pedro Afonso 9 May 2026 15:56 UTC
  3 points
  For the purposes of the theorem an outcome is just a meaningless element of a finite set over which we can define probability distributions. Whether or not the theorem applies to some actual physical agent does depend on how we define what an outcome is. Notice that the definition of reflective stability requires an agent to behave a certain way in all succession setups, and therefore we must consider all scenarios. So, if “I climb the nearby mountain” is an action, and “I don’t climb the nearby mountain and the weather is sunny” is an outcome, it must be possible to create a scenario where climbing the nearby mountain has a 100% chance of resulting in not climbing the nearby mountain, and the weather being sunny. You must choose some partition of trajectories of the world into outcomes such that it is in principle possible to create any scenario, and if your agent is a reflectively stable consequentialist with respect to that partition, the theorem says that it will also be an expected utility maximiser with respect to it. Partitioning the trajectories is actually forced by the fact that reality is continuous but the theorem only works with a finite set of outcomes.
  Consequentialism is also not as clearly false about people as you make it out to be (although it is false). “I climb the nearby mountain” is clearly not an action that can be taken in an arbitrary situation, whereas we are assuming that the set of available actions is exactly the same in all scenarios. What is always available is some discretization of the set of possible signals we can send to our muscles at some instant. In the example I gave of a corporation, the first action is not “choose the first possible successor”, it is just action 1, which in that scenario results in choosing a successor, but in some other scenario can result in choosing a different successor, or maybe walking two steps right. You can choose different levels of granularity depending on the kind of thing you are trying to describe. In the case of climbing a mountain, I think that considering it as an atomic action, or even as an action at all is not the right way of seeing things. The vast, vast majority of all the possible actual action sequences someone can take when in front of a mountain simply result in them falling to the ground. Achieving the goal of climbing a mountain is not trivial, and it requires different actions depending on what happens to be in front of them, so it should be considered a consequence instead.
  You can of course insist that you care directly about whatever you define an action to be, and that therefore it must be considered as a part of the outcomes too if we want to be viewed as consequentialists. I think that’s probably not unreasonable, but it does break the theorem. Purely behavioral theorems are insufficient to describe human values. Reasoning clearly about how someone can both want to climb a mountain and reach the top will require thinking about the mechanisms in common between both kinds of goals, which will break the setup in many other ways.
  I hope this made sense.
  - cubefox 18 May 2026 19:44 UTC
    2 points
    So, if “I climb the nearby mountain” is an action, and “I don’t climb the nearby mountain and the weather is sunny” is an outcome, it must be possible to create a scenario where climbing the nearby mountain has a 100% chance of resulting in not climbing the nearby mountain, and the weather being sunny.
    Okay, but this action/outcome combination is logically impossible because it would require both climbing and not climbing the mountain. According to Jeffrey’s theory, “I don’t climb the nearby mountain and the weather is sunny” can’t be an outcome, because if $A$ is an action, only events of the type $A \land B$ and $A \land \neg B$ are outcomes, but $\neg A \land B$ or $\neg A \land \neg B$ are not. This makes sense, since $A \land B$ and $A \land \neg B$ form a logical partition of $A$, i.e., they are 1) mutually exclusive and 2) jointly logically equivalent to $A$.
    However, this is not the case if we (unlike Jeffrey) regard outcomes as consequences, since consequences generally don’t imply their causes (including actions). The same set of macroscopic end states (outcomes) could have been produced by different initial states (causes). This has to do with the general increase in entropy over time, a basic fact about causality. Stirring or shaking a drink are two different actions, but they may result macroscopically in the exact same causal outcome. Likewise, the outcome of a broken egg doesn’t imply the action which broke it.
    In that realistic view, possible outcomes are not jointly logically equivalent to their actions, which would mean two agents who assign the same expected utilities to the outcomes of a scenario need not assign the same utility (or probability) to their corresponding actions. They may inherently like some action for its own sake, and that needn’t be reflected in different evaluations of the outcomes—in contrast to Jeffrey’s theory, where the “outcomes” always imply their actions.
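The stirring/shaking case can be sketched in a few lines. This is a hypothetical toy model, not anything from the post or from Jeffrey: the lottery table, the utility numbers, and the idea of an additive "intrinsic" term per action are all assumptions made for illustration. Outcomes here are pure consequences and do not contain the action.

```python
# Both actions lead to the same lottery over consequences (assumed values).
LOTTERIES = {
    "stir": {"mixed drink": 1.0},
    "shake": {"mixed drink": 1.0},
}

# Both agents are assumed to value the consequence identically.
OUTCOME_UTILITY = {"mixed drink": 1.0}


def expected_outcome_utility(action):
    """Probability-weighted utility of the consequences of an action."""
    return sum(p * OUTCOME_UTILITY[o] for o, p in LOTTERIES[action].items())


def action_utility(action, intrinsic):
    """Utility of an action: expected consequences plus an intrinsic term."""
    return intrinsic.get(action, 0.0) + expected_outcome_utility(action)


def best_actions(intrinsic):
    """All actions with maximal utility for an agent, in sorted order."""
    scores = {a: action_utility(a, intrinsic) for a in LOTTERIES}
    top = max(scores.values())
    return sorted(a for a, s in scores.items() if s == top)


agent_1 = {}              # cares only about consequences
agent_2 = {"shake": 0.5}  # additionally enjoys shaking for its own sake

print(best_actions(agent_1))
print(best_actions(agent_2))
```

The two agents agree on every outcome utility and face identical lotteries, yet the first is indifferent between the actions while the second strictly prefers shaking. The difference lives entirely in the intrinsic term, which the outcome utilities do not capture.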
    Consequentialism is also not as clearly false about people as you make it out to be (although it is false). “I climb the nearby mountain” is clearly not an action that can be taken in an arbitrary situation, whereas we are assuming that the set of available actions is exactly the same in all scenarios. What is always available is some discretization of the set of possible signals we can send to our muscles at some instant.
    I think pure “muscle signals” is almost never what is meant in decision theory when talking about “actions”. An action is usually something like “taking an umbrella”, which is not always available. And strictly speaking, someone might have a stroke or a temporary paralysis and lose some of his muscle signals.
    Moreover, even for actual muscle signals, someone might simply have a preference for moving his legs in some situation (someone after a long flight, say, or a dancer, or an excited child), apart from any expected consequences/outcomes, which would already violate consequentialism.
    (I agree that climbing a mountain is not usefully described as an action if there is a realistic chance of failure, in which case it would have to be modeled as an outcome.)