This is a formalization of the decision procedure corresponding to the informal solution I gave in another comment (it includes a lot of detail unnecessary for this problem, but the details are kept in order to demonstrate the method):
Programs for the participants:
P—player
O—Omega deciding whether to make the offer
A—Alpha
Notation: [[X]] is the output of program X; X(Y) is the program obtained by composing X with Y, where X expects a program Y as its argument. Thus, [[X(Y)]] is the output of X given the program Y as argument, while X([[Y]]) is X given only the output of Y (but not Y itself).
[[A]] is the contents of the envelope (true/false, or 1/0), [[O(P,A)]] is Omega's decision to appear, and [[P(O)]] is the player's decision.
Problem statement: Omega has appeared ([[O(P,A)]] is true) and asserts that either the envelope is full or the player takes its offer, but not both ([[A]] xor [[P(O)]] is true). P is money-maximizing. What is [[P(O)]]?
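To make the difference between X(Y) and X([[Y]]) concrete, here is a minimal Python sketch. It assumes programs are modeled as callables and that taking [[.]] means calling them; the toy programs `X` and `Y` below are purely illustrative and are not part of the problem.

```python
def Y():
    """A program; calling it yields its output [[Y]]."""
    return 3


def X(arg):
    """A program that reports what kind of argument it was handed."""
    # A callable argument means X was given the program itself,
    # so it could inspect or simulate it; otherwise it only sees a value.
    return "program" if callable(arg) else "output"


given_program = X(Y)    # [[X(Y)]]: X receives Y itself
given_output = X(Y())   # X([[Y]]) evaluated: X receives only Y's output
```

This is why it matters that Omega is written as O(P,A) rather than O([[P]],[[A]]): in the first form Omega can reason about the player's whole program, not just one of its outputs.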
First, P(O) constructs the expression for its payoff as a function of its own decision. (P(O) is the player that has observed Omega appearing, as distinguished from P, which might or might not see Omega; note that Omega is parameterized with the whole of P, not just P(O).) The player gets the contents of the envelope if [[A]], and Omega's money if it takes the offer and [[O(P,A)]]. Here P can be represented as {X,[[P(O)]]}, where X stands for the rest of P, while [[P(O)]] is specifically P's decision in the situation where P has observed Omega.
The payoff is
V([[P(O)]])=10^6*[[A]]+10*[[P(O)]]*[[O({X,[[P(O)]]},A)]],
the payoff function is
V(t)=10^6*[[A]]+10*t*[[O({X,t},A)]].
This may be taken as part of the problem statement, additionally specifying P(O) by stating that P(O) has the property of maximizing V([[P(O)]]).
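As a rough illustration, the payoff function V(t) can be written out in Python. This is only a sketch: the `omega` argument is a hypothetical stand-in for [[O({X,t},A)]] (the problem does not say what Omega's program actually is), and `omega_xor` is one made-up example of such a program, chosen to appear exactly when its own assertion would be true.

```python
def payoff(t, envelope_full, omega):
    """V(t) = 10^6*[[A]] + 10*t*[[O({X,t},A)]].

    t             -- counterfactual decision (True = take Omega's offer)
    envelope_full -- [[A]], whether the envelope is full
    omega         -- hypothetical stand-in for O: maps (t, envelope_full)
                     to True if Omega appears, False otherwise
    """
    appears = omega(t, envelope_full)
    return 10**6 * int(envelope_full) + 10 * int(t) * int(appears)


def omega_xor(t, envelope_full):
    """Illustrative Omega (an assumption): appears iff [[A]] xor t holds."""
    return t != envelope_full


v_take = payoff(True, False, omega_xor)     # empty envelope, offer taken
v_refuse = payoff(False, False, omega_xor)  # empty envelope, offer refused
```

Note that `t` enters twice, once directly and once through `omega`, mirroring how t appears both as a factor and inside O({X,t},A) in the formula.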
P(O) considers counterfactual actions for the role of [[P(O)]] (the counterfactuals are not assumed to be equal to [[P(O)]]; I use constants T=true and F=false to refer to them). That is, it’s computing
[[P(O)]]:=arg max V(t)
First, consider T=true (take Omega’s offer). The payoff is
V(T)=10^6*[[A]]+10*T*[[O({X,T},A)]]=10^6*[[A]]+10*[[O({X,T},A)]].
Second, consider F=false (refuse Omega’s offer). The payoff is
V(F)=10^6*[[A]]+10*F*[[O({X,F},A)]]=10^6*[[A]].
Clearly, even though we don't know what O(P,A) is, V(T)>=V(F), because [[O({X,T},A)]] is either 0 or 1 and so the extra term in V(T) is never negative. Therefore, [[P(O)]]=T (the player takes Omega's offer). Since [[A]] xor [[P(O)]] is true, it follows that [[A]]=false, so there is no point in opening the envelope.
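The dominance argument can also be checked by brute force. The sketch below makes two assumptions that are not in the problem statement: Omega's unknown behaviour is modeled as a lookup table from the counterfactual decision t to 0/1, and ties in the arg max are broken in favour of T (the argument above only establishes V(T)>=V(F), not strict inequality).

```python
from itertools import product

T, F = 1, 0


def payoff(t, a, omega):
    """V(t) = 10^6*[[A]] + 10*t*[[O({X,t},A)]], with omega[t] as Omega's response."""
    return 10**6 * a + 10 * t * omega[t]


def decide(a, omega):
    """arg max over t of V(t); T is listed first so ties resolve to T (assumption)."""
    return max((T, F), key=lambda t: payoff(t, a, omega))


# Every envelope state combined with every possible table for Omega.
cases = [(a, {F: o_f, T: o_t}) for a, o_f, o_t in product((0, 1), repeat=3)]

dominates = all(payoff(T, a, om) >= payoff(F, a, om) for a, om in cases)
decisions = {decide(a, om) for a, om in cases}

# Envelope states consistent with Omega's assertion [[A]] xor [[P(O)]] given T.
consistent_envelopes = [a for a in (0, 1) if a ^ T == 1]
```

Under these assumptions `dominates` holds in all eight cases, the only decision that ever comes out is T, and the only envelope state compatible with Omega's assertion is the empty one.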
I really like this formulation.