This is a probabilistic approach to logical counterfactuals inspired by the Demski prior. Given a sentence ϕ, we wish to ask what other logical sentences are true in the world C(ϕ) where we correctly counterfactually assume ϕ. I spoke about this before here and here, but now I am going to approach the problem slightly differently. Instead of insisting that every other sentence ψ is either true or false in the counterfactual world C(ϕ), I allow C(ϕ) to assign probabilities to each sentence.
When you try to describe counterfactual worlds as sets of true and false sentences, you necessarily have very sharp boundaries. You will have, for example, counterfactual worlds where ϕ and ψ are true, but ϕ∧ψ is false. Probabilities seem to be more realistic by allowing us to smooth the boundaries. As you look at sentences which are more and more logically distinct from ϕ, you can gradually change probabilities so that they will represent the truth, rather than representing consequences of ϕ.
Let μ be a measure on logical sentences, for example, μ(ϕ)=2−ℓ(ϕ), where ℓ(ϕ) is the number of bits necessary to encode ϕ. Let TN be the theory containing all sentences true about the natural numbers. Consider the following procedure which computes P(ψ|ϕ). This definition is only for ϕ which are consistent by themselves.
Let T0 be the theory containing only the sentence ϕ. For n≥0, compute Tn+1 from Tn by sampling a sentence tn according to μ. If both TN+tn and Tn+tn are consistent, then Tn+1=Tn+tn. Otherwise, Tn+1=Tn. Let T∞ be the union of all of the Tn. P(ψ|ϕ) is the probability that ψ is a consequence of T∞.
As it is, this procedure is not approximable, but you can make a similar thing approximable by replacing TN with PA, or a complete theory sampled from your favorite approximable distribution.
Claim: P(⋅|ϕ) gives a coherent probability assignment which assigns probability 1 to ϕ, and thus represents a probability distribution on complete theories.
Proof: The probability distribution on complete theories is exactly the distribution on T∞. All we need to show is that T∞ is complete (with probability 1). Take a sentence ψ. Either ψ or ¬ψ is consistent with TN. WLOG assume ψ is consistent with TN. The sentence ψ is eventually sampled at some time n. Either ψ is added to Tn, or it is inconsistent with Tn. Therefore, either Tn proves ¬ψ, or Tn+1 proves ψ, so T∞ proves either ψ or ¬ψ. Note that T∞ does not necessarily contain either ψ or ¬ψ as an axiom.
Now I will give the reasons I am considering this proposal. None of the following is stuff I actually know to be true. I think it is plausible that my intuitions about the result of this procedure are very wrong.
It seems that true sentences will generally have high probabilities. Thus, if ϕ and ψ are both complex sentences, and there is a simple proof of ϕ→ψ, it is likely that many true sentences will be sampled and accepted before ψ is considered. Thus, it seems plausible that sufficiently many simple axioms to complete the simple proof of ϕ→ψ will be accepted before ψ is considered. If this happens, ψ will automatically be included. It seems then that ψ will have a high probability.
Thus it is plausible that this proposal follows the spirit of the conjecture that simple proofs of ϕ→ψ correspond to legitimate counterfactuals. Note that this informal argument only goes the one direction. If If there is a simple proof of ϕ→ψ, it seems likely that P(ψ|ϕ) will be large, but it does not seem likely that P(ϕ|ψ) will be large. This in consistent with my idea of logical counterfactuals.
Proof-length based definitions of counterfactuals usually have the unfortunate property that they are dependent on the formalities of our proof system, and equivalent proof systems can give very different proof lengths. Perhaps this proposal can get many of the nice properties of proof length based systems, while being independent choosing different equivalent ways to carry out proofs.
An approach to logical counterfactuals inspired by the Demski prior
This is a probabilistic approach to logical counterfactuals inspired by the Demski prior. Given a sentence ϕ, we wish to ask what other logical sentences are true in the world C(ϕ) where we correctly counterfactually assume ϕ. I spoke about this before here and here, but now I am going to approach the problem slightly differently. Instead of insisting that every other sentence ψ is either true or false in the counterfactual world C(ϕ), I allow C(ϕ) to assign probabilities to each sentence.
When you try to describe counterfactual worlds as sets of true and false sentences, you necessarily have very sharp boundaries. You will have, for example, counterfactual worlds where ϕ and ψ are true, but ϕ∧ψ is false. Probabilities seem to be more realistic by allowing us to smooth the boundaries. As you look at sentences which are more and more logically distinct from ϕ, you can gradually change probabilities so that they will represent the truth, rather than representing consequences of ϕ.
Let μ be a measure on logical sentences, for example, μ(ϕ)=2−ℓ(ϕ), where ℓ(ϕ) is the number of bits necessary to encode ϕ. Let TN be the theory containing all sentences true about the natural numbers. Consider the following procedure which computes P(ψ|ϕ). This definition is only for ϕ which are consistent by themselves.
Let T0 be the theory containing only the sentence ϕ. For n≥0, compute Tn+1 from Tn by sampling a sentence tn according to μ. If both TN+tn and Tn+tn are consistent, then Tn+1=Tn+tn. Otherwise, Tn+1=Tn. Let T∞ be the union of all of the Tn. P(ψ|ϕ) is the probability that ψ is a consequence of T∞.
As it is, this procedure is not approximable, but you can make a similar thing approximable by replacing TN with PA, or a complete theory sampled from your favorite approximable distribution.
Claim: P(⋅|ϕ) gives a coherent probability assignment which assigns probability 1 to ϕ, and thus represents a probability distribution on complete theories.
Proof: The probability distribution on complete theories is exactly the distribution on T∞. All we need to show is that T∞ is complete (with probability 1). Take a sentence ψ. Either ψ or ¬ψ is consistent with TN. WLOG assume ψ is consistent with TN. The sentence ψ is eventually sampled at some time n. Either ψ is added to Tn, or it is inconsistent with Tn. Therefore, either Tn proves ¬ψ, or Tn+1 proves ψ, so T∞ proves either ψ or ¬ψ. Note that T∞ does not necessarily contain either ψ or ¬ψ as an axiom.
Now I will give the reasons I am considering this proposal. None of the following is stuff I actually know to be true. I think it is plausible that my intuitions about the result of this procedure are very wrong.
It seems that true sentences will generally have high probabilities. Thus, if ϕ and ψ are both complex sentences, and there is a simple proof of ϕ→ψ, it is likely that many true sentences will be sampled and accepted before ψ is considered. Thus, it seems plausible that sufficiently many simple axioms to complete the simple proof of ϕ→ψ will be accepted before ψ is considered. If this happens, ψ will automatically be included. It seems then that ψ will have a high probability.
Thus it is plausible that this proposal follows the spirit of the conjecture that simple proofs of ϕ→ψ correspond to legitimate counterfactuals. Note that this informal argument only goes the one direction. If If there is a simple proof of ϕ→ψ, it seems likely that P(ψ|ϕ) will be large, but it does not seem likely that P(ϕ|ψ) will be large. This in consistent with my idea of logical counterfactuals.
Proof-length based definitions of counterfactuals usually have the unfortunate property that they are dependent on the formalities of our proof system, and equivalent proof systems can give very different proof lengths. Perhaps this proposal can get many of the nice properties of proof length based systems, while being independent choosing different equivalent ways to carry out proofs.