This is a writeup of the stuff Benja and I thought up during the logical uncertainty workshop in May.
An optimal predictor for (D,μ) allows assigning probabilities to logical sentences of the form x∈D but in general it doesn’t allow assigning probabilities to propositional formulae constructed from sentences of that form. Here we outline two approaches to defining such probabilities.
Consider D a language. Define BD⊆{0,1}∗ to be the set of words encoding propositional formulae whose variables are labeled by elements of {0,1}∗ such that the formula evaluates to truth if a variable labeled by x is substituted for the truth value of x∈D. We can think of BD as consisting of expressions like ⌈0110∈D⌉∧¬⌈101∈D⌉ which represent true sentences.
Consider μ a word ensemble and suppose P is an optimal predictor for (D,μ). It would be nice to show P satisfies something like the “coherence condition”
Pk(ϕ)=Pk(ϕ∧ψ)+P(ϕ∧¬ψ)
Obviously we cannot expect the exact equality since P is only defined up to similarity relative to μ. Instead, we were able to show that under certain assumptions on μ, we have that (Pk(ϕ)−Pk(ϕ∧ψ)−P(ϕ∧¬ψ))2 is “negligible on average” in some sense.
Assume a word ensemble ν exists s.t. f+, f− and f0 are pseudo-invertible reductions of (D+,ν), (D−,ν) and (D0,ν) to (BD,μ). Then by Theorem 6.1 of [1], f−1+(P) is an optimal predictor for (D+,ν), f−1−(P) is an optimal predictor for (D−,ν) and f−10(P) is an optimal predictor for (D0,ν). Since D0=D+⊔D−, Theorems 5.1 and 4.2 imply that f−10(P)ν∼f−1+(P)+f−1−(P). That is, Eνk[(Pk(ϕ)−Pk(ϕ∧ψ)−P(ϕ∧¬ψ))2] is negligible.
Another approach is considering a language D equipped with a binary operation m:{0,1}∗×{0,1}∗→{0,1}∗ such that χD(m(x,y))=χD(x)χD(y). If μ is a word ensemble and P is an optimal predictor for (D,μ) then Pk(m(x,y)) can be interpreted as the probability of ⌈x∈D⌉∧⌈y∈D⌉. The probability of ⌈x∈D⌉∧¬⌈y∈D⌉ can be taken to be η(Pk(m(x,y))−Pk(y)). Continuing in this manner, it is possible to assign probabilities to all propositional formulae of the sort.
In this case, the coherence condition would hold automatically except for the presence of η. Define
D+={(x,y)∣m(x,y)∈D}D0={(x,y)∣x∈D}
We can now apply the same method as above combined with Lemma 4.3 (D+⊆D0) to show approximate coherence.
It would be interesting to construct concrete examples in which these results are applicable.
[1] “A complexity theoretic approach to logical uncertainty (Draft)”
Optimal predictors and propositional calculus
This is a writeup of the stuff Benja and I thought up during the logical uncertainty workshop in May.
An optimal predictor for (D,μ) allows assigning probabilities to logical sentences of the form x∈D but in general it doesn’t allow assigning probabilities to propositional formulae constructed from sentences of that form. Here we outline two approaches to defining such probabilities.
Consider D a language. Define BD⊆{0,1}∗ to be the set of words encoding propositional formulae whose variables are labeled by elements of {0,1}∗ such that the formula evaluates to truth if a variable labeled by x is substituted for the truth value of x∈D. We can think of BD as consisting of expressions like ⌈0110∈D⌉∧¬⌈101∈D⌉ which represent true sentences.
Consider μ a word ensemble and suppose P is an optimal predictor for (D,μ). It would be nice to show P satisfies something like the “coherence condition”
Pk(ϕ)=Pk(ϕ∧ψ)+P(ϕ∧¬ψ)
Obviously we cannot expect the exact equality since P is only defined up to similarity relative to μ. Instead, we were able to show that under certain assumptions on μ, we have that (Pk(ϕ)−Pk(ϕ∧ψ)−P(ϕ∧¬ψ))2 is “negligible on average” in some sense.
To see this define the languages
D+={(ϕ,ψ)∣ϕ∧ψ∈BD} D−={(ϕ,ψ)∣ϕ∧¬ψ∈BD} D0={(ϕ,ψ)∣ϕ∈BD}
All three languages have reductions to BD:
f+(ϕ,ψ):=ϕ∧ψ f−(ϕ,ψ):=ϕ∧¬ψ f0(ϕ,ψ):=ϕ
Assume a word ensemble ν exists s.t. f+, f− and f0 are pseudo-invertible reductions of (D+,ν), (D−,ν) and (D0,ν) to (BD,μ). Then by Theorem 6.1 of [1], f−1+(P) is an optimal predictor for (D+,ν), f−1−(P) is an optimal predictor for (D−,ν) and f−10(P) is an optimal predictor for (D0,ν). Since D0=D+⊔D−, Theorems 5.1 and 4.2 imply that f−10(P)ν∼f−1+(P)+f−1−(P). That is, Eνk[(Pk(ϕ)−Pk(ϕ∧ψ)−P(ϕ∧¬ψ))2] is negligible.
Another approach is considering a language D equipped with a binary operation m:{0,1}∗×{0,1}∗→{0,1}∗ such that χD(m(x,y))=χD(x)χD(y). If μ is a word ensemble and P is an optimal predictor for (D,μ) then Pk(m(x,y)) can be interpreted as the probability of ⌈x∈D⌉∧⌈y∈D⌉. The probability of ⌈x∈D⌉∧¬⌈y∈D⌉ can be taken to be η(Pk(m(x,y))−Pk(y)). Continuing in this manner, it is possible to assign probabilities to all propositional formulae of the sort.
In this case, the coherence condition would hold automatically except for the presence of η. Define
D+={(x,y)∣m(x,y)∈D} D0={(x,y)∣x∈D}
We can now apply the same method as above combined with Lemma 4.3 (D+⊆D0) to show approximate coherence.
It would be interesting to construct concrete examples in which these results are applicable.
[1] “A complexity theoretic approach to logical uncertainty (Draft)”