...The prototypical example of a prior based on Turing machines is Solomonoff’s prior. Someone not familiar with the distinction between Turing-complete and Turing-universal might naively think that a prior based on lambda calculus would be equally powerful. It is not so. Solomonoff’s prior guarantees a constant Bayes loss compared to the best computable prior for the job. In contrast, a prior based on lambda calculus can guarantee only a multiplicative loss.
Can you please make this precise?
When I think of “a prior based on lambda calculus”, I imagine something like the following. First, we choose some reasonable complexity measure C on lambda terms, such as:
For a variable x, we define C(x) := 1
For two terms t, s, we define C(ts) := C(t) + C(s)
For a term t and a variable x, we define C(λx.t) := C(t) + 1
Denote the set of lambda-terms by Λ. We then choose β > 0 s.t. ∑_{t∈Λ} e^{−βC(t)} ≤ 1.
Now, we choose some reasonable way to describe lower-semicomputable semimeasures using lambda terms, and make the prior probabilities of different lambda terms proportional to e^{−βC(t)}. It seems to me that the resulting semimeasure dominates every lower-semicomputable semimeasure and is arguably “as good as” the Solomonoff prior. What am I missing?
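To make the construction concrete, here is a minimal sketch of the complexity measure C and the unnormalized prior weights e^{−βC(t)} on a toy lambda-term AST. The class names (`Var`, `App`, `Lam`) and the helper `prior_weight` are my own illustrative choices, not part of the question; the point is just that C is a simple structural recursion.

```python
from dataclasses import dataclass
import math

# Hypothetical minimal AST for untyped lambda terms.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

def complexity(t) -> int:
    """The complexity measure C from the question:
    C(x) = 1, C(t s) = C(t) + C(s), C(λx.t) = C(t) + 1."""
    if isinstance(t, Var):
        return 1
    if isinstance(t, App):
        return complexity(t.fn) + complexity(t.arg)
    if isinstance(t, Lam):
        return complexity(t.body) + 1
    raise TypeError(f"not a lambda term: {t!r}")

def prior_weight(t, beta: float) -> float:
    # Unnormalized prior weight e^{-beta * C(t)}.
    return math.exp(-beta * complexity(t))

# Example: the identity λx.x has C = C(x) + 1 = 2,
# and λx.(x x) has C = (1 + 1) + 1 = 3.
identity = Lam("x", Var("x"))
self_app = Lam("x", App(Var("x"), Var("x")))
print(complexity(identity), complexity(self_app))  # → 2 3
```

Note that this sketch leaves open exactly the point the question glosses over: whether the sum ∑_{t∈Λ} e^{−βC(t)} actually converges depends on how variables are counted (e.g. with infinitely many distinct variable names each of cost 1, some indexing scheme such as de Bruijn indices with length-dependent cost is needed).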