...the problem of how to choose one’s IBH prior. (If the solution was something like “it’s subjective/arbitrary” that would be pretty unsatisfying from my perspective.)
It seems clear to me that the prior is subjective. Like with Solomonoff induction, I expect there to exist something like the right asymptotic for the prior (i.e. an equivalence class of priors under the equivalence relation where μ and ν are equivalent when there exists some C>0 s.t. μ≤Cν and ν≤Cμ), but not a unique correct prior, just like there is no unique correct UTM. In fact, my arguments about IBH already rely on the asymptotic of the prior to some extent.
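To spell out the equivalence relation and the UTM analogy, here is a sketch in LaTeX. The symbol h (a hypothesis the priors assign weight to), the machines U and V, and the constant c_{UV} are notation introduced here for illustration. The second line is the standard invariance theorem for Kolmogorov complexity, not anything specific to IBH.

```latex
% Two priors are equivalent when each dominates the other up to a constant.
\mu \sim \nu \iff \exists C > 0 \ \forall h:\quad \mu(h) \le C\,\nu(h) \ \text{ and } \ \nu(h) \le C\,\mu(h)

% Analogy: for universal machines U and V there is a constant c_{UV},
% independent of x, such that
K_U(x) \le K_V(x) + c_{UV}
\quad\Longrightarrow\quad
2^{-K_V(x)} \le 2^{c_{UV}} \, 2^{-K_U(x)}
```

Swapping the roles of U and V gives the reverse bound, so the weights 2^{-K_U} and 2^{-K_V} fall in the same equivalence class (with C the larger of the two constants), even though neither machine is singled out as correct.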
One way to view the non-uniqueness of the prior is through an evolutionary perspective: agents with prior X are likely to evolve/flourish in universes sampled from prior X, while agents with prior Y are likely to evolve/flourish in universes sampled from prior Y. No prior is superior across all universes: there’s no free lunch.
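A toy illustration of the no-free-lunch point, under loud assumptions: "flourishing" is stood in for by expected log score (the expected log-probability an agent's prior assigns to the universe it finds itself in), and the two priors over three universes are made-up numbers. This is only a sketch of the symmetry, not a model of IBH agents.

```python
import math

def expected_log_score(world_prior, agent_prior):
    """Expected log-probability that agent_prior assigns to a universe
    sampled from world_prior (higher is better for the agent)."""
    return sum(
        p * math.log(agent_prior[u])
        for u, p in world_prior.items()
        if p > 0
    )

# Hypothetical priors over three toy universes (illustrative numbers only).
prior_x = {"a": 0.7, "b": 0.2, "c": 0.1}
prior_y = {"a": 0.1, "b": 0.2, "c": 0.7}
priors = {"X": prior_x, "Y": prior_y}

# scores[(world, agent)]: how well an agent with prior `agent` does
# in universes sampled from prior `world`.
scores = {
    (world, agent): expected_log_score(priors[world], priors[agent])
    for world in priors
    for agent in priors
}

for (world, agent), s in sorted(scores.items()):
    print(f"world ~ {world}, agent prior {agent}: {s:.3f}")
```

By Gibbs' inequality, the agent whose prior matches the sampling distribution scores best in each row, so neither prior wins in both kinds of universe.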
For the purpose of AI alignment, the solution is some combination of (i) learning the user’s prior and (ii) choosing some intuitively appealing measure of description complexity, e.g. length of lambda-term. ((i) is insufficient by itself because you need some ur-prior in order to learn the user’s prior.) The claim is that different reasonable choices in (ii) will lead to similar results.
Given all that, I’m not sure what’s still unsatisfying. Is there any reason to believe something is missing in this picture?