Really great post! I think I already got John’s idea from his post, but putting everything in perspective and referencing previous and adjacent work really helps!
On that note, you have been mentioning Stuart’s no-indescribable-hellworlds hypothesis for a few posts now, and I’m really interested in it. To take the even more meta argument for its importance, it looks particularly relevant to asking whether the human epistemic perspective is the right one to use when defining ascription universality (which basically abstracts most of Paul’s and Paul-related approaches in terms of desiderata for a supervisor).
Do you know if there has been any work on poking at this hypothesis, and trying to understand what it implies/requires? I doubt we can ever prove it, but there might be a way to do the “computational complexity approach”, where we formally relate it to much more studied and plausible hypotheses.
After thinking a bit more about it, the no-indescribable-hellworlds hypothesis seems somewhat related to logical uncertainty and the logical induction criterion. Intuitively, indescribability comes from complexity issues in reasoning, that is, the lack of logical omniscience about the consequences of our values. The sort of description we would like is a polynomial proof, or at least a polynomial interactive protocol for verifying the indescribability (which means being in PSPACE, as in the original take on debate). And satisfying the logical induction criterion seems a good way to ensure that such a proof will eventually be found, because otherwise we could be exploited forever on our wrong assessment of the hellworld.
The obvious issue with this approach comes from the asymptotic nature of logical induction guarantees, which might mean it takes so long to convince us of/check indescribable hellworlds that they have already come to pass.
I don’t have much to say other than that I agree with the connection. Honestly, thinking of it in those terms makes me pessimistic that it’s true: it seems quite possible that humans, given enough time for philosophical reflection, could point to important value-laden features of worlds/plans which are not in PSPACE.
Glad that you find the connection interesting. That being said, I’m confused by what you say afterwards: why would logical inductors not be able to find propositions about worlds/plans which are outside PSPACE? I find no mention of PSPACE in the paper.
Oh, well, satisfying the logical induction criterion is stronger than just PSPACE. I see debate, and iterated amplification, as attempts to get away with less than full logical induction. See https://www.lesswrong.com/posts/R3HAvMGFNJGXstckQ/relating-hch-and-logical-induction, especially Paul’s comment https://www.lesswrong.com/posts/R3HAvMGFNJGXstckQ/relating-hch-and-logical-induction?commentId=oNPtnwTYcn8GixC59