Sometimes written “Loeb’s Theorem” (because umlauts are tricky). This is a theorem about proofs of what is provable and how they interact with what is actually provable in ways that surprise some people.
This math result often comes up when attempting to formalize “an agent” or “a value system” as somehow related to “a set of axioms”.
Often, when making such mental motions, one wants to take multi-agent interactions seriously, and make the game-theoretically provably endorsable actions “towards an axiom system” be somehow contingent on what that other axiom system might or might not be able to game-theoretically provably endorse.
You end up with proofs about proofs about proofs… and then, without care, the formal proof systems themselves might explode or might give agentically incoherent results on certain test cases.
Sometimes, in this research context, the phrase “loebstacle” or “Löbstacle” comes up. This was an area of major focus (and a common study guide pre-requisite) for MIRI from maybe 2011 to 2016?
It became much less important later after the invention/discovery of the Garrabrant Inductor.
As to the math of Löb’s theorem itself...
We trust Peano Arithmetic to correctly capture certain features of the standard model of arithmetic. Furthermore, we know that Peano Arithmetic is expressive enough to talk about itself in meaningful ways. So it would certainly be great if Peano Arithmetic asserted what now is an intuition: that everything it proves is certainly true.
In formal notation, let stand for the standard provability predicate of . Then, is true if and only if there is a proof from the axioms and rules of inference of of . Then what we would like to say is that for every sentence .
But alas, suffers from a problem of self-trust.
Löb’s theorem states that if then . This immediately implies that if is consistent, the sentences are not provable when is false, even though according to our intuitive understanding of the standard model every sentence of this form must be true.
Thus, is incomplete, and fails to prove a particular set of sentences that would increase massively our confidence in it.
Notice that Gödel’s second incompleteness theorem follows immediately from Löb’s theorem, as if is consistent, then by Löb’s , which by the propositional calculus implies .
It is worth remarking that Löb’s theorem does not only apply to the standard provability predicate, but to every predicate satisfying the Hilbert-Bernais derivability conditions.
@Multicore I accidentally deleted your contribution by submitting an edit I started writing before you published yours. I’m letting you add it back so it remains attributed to you. Also, if you can do some relevance voting that would be helpful.
This is one of two ways I know of proving Löb’s theorem, and I find them both illuminating. (The other is to derive it from Gödel’s Second Incompleteness Theorem.)
:, not ′
Something I learnt from Mietek Bak is that Löb’s Theorem is kind of more subtle than this. In provability theory, it’s fine to have a “box” operator that we informally read as “is provable”; but what Löb’s theorem tells us that we can’t simply interpret it literally as “is provable” without difficulties. One should define the “provability” predicate formally, to avoid getting confused (or one should specify that it is simply a formal symbol to which we have not assigned any semantic meaning, although that is somewhat against the point of the angle taken by the parent article); for example, the provability predicate could be defined by a certain first-order formula which unpacks a Gödel number, checks it’s encoding a proof and verifies each step of the encoded proof.
Yeah, that is the formal definition of the standard provability predicate. I’ll add the page explaining that soon enough.