Anyway, since you keep taking the time to thoroughly reply in good faith, I’ll do my best to clarify and address some of the rest of what you’ve said. However, thanks to the discussion we’ve had so far, a more formal presentation of my ideas is crystallizing in my mind; I prefer to save that for another proper post, since I anticipate it will involve rejigging the terminology again, and I don’t want to muddy the waters further!
Looks like I forgot about this discussion! Did you post a more formal treatment?
I don’t know how you so misread what I said; I explicitly wrote that aliefs constitute the larger logic, so that beliefs are contained in aliefs (which I’m pretty sure is what you were going for!) and not vice versa. Maybe you got confused because I put beliefs first in this description, or because I described the smaller of the two logics as the “reasoning engine” (for the reason I subsequently provided)? What you said almost convinced me that our definitions actually align, until I reached the point where you said that that beliefs could be “more complicated” than aliefs, which made me unsure.
Sorry for the confusion here! I haven’t re-oriented myself to the whole context, but it sounds like I did invent a big disagreement that didn’t exist. This has to do with my continued confusion about your approach. But in retrospect I do think your early accusation that I was insisting on some rigid assumptions holds water; I needed to go a bit further afield to try and interpret what you were getting at.
Whether or not I manage to convince you that the Löbstacle doesn’t exist (because it’s founded on an untenable definition of trust), you have to acknowledge that the argument as presented there doesn’t address the following second problem.
Again, I haven’t yet understood your approach or even re-read the whole conversation here, but it now seems to me that I was doing something wrong and silly by insisting on a definition of trust that forces the Löbstacle. The original paper is careful to only state that Löb naively seems to present an obstacle, not that it really truly does so. It looks to me like I was repeatedly stubborn on this point in an unproductive way.
I have read almost all of this dialog, and my half-serious upshot is:
An agent A can’t prove that another agent B is correct in both its reasoning as well as semantics, but that doesn’t matter because it can’t trust its own reasoning to that degree either.
This glosses over a lot of details in the long and charitable comment thread above. I tried to get an overview of it with ChatGPT. I’m surprised how well that worked:
ChatGPT 5.2 extended thinking summary of the misunderstanding,
1) Shared formal background (what they both accept)
Let:
L be a recursively axiomatized theory strong enough for Löb-style self-reference.
ProvL(⌜φ⌝) be the standard provability predicate; write □Lφ:=ProvL(⌜φ⌝).
U be an “environment/world”.
S be a semantic interpretation mapping formulas of L to propositions about U. Morgan describes this as a semantic map S:Form(L)→C(U).
TrueS(φ) abbreviate “S(φ) holds in UUU”.
A natural “soundness schema relative to S” is:
Sound(L,S) := ∀φ (□Lφ→TrueS(φ)).
The Löbian obstacle setup (as Morgan summarizes it) is that a designer agent A wants to rely on proofs produced by a subordinate B, and this seems to demand something like a schema □Lφ→φ (or its intended-world analogue) for arbitrary φ, which is blocked by Löb-ish reasoning.
So far: aligned.
2) Where the models diverge: what counts as “solving” the obstacle
Demski treats “escape the obstacle” as: produce an agent design whose decision procedure can rationally delegate mission-critical tasks under something like the formal constraints studied in the Löbstacle literature.
A clean formalization of what he keeps asking for is:
There is an action rule πA such that A will take some action aaa after receiving some certificate from B,
and A’s normative warrant for that reliance is itself captured inside the agent’s formal reasoning (or at least specified as an explicit part of the algorithm), not as an unanalyzed extra assumption.
Hence his repeated demand: “what modification of the agent designs do you have in mind?” and later: “I still don’t understand how you propose for the agent to reason/behave differently.”
Morgan’s post is primarily: diagnose the source of the apparent paradox as a category error—treating a semantic claim (“my proofs track the world”) as if it were a syntactic theorem inside LLL.
His move is to relocate Sound(L,S) out of LLL and into a separate layer of “metalogical belief” that lives at the interface between proofs and world-interpretation. He says the agent need not (and cannot) prove the semantic mapping is sound; rather, the agent assumes/believes it when applying proofs to the world.
This is why he says “the agent doesn’t change its reasoning” (meaning: the proof calculus inside LLL is unchanged).
3) The first misunderstanding: “metalogical beliefs” as axioms vs as an external bridge
Demski hears:
“Allowing for metalogical beliefs immediately dissolves the obstacle.”
and translates it into the most standard formal move:
Build L′:=L+Sound(L,S)
Then he points out the familiar reflection hierarchy problem: L′ can at best endorse L, but not itself; so you get “trust weaker but not equal-strength” all over again. He explicitly frames this as “you just get a new system L′… which doesn’t believe its own soundness.” and distinguishes “L is sound” from “L plus me is sound.”
Morgan, meanwhile, is not proposing “add Sound(L,S) as an axiom used inside the proof system that generates actions.” He keeps insisting the semantic map is “defined on L, not on L′,” and that for the delegation story you can restrict attention to accepting L-proofs.
So the mismatch is:
Demski’s formalization: metalogical belief ⇒ explicit axiom(s) in an extended theory that the agent uses for decisions.
Morgan’s formalization: metalogical belief ⇒ an extra-theoretic acceptance bridge “when I see an L-proof, I treat its conclusion as world-true under S”—not something asserted/proved within L.
Formally, Morgan is closer to specifying a two-layer agent:
Object layer: derive □Lφ by proof search.
Bridge layer (not inside L): an acceptance rule Acc(φ) that (defeasibly) upgrades “□Lφ” into “act as if TrueS(φ).”
Demski’s objection is then immediate: if Acc is where the action happens, you have changed the decision architecture; and unless Acc is itself formally grounded, you haven’t solved the original agent-design problem, only renamed it. This is exactly his “back pocket” complaint: “fine, I grant the agent ‘has’ the semantic map; how does it use it?”
4) The second misunderstanding: what “reasoning” denotes
They overload “reasoning” differently.
Morgan uses “reasoning” ≈ “applying L’s inference rules to formulas” and treats semantic application as a separate stage.
Demski uses “reasoning” ≈ “the whole cognitive pipeline that produces decisions,” so if semantics affects what you do, it’s part of the reasoning/agent design.
This is why Demski keeps hearing a contradiction in Morgan’s stance:
“No, the agent doesn’t change its reasoning… The change is that we don’t require A to prove that their semantic mapping is sound…”
Under Demski’s formal lens, “not requiring proof” is already a behavioral modification, because the original Löbstacle agent was (by assumption) constrained to act only when it could justify that constraint internally.
5) The third misunderstanding: “trust B” = trusting proofs vs trusting delegated action
Morgan tries to narrow “trust” to “trusting LLL-proofs”:
B can even be inconsistent internally, as long as it outputs valid LLL-proofs; then AAjust checks the proof and uses it.
Demski pushes back that the Löbstacle was about mission-critical delegation, not merely proof checking; restricting to a weaker fragment can amount to “make the delegate dumber,” which is the classic non-solution.
Formally, Demski is insisting on a property like:
A delegates ⟹ ∃ nontrivial tasks where B’s full cognition matters, not just L-proof output.
Morgan is (at least in that subthread) defending a weaker property:
A trusts B ⟹ A accepts any syntactically valid L-proof produced by B as action-relevant.
Those are different “trust” relations, and each is natural given a different target.
6) The “is/ought” mini-misfire is the same structural issue
Morgan phrases the bridge as normative:
if φ is provable in L, the interpretation of φ “ought to be true”.
Demski hears “ought” as moral/normative in the usual sense and suggests rewriting it as a prediction (“will probably be true”).
But structurally, this is the same dispute:
Morgan is describing Acc: a norm of acceptance/decision (“treat proofs as truth-tracking”).
Demski wants a descriptive link inside the formal model that would license that acceptance without handwaving.
7) The clean formal diagnosis: they are proving different theorems
If you compress all of the above into two “theorems” they are implicitly trying to establish:
Morgan’s implicit claim (interface theorem)
If an agent’s action rule is allowed to use an extra-logical acceptance bridge Acc that is not required to be provable inside L, then Löb’s prohibition “L can’t prove its own soundness schema” does not block the agent from in fact relying on L-proofs in the world (conditional on S being sound).
This is basically: “Löb blocks internal proof of Sound(L,S), not the practical stance of treating proofs as reliable.”
Demski’s implicit claim (agent-design theorem)
If we keep the Löbstacle-style constraint that delegation must be justified within the agent’s own formal apparatus (or equivalently, we forbid a free external Acc oracle), then Morgan’s move either (i) reappears as L′ and triggers the reflection hierarchy, or (ii) reduces to trusting only weaker fragments, which is not the desired kind of delegation.
So: Morgan is dissolving a philosophical inference (“therefore no agent can trust its reasoning”), while Demski is defending a formal obstacle statement about a particular class of reflective agent designs.
Both can be simultaneously right—because the quantifiers differ.
It gets the core point:
the Löbstacle was about mission-critical delegation, not merely proof checking; restricting to a weaker fragment can amount to “make the delegate dumber,” which is the classic non-solution.
It is a non-solution in the sense that it doesn’t let the sub agent B run free because it can be fully trusted. But Morgan’s move does seem to enable a save kind of delegation. So practically, the different approaches come down to:
Rogers: Delegating computation to find checkable proofs
Demski: Delegating open-ended judgment and autonomous action
Clearly, 1 is weaker than 2. But we can’t get 2 anyway, so getting 1 seems like a win.
And maybe we can extend 1 into a full agent by wrapping B into a verifier. And that would nest for repeated delegation.
Looks like I forgot about this discussion! Did you post a more formal treatment?
Sorry for the confusion here! I haven’t re-oriented myself to the whole context, but it sounds like I did invent a big disagreement that didn’t exist. This has to do with my continued confusion about your approach. But in retrospect I do think your early accusation that I was insisting on some rigid assumptions holds water; I needed to go a bit further afield to try and interpret what you were getting at.
Again, I haven’t yet understood your approach or even re-read the whole conversation here, but it now seems to me that I was doing something wrong and silly by insisting on a definition of trust that forces the Löbstacle. The original paper is careful to only state that Löb naively seems to present an obstacle, not that it really truly does so. It looks to me like I was repeatedly stubborn on this point in an unproductive way.
I have read almost all of this dialog, and my half-serious upshot is:
This glosses over a lot of details in the long and charitable comment thread above. I tried to get an overview of it with ChatGPT. I’m surprised how well that worked:
ChatGPT 5.2 extended thinking summary of the misunderstanding,
1) Shared formal background (what they both accept)
Let:
L be a recursively axiomatized theory strong enough for Löb-style self-reference.
ProvL(⌜φ⌝) be the standard provability predicate; write □Lφ:=ProvL(⌜φ⌝).
U be an “environment/world”.
S be a semantic interpretation mapping formulas of L to propositions about U. Morgan describes this as a semantic map S:Form(L)→C(U).
TrueS(φ) abbreviate “S(φ) holds in UUU”.
A natural “soundness schema relative to S” is:
Sound(L,S) := ∀φ (□Lφ→TrueS(φ)).
The Löbian obstacle setup (as Morgan summarizes it) is that a designer agent A wants to rely on proofs produced by a subordinate B, and this seems to demand something like a schema □Lφ→φ (or its intended-world analogue) for arbitrary φ, which is blocked by Löb-ish reasoning.
So far: aligned.
2) Where the models diverge: what counts as “solving” the obstacle
Demski’s implicit “success criterion” (formal delegation)
Demski treats “escape the obstacle” as: produce an agent design whose decision procedure can rationally delegate mission-critical tasks under something like the formal constraints studied in the Löbstacle literature.
A clean formalization of what he keeps asking for is:
There is an action rule πA such that A will take some action aaa after receiving some certificate from B,
and A’s normative warrant for that reliance is itself captured inside the agent’s formal reasoning (or at least specified as an explicit part of the algorithm), not as an unanalyzed extra assumption.
Hence his repeated demand: “what modification of the agent designs do you have in mind?” and later: “I still don’t understand how you propose for the agent to reason/behave differently.”
Morgan’s “success criterion” (epistemic/interface diagnosis)
Morgan’s post is primarily: diagnose the source of the apparent paradox as a category error—treating a semantic claim (“my proofs track the world”) as if it were a syntactic theorem inside LLL.
His move is to relocate Sound(L,S) out of LLL and into a separate layer of “metalogical belief” that lives at the interface between proofs and world-interpretation. He says the agent need not (and cannot) prove the semantic mapping is sound; rather, the agent assumes/believes it when applying proofs to the world.
This is why he says “the agent doesn’t change its reasoning” (meaning: the proof calculus inside LLL is unchanged).
3) The first misunderstanding: “metalogical beliefs” as axioms vs as an external bridge
Demski hears:
and translates it into the most standard formal move:
Build L′:=L+Sound(L,S)
Then he points out the familiar reflection hierarchy problem: L′ can at best endorse L, but not itself; so you get “trust weaker but not equal-strength” all over again. He explicitly frames this as “you just get a new system L′… which doesn’t believe its own soundness.” and distinguishes “L is sound” from “L plus me is sound.”
Morgan, meanwhile, is not proposing “add Sound(L,S) as an axiom used inside the proof system that generates actions.” He keeps insisting the semantic map is “defined on L, not on L′,” and that for the delegation story you can restrict attention to accepting L-proofs.
So the mismatch is:
Demski’s formalization: metalogical belief ⇒ explicit axiom(s) in an extended theory that the agent uses for decisions.
Morgan’s formalization: metalogical belief ⇒ an extra-theoretic acceptance bridge “when I see an L-proof, I treat its conclusion as world-true under S”—not something asserted/proved within L.
Formally, Morgan is closer to specifying a two-layer agent:
Object layer: derive □Lφ by proof search.
Bridge layer (not inside L): an acceptance rule Acc(φ) that (defeasibly) upgrades “□Lφ” into “act as if TrueS(φ).”
Demski’s objection is then immediate: if Acc is where the action happens, you have changed the decision architecture; and unless Acc is itself formally grounded, you haven’t solved the original agent-design problem, only renamed it. This is exactly his “back pocket” complaint: “fine, I grant the agent ‘has’ the semantic map; how does it use it?”
4) The second misunderstanding: what “reasoning” denotes
They overload “reasoning” differently.
Morgan uses “reasoning” ≈ “applying L’s inference rules to formulas” and treats semantic application as a separate stage.
Demski uses “reasoning” ≈ “the whole cognitive pipeline that produces decisions,” so if semantics affects what you do, it’s part of the reasoning/agent design.
This is why Demski keeps hearing a contradiction in Morgan’s stance:
Under Demski’s formal lens, “not requiring proof” is already a behavioral modification, because the original Löbstacle agent was (by assumption) constrained to act only when it could justify that constraint internally.
5) The third misunderstanding: “trust B” = trusting proofs vs trusting delegated action
Morgan tries to narrow “trust” to “trusting LLL-proofs”:
B can even be inconsistent internally, as long as it outputs valid LLL-proofs; then AAjust checks the proof and uses it.
Demski pushes back that the Löbstacle was about mission-critical delegation, not merely proof checking; restricting to a weaker fragment can amount to “make the delegate dumber,” which is the classic non-solution.
Formally, Demski is insisting on a property like:
A delegates ⟹ ∃ nontrivial tasks where B’s full cognition matters, not just L-proof output.
Morgan is (at least in that subthread) defending a weaker property:
A trusts B ⟹ A accepts any syntactically valid L-proof produced by B as action-relevant.
Those are different “trust” relations, and each is natural given a different target.
6) The “is/ought” mini-misfire is the same structural issue
Morgan phrases the bridge as normative:
Demski hears “ought” as moral/normative in the usual sense and suggests rewriting it as a prediction (“will probably be true”).
But structurally, this is the same dispute:
Morgan is describing Acc: a norm of acceptance/decision (“treat proofs as truth-tracking”).
Demski wants a descriptive link inside the formal model that would license that acceptance without handwaving.
7) The clean formal diagnosis: they are proving different theorems
If you compress all of the above into two “theorems” they are implicitly trying to establish:
Morgan’s implicit claim (interface theorem)
This is basically: “Löb blocks internal proof of Sound(L,S), not the practical stance of treating proofs as reliable.”
Demski’s implicit claim (agent-design theorem)
So: Morgan is dissolving a philosophical inference (“therefore no agent can trust its reasoning”), while Demski is defending a formal obstacle statement about a particular class of reflective agent designs.
Both can be simultaneously right—because the quantifiers differ.
It gets the core point:
It is a non-solution in the sense that it doesn’t let the sub agent B run free because it can be fully trusted. But Morgan’s move does seem to enable a save kind of delegation. So practically, the different approaches come down to:
Rogers: Delegating computation to find checkable proofs
Demski: Delegating open-ended judgment and autonomous action
Clearly, 1 is weaker than 2. But we can’t get 2 anyway, so getting 1 seems like a win.
And maybe we can extend 1 into a full agent by wrapping B into a verifier. And that would nest for repeated delegation.