Hey, I basically agree with the premise, and think that by these lights superintelligences exist already: governments and corporations are proto-“meta-metacognitive” agents which often feature significant cognitive energy invested into structuring and incentivising the metacognitive agents within their hierarchy. Similarly, Richard Ngo has a thing about scale-free intelligent agency, and I had some musings about this when I thought about how superintelligence would actually manifest in a world with physical limits (hint: it has to delegate to subagents, lest it be bottlenecked by only being able to focus on one thing at a time).
by these lights superintelligences exist already: governments and corporations are proto-“meta-metacognitive” agents which often feature significant cognitive energy invested into structuring and incentivising the metacognitive agents within their hierarchy.
That makes sense — in my terms, I call a human-corporation composite a third-order cognition being since: “there is a stronger case for identity coupling [than human-economy], second-order irreconcilability is satisfied, and systems are bidirectionally integrated.”
In contrast to the human-SI case, I do call out that: “it fails the metaphysical conditions to be a substance being — particularly normative closure and homeostatic unity. It is valid for a human and a corporation to be misaligned (a human should optimise for work-life balance for their own health), whereas misalignment in a human is objectively pathological — depression and suicide are objectively suboptimal in terms of staying alive and thriving over time.” — and so it doesn’t get the stricter designation of a “third-order cognition substance being”.
Thank you for sharing! I went to read it before replying.
Ngo calls out:
… the two best candidate theories of intelligent agency that we currently have (expected utility maximization and active inference), explain[s] why neither of them is fully satisfactory, and outline[s] how we might do better.
Could my third-order cognition model be a solution? In his case, expected utility maximisation is hard to reconcile (it has to unify goals and beliefs) — for tightly bound third-order cognition I describe agency permeability, capturing how influence over the global action policy flows between subsystems, which relates to this idea of predictive utility maximisation (third-order) dovetailing with stated preferences (second-order).
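To make that concrete, here is a toy sketch — entirely my own formalisation rather than anything from either post — that treats agency permeability as a coupling weight setting how much each subsystem’s local utilities flow into the shared global action policy:

```python
import numpy as np

# Toy sketch (hypothetical formalisation): two subsystems score the same action
# space, and "agency permeability" is the coupling weight that determines how
# much each subsystem's local preferences flow into the global action policy.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_policy(local_utilities, permeability):
    """Blend per-subsystem utilities into one global action policy.

    local_utilities: (n_subsystems, n_actions) array of local scores.
    permeability:    (n_subsystems,) weights — how much each subsystem's
                     preferences permeate the global policy.
    """
    weights = permeability / permeability.sum()
    blended = weights @ local_utilities   # permeability-weighted sum of utilities
    return softmax(blended)               # global action distribution

# Second-order (stated preferences) vs third-order (predicted utility) subsystems,
# with made-up numbers over three candidate actions.
local = np.array([[2.0, 0.5, 0.1],    # human subsystem: stated preferences
                  [0.2, 1.8, 1.0]])   # corporate subsystem: predicted utility
print(global_policy(local, permeability=np.array([0.7, 0.3])))
```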
His description of:
active inference — prediction of lower layers at increasing levels of abstraction
directly relates to mine of “lower-order irreconcilability [of higher level layers]”.
As a sticking point of active inference, he states:
So what does expected utility maximization have to add to active inference? I think that what active inference is missing is the ability to model strategic interactions between different goals. That is: we know how to talk about EUMs playing games against each other, bargaining against each other, etc. But, based on my (admittedly incomplete) understanding of active inference, we don’t yet know how to talk about goals doing so within a single active inference agent.
Why does that matter? One reason: the biggest obstacle to a goal being achieved is often other conflicting goals. So any goal capable of learning from experience will naturally develop strategies for avoiding or winning conflicts with other goals—which, indeed, seems to happen in human minds.
More generally, any theory of intelligent agency needs to model internal conflict in order to be scale-free. By a scale-free theory I mean one which applies at many different levels of abstraction, remaining true even when you “zoom in” or “zoom out”. I see so many similarities in how intelligent agency works at different scales (on the level of human subagents, human individuals, companies, countries, civilizations, etc) that I strongly expect our eventual theory of it to be scale-free.
I deal with this by stating that a metaphysically bound [5 conditions] third-order cognition being exhibits properties including “Homeostatic unity: all subsystems participate in the same self-maintenance goal (e.g. biological survival and personal welfare)”. This provides an overriding goal to defer to — resolving scale-free conflicts.
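As a minimal illustration of that conflict-resolution claim — a hypothetical toy of my own, with made-up sub-goals and costs — competing sub-goals each propose an action, and when they disagree the system defers to the shared self-maintenance objective:

```python
# Toy sketch (hypothetical): sub-goals propose actions; conflicts are resolved by
# deferring to a single shared self-maintenance ("homeostatic unity") objective.

def homeostatic_cost(action):
    # Assumed stand-in for the overriding self-maintenance goal,
    # e.g. biological survival and personal welfare.
    costs = {"rest": 0.1, "work_overtime": 0.9, "exercise": 0.2}
    return costs[action]

subgoal_proposals = {
    "career_subgoal":  "work_overtime",
    "health_subgoal":  "exercise",
    "comfort_subgoal": "rest",
}

# The sub-goals conflict (different proposals), so defer to the homeostatic objective.
if len(set(subgoal_proposals.values())) > 1:
    chosen = min(subgoal_proposals.values(), key=homeostatic_cost)
else:
    chosen = next(iter(subgoal_proposals.values()))

print(chosen)  # -> "rest": the shared self-maintenance goal arbitrates the conflict
```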
He then reasons about how to determine an “incentive compatible decision procedure”, closing with what he sees as the most promising angle:
On a more theoretical level, one tantalizing hint is that the ROSE bargaining solution is also constructed by abandoning the axiom of independence—just as Garrabrant does in his rejection of EUM above. This connection seems worth exploring further.
I hint at the same thing — through optimising second-order identity coupling (specifically operationalisable via self-other overlap), I propose that this improves alignment of the overall being.
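For concreteness: one way self-other overlap can be operationalised is as the similarity between an agent’s internal representations on self-referencing versus other-referencing inputs. The sketch below — with placeholder activation vectors and an assumed cosine-similarity measure, not anything prescribed in my post — shows the kind of quantity I have in mind when I say identity coupling can be optimised:

```python
import numpy as np

# Minimal sketch (assumed operationalisation): measure self-other overlap as the
# average cosine similarity between internal activations on self-referencing
# inputs and on other-referencing inputs. The activation vectors are placeholders.

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def self_other_overlap(self_activations, other_activations):
    """Average pairwise similarity between self- and other-conditioned activations."""
    sims = [cosine_similarity(s, o)
            for s in self_activations
            for o in other_activations]
    return sum(sims) / len(sims)

rng = np.random.default_rng(0)
self_acts  = rng.normal(size=(4, 16))   # activations on "I/me" prompts (placeholder)
other_acts = rng.normal(size=(4, 16))   # activations on "the other agent" prompts

print(self_other_overlap(self_acts, other_acts))  # higher -> tighter identity coupling
```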
Thank you for also sharing your draft post. You state:
It is first necessary to note that the intelligent behaviour of humans is the capstone of a very fragile pyramid, constructed of hearsay, borrowed expertise from other humans, memorised affordances, individual reasoning ability, and faith.
The model I propose adds additional flavour by including non-human beings, which allows us to better model how we ourselves may relate to superintelligence; i.e. I close with “it follows from this post that superintelligence may view us similarly to the way that we view chimpanzees.”
Such a superintelligent person would not be able to orchestrate an automated economy, or manage the traffic lights in a large city, or wage an automated war against collective humanity. It would simply require too many decisions at the same time, all compounding far too quickly.
Agreed, and I think this is our core point of agreement: that there exists a materially different “third-order cognition” that is wholly irreconcilable by our own (second-order) cognition.
So it is likely that some (or even a lot) of the superintelligence’s internal processing will revolve around sending data from subagents/nodes to other subagents/nodes
Exactly! This is a core argument behind my reasoning that highly individualised superintelligence will be the dominant model, which validates the focus on exploring the exact nature of this metaphysical binding.
each subagent, since it is by definition only using a part of the superintelligence’s total compute, cannot possibly be as wise or as coordinated as the superintelligence as a whole, or some other entity would be if given the total compute budget of a superintelligence.
This relates to the callout in Appendix 3, where determining power and control within the frame of reference of a third-order cognition being has some complexity of its own.
This possibility of inner non-coordination or conflict suggests multiple pathways to interacting with superintelligences. It may be possible to interact with subagents and “play them off” against a greater superagent.
One of the more optimistic parts of my post suggests that self-preservation plus superintelligent (altruistic and prosocial) self-work may just resolve this in a beautifully harmonious way.
It may also be that the necessary condition for a superintelligence to exist stably is to find some suitable philosophical solution to the problems of war, scarcity, equity, distribution of resources, and justice which plague human society today.
And I feel you close on this same point!