Human Agency in a Superintelligent World
Superintelligence doesn’t make human decisions unnecessary, any more than the laws of physics make them unnecessary; these are two instances of exactly the same free will vs. determinism puzzle. When something knows or carries out your actions, as the physical world does (even if that is the only way in which your actions are ever carried out), that by itself doesn’t take away your agency over those actions. Agency requires influence over actions, but it isn’t automatically lost when something else gains influence over them, has foreknowledge of what they are going to be, or carries them out on your behalf, perhaps without your knowledge; such circumstances are compatible with retaining your own influence over those actions.
Path Dependence
Humans are more agentic than the physical world. It’s easy to tell whether you are in control of your own physical body (if it’s destroyed, you are no longer in control), but if you are confused and gullible, other humans may have more influence over some of your actions than you do. And if you are a human living in a superintelligent world that isn’t keeping you whole, you are not necessarily even yourself anymore. Defining clearly what it means to be yourself becomes crucial in that context, or else we couldn’t ask whether you yourself retain agency over your decisions that take place in such a world (or over your own values, if they are to retain some influence).
The code of a (pure) computer program perfectly screens off the world from the process of computation that follows the program. Nothing can influence what the code does that isn’t already given by the code; only the process of computation itself can decide to take some consideration into account for how to continue the computation. The world must follow all decisions of the program when computing it, or else it’s computing something else. It can’t alter how the computation proceeds without thereby destroying the legitimacy of the process of carrying out what the program says.
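As a loose illustration of this screening off (my own sketch, not part of the original argument), consider a pure function: its code and inputs fully determine every step, and a substrate that "intervenes" mid-run is simply no longer computing that program.

```python
# A minimal sketch, assuming nothing beyond standard Python: a pure program whose
# code alone fixes every step of the computation that follows it.

def pure_program(n: int) -> int:
    """Sum of squares below n; each intermediate step is determined by this code alone."""
    total = 0
    for i in range(n):
        total += i * i  # no outside consideration can change what this step does
    return total

# Any two substrates running this program must agree on the result;
# a substrate that alters a step is no longer legitimately computing pure_program.
assert pure_program(10) == 285
```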
A human is very path dependent: different events and influences would lead the same person down very different paths, and influences from a superintelligence might be able to alter that person on a fundamental level. To rescue the analogy, consider all hypothetical histories (arbitrarily detailed life stories) of a person, for all possible influences and interactions. A person determines this collection of hypothetical histories, and while the outcomes (later events) of these histories are path dependent (they depend on what else is going on there, not just on the person), the collection of all these histories taken together isn’t as a whole path dependent; it doesn’t depend on what would actually happen.
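One way to gesture at this construction in code (a toy sketch with hypothetical names, not anything from the post): model the person as a response function, each hypothetical history as the trajectory under one possible sequence of influences, and the collection as the map over all such sequences.

```python
# A toy sketch: individual histories are path dependent (they depend on the
# influences received), but the collection over all possible influence sequences
# is a function of the person alone.

from itertools import product
from typing import Callable, Tuple

Influence = str
State = Tuple[str, ...]

def person(state: State, influence: Influence) -> State:
    """Stand-in for how this particular person responds to an influence."""
    return state + (f"responded to {influence}",)

def history(respond: Callable[[State, Influence], State],
            influences: Tuple[Influence, ...]) -> State:
    """One hypothetical history: later events depend on the influences received."""
    state: State = ()
    for inf in influences:
        state = respond(state, inf)
    return state

def all_histories(respond, possible_influences, length):
    """The whole collection: determined by the person, not by what actually happens."""
    return {infs: history(respond, infs)
            for infs in product(possible_influences, repeat=length)}

collection = all_histories(person, ("kind advice", "subtle manipulation"), 2)
# No single actual path is privileged; the collection already contains them all.
```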
Legitimate Decisions
In a hypothetical history where a human brain is rewritten into something else, the decisions of the resulting brain are not legitimate decisions of the original human. Thus, we can look over the hypothetical histories and pick out the ones that avoid such things. Perhaps some of these histories contain no superintelligent AIs or supercompetent swindlers at all; the hypothetical worlds where they take place are devoid of such. Or this human’s ability to think improves mildly, in centrally benign and nonintrusive ways, with tools and options for getting better at figuring out what to think or what to do. In these histories, decisions largely remain under the control of that human’s own agency (in ways dependent on the history), and considering all such histories together rather than individually makes the resulting collection not itself path dependent.
What happens in such non-pathological hypothetical histories, taken together, could serve as ground truth for what kinds of decisions that human would legitimately take. Legitimacy of histories, the options available in them, and aggregation of developments across different histories are themselves subject to interpretation, which should ultimately be routed back to decisions made by the human from within the histories themselves, giving some sort of fixpoint. However this is constructed in detail, the claim is that it gives a much more robust and objective grounding for what counts as legitimate decisions and values-on-reflection of a given human than merely imagining what that human might end up actually asking for when faced with a superintelligent world directly.
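The fixpoint idea can be sketched very roughly (hypothetical data and names, purely illustrative, under the assumption that legitimacy verdicts come only from histories currently judged legitimate): iterate until the set of legitimate histories stops changing.

```python
# A minimal fixpoint sketch (illustrative only): legitimacy judgments are refined
# by verdicts issued from within the histories currently judged legitimate,
# until the set of legitimate histories stabilizes.

histories = {
    "quiet_study":   {"verdict_on": set()},              # flags nothing
    "good_tools":    {"verdict_on": {"brain_rewrite"}},   # flags the rewrite as illegitimate
    "brain_rewrite": {"verdict_on": set()},               # a pathological history
}

def legitimacy_fixpoint(histories):
    legitimate = set(histories)        # provisionally admit everything
    while True:
        # Collect verdicts only from histories currently considered legitimate.
        flagged = set().union(*(histories[h]["verdict_on"] for h in legitimate))
        updated = {h for h in legitimate if h not in flagged}
        if updated == legitimate:      # nothing changed: a fixpoint is reached
            return legitimate
        legitimate = updated

print(legitimacy_fixpoint(histories))  # {'quiet_study', 'good_tools'}
```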
Superintelligence is Unable to Help
Hypothetical histories screen off superintelligent influence (in the outside world) from legitimate decisions. As such, superintelligence can’t influence them (without breaking the legitimacy of a hypothetical history), but it also can’t help the human arrive at them (if the human doesn’t reach out for such help within the hypotheticals, and some of the hypotheticals lack the option). Any substantive decisions would still need to be resolved the hard way.
In this sense humans can’t become unnecessary for figuring out what they would decide, as a process that doesn’t actually consult how humans would decide isn’t legitimately following their decisions, and too much superintelligent help with such decisions breaks the legitimacy of the process (or introduces path dependence, so that the decisions can no longer be attributed primarily to specific people rather than to other factors). Superintelligence may ignore humans, just as the physical world may send a giant asteroid, but that’s no subtle and inevitable obsolescence; it’s not a loss of agency inherent in the disparity of optimization power.
A human who retains influence in a superintelligent world still retains it in a normal way, even if from within hypothetical worlds merely imagined collectively by said superintelligence. The presence of a superintelligence in actuality doesn’t make it incoherent to talk about the decisions and values of an initially weaker human, doesn’t make the substantive work of arriving at those decisions and values any less that human’s own work, and doesn’t make the human’s carrying out of that work any less necessary for deciding what those outcomes are. Some of these decisions might even ask for the human to retain a form that’s not merely a figment of superintelligent imagination, or to manifest some other form later, once it’s clearer what it should be.
(This is another attempt at a post from last year, on compatibilism/requiredism within a superintelligent substrate that preserves humans as mostly autonomously self-determined mesa-optimizers who don’t get optimized away by the outer agent and have the opportunity to grow up on their own terms.)
I suspect that you miss the point. The superintelligence would likely undermine humans’ free will in a different manner: by humans deferring to the AI instead of developing and keeping skills. If some aspects of free will, like being able to control one’s short-term urges, are skills, then they will end up underdeveloped.
I’m attempting to reply to the claim that it’s natural for humans to become unnecessary (for arranging their own influence) in a world that keeps them around. The free will analogy between physics and superintelligence illustrates that human decisions can still be formulated and expressed, and the collection-of-hypotheticals construction shows that such decisions are also by themselves sufficient to uplift humans towards a greater ability to wield their extrapolated volition (taking the place of more value-centric CEV-like things in this role), with superintelligence not even being in the way of this process by default. See also the previous post on this; a convergent misunderstanding in its comments is what I’m addressing here with the collection-of-legitimate-hypotheticals construction.
I’m not sure why this is falling flat; for some reason this post is even more ignored than the previous one. Possibly the inferential distance is too long and it just sounds like random words, or the construction seems arbitrary/unmotivated, like giant cheesecakes the size of cities that a superintelligence would have the power to build, where the motivation to build that in particular isn’t being argued. Perhaps opaque designs of a superintelligence are seen as obviously omnipotent, even in the face of philosophical conundrums like free will, so that if it wants something to go well, then it obviously will.
But then there are worries in the vicinity of Bostrom’s Deep Utopia about how specifically the loss of necessity of human agency plays out. The collection-of-hypotheticals construction is one answer to that: the necessity of human agency just doesn’t get lost by default (if humanity ends up centrally non-extinct, perhaps in a world of permanent disempowerment). This answer might be too unapologetically transhumanist for most readers (here superintelligent imagination is the substrate for humanity’s existence, without necessarily any concrete existence at all). It also somewhat relies on grokking a kind of computational compatibilism relevant for decision theory around embedded agency, where decisions develop over logical time, with people/agents that could exist primarily in the form of abstract computations, expressed in their acausal influence on whatever substrate would listen to their developing hypothetical decisions (so the substrate doesn’t even necessarily have access to the exact algorithms; it just needs to follow some of the behaviors of the computations, like an LLM that understands computers in the usual way LLMs understand things).