Human Agency in a Superintelligent World
Superintelligence doesn’t make human decisions unnecessary, any more than the laws of physics make them unnecessary; these are two instances of exactly the same free will vs. determinism puzzle. When something knows or carries out your actions, as the physical world does (even if that is the only way in which your actions are ever carried out), that by itself doesn’t take away your agency over those actions. Agency requires influence over actions, but it isn’t automatically lost when something else gains influence over them, has foreknowledge of what they are going to be, or carries them out on your behalf, perhaps without your knowledge; such circumstances are compatible with retaining your own influence over those actions.
Path Dependence
Humans are more agentic than the physical world. It’s easy to tell whether you are in control of your own physical body (if it’s destroyed, you are no longer in control), but if you are confused and gullible, other humans may have more influence over some of your actions than you do. And if you are a human living in a superintelligent world that isn’t keeping you whole, you are not necessarily even yourself anymore. Defining clearly what it means to be yourself becomes crucial in that context, or else we couldn’t ask whether you yourself retain agency over your decisions that take place in such a world (or over your own values, if they are to retain some influence).
The code of a (pure) computer program perfectly screens off the world from the process of computation that follows the program. Nothing can influence what the code does that isn’t already given by the code; only the process of computation itself can decide to take some consideration into account for how to continue the computation. The world must follow all decisions of the program when computing it, or else it’s computing something else. It can’t alter how the computation proceeds without thereby destroying the legitimacy of the process of carrying out what the program says.
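As a loose illustration of this screening off (my own sketch, not part of the original argument), consider a pure function: its code and inputs fully determine every step, and a substrate that "intervenes" mid-run is simply no longer computing that program.

```python
# A minimal sketch, assuming nothing beyond standard Python: a pure program whose
# code alone fixes every step of the computation that follows it.

def pure_program(n: int) -> int:
    """Sum of squares below n; each intermediate step is determined by this code alone."""
    total = 0
    for i in range(n):
        total += i * i  # no outside consideration can change what this step does
    return total

# Any two substrates running this program must agree on the result;
# a substrate that alters a step is no longer legitimately computing pure_program.
assert pure_program(10) == 285
```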
A human is very path dependent: different events and influences would lead the same person down very different paths, and influences from a superintelligence might be able to alter that person on a fundamental level. To rescue the analogy, consider all hypothetical histories (arbitrarily detailed life stories) of a person, for all possible influences and interactions. A person determines this collection of hypothetical histories, and while the outcomes (later events) of these histories are path dependent (they depend on what else is going on there, not just on the person), the collection of all these histories taken together isn’t as a whole path dependent; it doesn’t depend on what would actually happen.
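One way to gesture at this construction in code (a toy sketch with hypothetical names, not anything from the post): model the person as a response function, each hypothetical history as the trajectory under one possible sequence of influences, and the collection as the map over all such sequences.

```python
# A toy sketch: individual histories are path dependent (they depend on the
# influences received), but the collection over all possible influence sequences
# is a function of the person alone.

from itertools import product
from typing import Callable, Tuple

Influence = str
State = Tuple[str, ...]

def person(state: State, influence: Influence) -> State:
    """Stand-in for how this particular person responds to an influence."""
    return state + (f"responded to {influence}",)

def history(respond: Callable[[State, Influence], State],
            influences: Tuple[Influence, ...]) -> State:
    """One hypothetical history: later events depend on the influences received."""
    state: State = ()
    for inf in influences:
        state = respond(state, inf)
    return state

def all_histories(respond, possible_influences, length):
    """The whole collection: determined by the person, not by what actually happens."""
    return {infs: history(respond, infs)
            for infs in product(possible_influences, repeat=length)}

collection = all_histories(person, ("kind advice", "subtle manipulation"), 2)
# No single actual path is privileged; the collection already contains them all.
```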
Legitimate Decisions
In a hypothetical history where a human brain is rewritten into something else, the decisions of the resulting brain are not legitimate decisions of the original human. Thus, we can look over the hypothetical histories and pick out the ones that avoid such things. Perhaps some of these histories contain no superintelligent AIs or supercompetent swindlers at all; the hypothetical worlds where they take place are devoid of such. Or this human’s ability to think improves mildly, in centrally benign and nonintrusive ways, with tools and options for getting better at figuring out what to think or what to do. In these histories, decisions largely remain under the control of that human’s own agency (in ways dependent on the history), and considering all such histories together rather than individually makes the resulting collection not itself path dependent.
What happens in such non-pathological hypothetical histories, taken together, could serve as ground truth for what kinds of decisions that human would legitimately take. Legitimacy of histories, the options available in them, and aggregation of developments across different histories are themselves subject to interpretation, which should ultimately be routed back to decisions made by the human from within the histories themselves, giving some sort of fixpoint. However this is constructed in detail, the claim is that it gives a much more robust and objective grounding for what counts as legitimate decisions and values-on-reflection of a given human than merely imagining what that human might end up actually asking for when faced with a superintelligent world directly.
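The fixpoint idea can be sketched very roughly (hypothetical data and names, purely illustrative, under the assumption that legitimacy verdicts come only from histories currently judged legitimate): iterate until the set of legitimate histories stops changing.

```python
# A minimal fixpoint sketch (illustrative only): legitimacy judgments are refined
# by verdicts issued from within the histories currently judged legitimate,
# until the set of legitimate histories stabilizes.

histories = {
    "quiet_study":   {"verdict_on": set()},              # flags nothing
    "good_tools":    {"verdict_on": {"brain_rewrite"}},   # flags the rewrite as illegitimate
    "brain_rewrite": {"verdict_on": set()},               # a pathological history
}

def legitimacy_fixpoint(histories):
    legitimate = set(histories)        # provisionally admit everything
    while True:
        # Collect verdicts only from histories currently considered legitimate.
        flagged = set().union(*(histories[h]["verdict_on"] for h in legitimate))
        updated = {h for h in legitimate if h not in flagged}
        if updated == legitimate:      # nothing changed: a fixpoint is reached
            return legitimate
        legitimate = updated

print(legitimacy_fixpoint(histories))  # {'quiet_study', 'good_tools'}
```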
Superintelligence is Unable to Help
Hypothetical histories screen off superintelligent influence (in the outside world) from legitimate decisions. As such, superintelligence can’t influence them (without breaking the legitimacy of a hypothetical history), but it also can’t help the human arrive at them (if the human doesn’t reach out for such help within the hypotheticals, and some of the hypotheticals lack the option). Any substantive decisions would still need to be resolved the hard way.
In this sense humans can’t become unnecessary for figuring out what they would decide, as a process that doesn’t actually consult how humans would decide isn’t legitimately following their decisions, and too much superintelligent help with such decisions breaks the legitimacy of the process (or introduces path dependence, so that the decisions can no longer be attributed primarily to specific people rather than to other factors). Superintelligence may ignore humans, just as the physical world may send a giant asteroid, but that’s no subtle and inevitable obsolescence; it’s not a loss of agency inherent in the disparity of optimization power.
A human who retains influence in a superintelligent world still retains it in a normal way, even if from within hypothetical worlds merely imagined collectively by said superintelligence. The presence of a superintelligence in actuality doesn’t make it incoherent to talk about the decisions and values of an initially weaker human, doesn’t make the substantive work of arriving at those decisions and values any less that human’s own work, and doesn’t make the human’s carrying out of that work any less necessary for deciding what those outcomes are. Some of these decisions might even ask for the human to retain a form that’s not merely a figment of superintelligent imagination, or to manifest some other form later, once it’s clearer what it should be.
(This is another attempt at a post from last year, on compatibilism/requiredism within a superintelligent substrate that preserves humans as mostly autonomously self-determined mesa-optimizers who don’t get optimized away by the outer agent and have the opportunity to grow up on their own terms.)
I suspect that you miss the point. The superintelligence would likely undermine humans’ free will in a different manner: by humans deferring to the AI instead of developing and keeping skills. If some aspects of free will, like being able to control one’s short-term urges, are skills, then they will end up underdeveloped.
I’m attempting to reply to the claim that it’s natural for humans to become unnecessary (for arranging their own influence) in a world that keeps them around. The free will analogy between physics and superintelligence illustrates that human decisions can still be formulated and expressed, and the collection-of-hypotheticals construction shows that such decisions are also by themselves sufficient to uplift humans towards a greater ability to wield their extrapolated volition (taking the place of more value-centric CEV-like things in this role), with superintelligence not even being in the way of this process by default. See also the previous post on this; a convergent misunderstanding in its comments is what I’m addressing here with the collection-of-legitimate-hypotheticals construction.
I’m not sure why this is falling flat; for some reason this post is even more ignored than the previous one. Possibly the inferential distance is too long and it just sounds like random words, or the construction seems arbitrary/unmotivated, like giant cheesecakes the size of cities that a superintelligence would have the power to build, where the motivation to build that in particular isn’t being argued. Perhaps opaque designs of a superintelligence are seen as obviously omnipotent, even in the face of philosophical conundrums like free will, so that if it wants something to go well, then it obviously will.
But then there are worries in the vicinity of Bostrom’s Deep Utopia about how specifically the loss of necessity of human agency plays out. The collection-of-hypotheticals construction is one answer to that: the necessity of human agency just doesn’t get lost by default (if humanity ends up centrally non-extinct, perhaps in a world of permanent disempowerment). This answer might be too unapologetically transhumanist for most readers (here superintelligent imagination is the substrate for humanity’s existence, without necessarily any concrete existence at all). It also somewhat relies on grokking a kind of computational compatibilism relevant for decision theory around embedded agency, where decisions develop over logical time, with people/agents that could exist primarily in the form of abstract computations, expressed in their acausal influence on whatever substrate would listen to their developing hypothetical decisions (so the substrate doesn’t even necessarily have access to the exact algorithms; it just needs to follow some of the behaviors of the computations, like an LLM that understands computers in the usual way LLMs understand things).