Computational complexity of AGI design

Summary of main point: I argue that there is a significant probability that creating de novo AGI is an intractable problem. Evolution only solved this problem because of anthropic reasons. Conclusions are drawn regarding priorities in AI risk research.

Sketch of main argument: There are suggestive relations between AGI and NP-completeness. These relations lead me to hypothesize that AGI programs possess large Levin-Kolmogorov complexity, which implies that producing them is a computationally intractable problem. The timing of events in the evolution of human intelligence seems to be consistent with the assumption that evolution's success is anthropic, if we postulate human intelligence as arising from a combination of two modules: an "easy" (low complexity) module and a "hard" (high complexity) module. Therefore, creating superhuman intelligence will require reverse engineering the human brain and will be limited to improving the "easy" module (since creating a better "hard" module is again computationally intractable).

AGI and P vs. NP

There are several arguments that the AGI problem has a similar "flavor" to NP-complete problems.

The first argument is rather vague but IMO still compelling. Many class separations in complexity theory (P vs. NP, L vs. P, R vs. RE) hinge on the existence of a complete language: a single problem such that solving it under the stronger resource constraints would lead to solving all problems in the larger class. Similarly, Gödel incompleteness means there is no single algorithm (a program which terminates on all inputs) for proving all provable theorems. It feels like there is a principle of mathematics which rules out algorithms that are "too good to be true": a single "magic wand" that solves all problems. In a similar way, AGI is a "magic wand": it solves "all" problems, because you can simply delegate them to the AGI.

Another argument has to do with Solomonoff induction. Solomonoff induction is incomputable, but it becomes computable if we set a limit T on the run-time of the "hypotheses" (programs) we consider. However, the resulting computable induction carries an O(T·2^T) slow-down penalty (the time it takes to run all possible hypotheses). On the other hand, the problem is easy modulo #P and tractable given an NP-complete oracle, under certain assumptions on the required probability accuracy.
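
To make the brute-force structure concrete, here is a toy sketch of my own (not a real universal machine): the "hypotheses" are just periodic bit patterns of length at most L, each weighted by 2^(-length) and run under a step budget T. The point is only the shape of the computation: enumerating and running all ~2^L hypotheses is where the exponential penalty comes from.

```python
# Toy illustration (my own, not from the post): time-bounded induction over a
# trivial hypothesis class. Real time-bounded Solomonoff induction enumerates
# programs for a universal machine; here "programs" are periodic bit patterns,
# which keeps the sketch runnable while preserving the shape of the computation:
# ~2^L hypotheses, each run for at most T steps, weighted by 2^(-|p|).

from itertools import product

def run_hypothesis(pattern, n_bits, step_budget):
    """Run a toy 'program' (a periodic bit pattern) to produce n_bits of output,
    giving up if that exceeds the step budget T."""
    if n_bits > step_budget:
        return None  # hypothesis exceeded its run-time limit
    return [pattern[i % len(pattern)] for i in range(n_bits)]

def predict_next_bit(observed, max_len, step_budget):
    """Mixture prediction of the next bit over all hypotheses of length <= max_len
    that reproduce the observed prefix within the step budget."""
    weight = {0: 0.0, 1: 0.0}
    for length in range(1, max_len + 1):
        for pattern in product([0, 1], repeat=length):       # 2^length hypotheses
            out = run_hypothesis(pattern, len(observed) + 1, step_budget)
            if out is not None and out[:-1] == list(observed):
                weight[out[-1]] += 2.0 ** (-length)           # prior weight 2^(-|p|)
    total = weight[0] + weight[1]
    return None if total == 0.0 else weight[1] / total        # P(next bit = 1)

# Prints 0.125: the mixture favors continuing the alternating pattern with a 0.
print(predict_next_bit([1, 0, 1, 0, 1], max_len=6, step_budget=20))
```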

Yet another argument goes through logical uncertainty. The latter is widely suspected to be an important component of AGI and there is a compelling relation between it and P vs. NP.

What does all of it mean? We certainly don't need an NP-oracle to construct an AGI, since humans are "A"GIs and (presumably) there are no NP-oracles in our brains. To shed light on this, it is useful to take the quantitative point of view on AGI. Namely, there is a metric which rates programs according to how "intelligent" they are. From this point of view, an AGI is just a program which ranks high on this metric. The first such metric was suggested by Legg and Hutter, and I improved on their construction by combining it with UDT.
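
For concreteness, the Legg-Hutter metric (roughly, and up to notational details) scores a policy \(\pi\) by its expected total reward over all computable environments, each weighted by its Kolmogorov complexity:

\[
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^{\pi},
\]

where \(E\) is the class of computable (reward-summable) environments, \(K(\mu)\) is the Kolmogorov complexity of \(\mu\), and \(V_\mu^{\pi}\) is the expected cumulative reward of \(\pi\) interacting with \(\mu\).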

This way the AGI design problem becomes an optimization problem: find a program with an intelligence metric as high as possible. The NP-connection now suggests the following conjecture: the AGI optimization problem is of exponential complexity in program length. Of course, we don't necessarily need the best program of a given length, but the impression remains that AGI design is hard in some rigorous complexity-theoretic sense. In particular, I'm guessing there should be a relation between the intelligence (in the precise quantitative sense) of a program and its Levin-Kolmogorov complexity.
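
To see what the conjecture is about, here is what the naive "solution" to the optimization problem looks like (a toy sketch of my own; the intelligence metric is a dummy placeholder, since evaluating a real one is itself expensive or incomputable):

```python
# Toy sketch of AGI-design-as-optimization (my own illustration). Programs are
# represented as bitstrings; 'intelligence' is a placeholder stand-in for a real
# metric such as Legg-Hutter. The point is the search structure: the naive
# optimizer must examine all 2^n candidate programs of length n, which is the
# exponential-in-program-length cost conjectured above.

from itertools import product

def intelligence(program):
    """Dummy placeholder metric (assumption: any cheap stand-in illustrates the
    search structure; a real metric would be vastly more expensive to evaluate)."""
    return sum(program)

def naive_agi_search(length):
    """Exhaustive search for the highest-scoring 'program' of a given length."""
    best, best_score = None, float("-inf")
    for program in product([0, 1], repeat=length):   # 2^length candidates
        s = intelligence(program)
        if s > best_score:
            best, best_score = program, s
    return best, best_score

print(naive_agi_search(10))   # 2^10 = 1024 candidates; the count doubles per bit
```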

The anthropic scenario

If we buy into the conjecture above, a glaring problem appears: if AGI design is so hard, how come evolution succeeded in it? After all, evolution is also a process with bounded computing resources. The only explanation that seems to remain is the anthropic one: evolution’s a priori probability of success was insanely low but in an infinite universe it still succeeds infinitely many times and we observe one of these times for the obvious reason.

This explanation produces probabilistic predictions regarding the timing of events. For example, if there were no cosmological upper bound on when intelligence can appear, we would expect it to appear extremely late. This is not the case in our universe (on a cosmological time scale). However, this is not difficult to explain, since there is a relatively short time window in the lifetime of the universe in which suitable planets orbiting suitable stars exist. In particular, on Earth there won't be trees any more in 0.6 billion years, and there won't be oceans in 1.1 billion years.

As is well known, in scenarios with hard steps that are overcome anthropically, the hard steps are expected to be distributed approximately uniformly on the timeline. This seems to conflict with the most intuitive location of the intelligence hard step: somewhere between chimp and human. However, the apparent discrepancy goes away if we consider a model with two coupled "intelligence modules": an "easy" module E which is susceptible to non-anthropic evolutionary optimization, and a "hard" module H which contains most of the Levin-Kolmogorov complexity and whose appearance is the hard step in question.
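
As a sanity check on the uniform-spacing claim, here is a quick Monte Carlo of my own (toy numbers, not calibrated to evolutionary history): give each sequential hard step an exponential waiting time whose mean greatly exceeds the available window, and keep only the rare runs in which every step completes in time. The surviving completion times land, on average, near the 1/4, 1/2 and 3/4 marks of the window, as they would for ordered uniform samples.

```python
# Monte Carlo sketch (my own, toy parameters): hard steps that are overcome
# anthropically end up roughly uniformly distributed over the available window.

import random

def conditioned_hard_steps(n_steps=3, window=1.0, mean_wait=10.0, trials=1_000_000):
    """Sample completion times of n sequential hard steps (exponential waiting
    times with mean >> window), keeping only runs where all finish in time."""
    accepted = []
    for _ in range(trials):
        t, times = 0.0, []
        for _ in range(n_steps):
            t += random.expovariate(1.0 / mean_wait)   # a 'hard' waiting time
            times.append(t)
        if t <= window:                                # anthropic selection
            accepted.append(times)
    return accepted

runs = conditioned_hard_steps()
if runs:
    k = len(runs[0])
    means = [sum(r[i] for r in runs) / len(runs) for i in range(k)]
    print(len(runs), [round(m, 2) for m in means])     # means ~ 0.25, 0.5, 0.75
```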

Before the hard step, an early version E1 of E co-evolves with a module h which performs a similar function to H but does it much worse (imagine a rough heuristic which works for many cases in a relatively narrow domain). During the hard step, H appears "out of the blue" due to sheer anthropic luck, after which the E1-h "wire" is replaced by an E1-H wire. After the hard step, natural selection proceeds to transform E1 into its final version E2. This picture seems to be consistent with the hard step happening to our chimp-like ancestor, after which natural selection rapidly transformed the result into Homo sapiens sapiens.

This scenario would be undermined if there were an "E-like" property of our ancestors which evolved shortly before the presumed hard step. What could this property be? The best candidate I can think of is the evolution of hands. Apparently, hands evolved about 100 million years ago. The ratio between this number and the remaining 600 million years doesn't seem to be small enough to rule out the anthropic scenario. The argument is made stronger if we take into account that there is an extinction event every 100 million years or so, which means we can't reasonably expect a much larger time difference.
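
To make the rough estimate explicit (my own reading of the argument, under the simplifying assumption that, conditional on it happening at all, the hard step is uniformly distributed over the roughly 700-million-year habitable window following the appearance of hands):

\[
P(\text{hard step within } 100 \text{ Myr of hands}) \approx \frac{100}{100 + 600} \approx 0.14,
\]

which is well short of the kind of small probability that would count as strong evidence against the anthropic scenario.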

Consequences for future of mankind

If AGI is a computationally intractable problem, we won't be able to solve it "fairly" in the near future. However, we can use the existing solution: Homo sapiens sapiens. This means reverse engineering the brain and either modifying it (improving module E) or extracting (emulating) H and writing E from scratch. It is not clear how much intelligence improvement to expect: on the one hand, we're stuck with the current H; on the other hand, E might still have lots of room for improvement (which is intuitively likely). It is not clear whether the unipolar (singleton) or multipolar scenario is more likely. It feels to me that a singleton would require rewriting E, whereas it will be easier to start by tweaking it; therefore multipolar superhuman intelligence will come first.

Reverse engineering and modifying the brain is a project which is likely to require considerable resources and encounter enormous legal barriers. As opposed to de novo AGI, it is difficult to imagine it being accomplished by a small group or any private organization. The most likely scenario seems to be a major government project in the spirit of the Manhattan Project, Apollo, or the LHC. The currently prevailing culture / system of beliefs makes it extremely unlikely for the government of a liberal country to undertake such a project if the technology were available. If this circumstance doesn't change, the first government to try will be an authoritarian one like China's. Such a government will ensure the resulting superhumans have extreme built-in loyalty*, resulting in a world-wide superdictatorship. Therefore, the highest priority seems to be changing the culture in a way that will ensure supportive public opinion for a future friendly superintelligence project. Another high priority is continuing to develop the abstract mathematical theory, to better understand the likelihood of this and other scenarios.

* I am assuming (or hoping) that no government will be stupid enough to try it before brain reverse engineering identifies the “utility function module”

EDIT: The treatment of anthropics in this post is unforgivably oversimplified. I'm hoping to write a UDT-based analysis later. Also, thanks to Mark Friedenbach for pointing out the extremely relevant paper by Shulman and Bostrom.