Disentangling Competence and Intelligence

I struggle quite a bit with publishing, so today I am just going to publish a bunch of hopefully relatively refined drafts. Any comments and questions are welcome; I would be glad to elaborate on any area where I appear unclear. If you like, you can also let me know your thoughts on how to format this to be easier to read. This applies to all my future posts.

This post captures most of my current thoughts on deconfusing intelligence.
Note: the two parts overlap somewhat, but I haven’t fully synthesized them yet. Part 1 was included in our AISC publications before, but I think it’s cleaner to post it separately.

Part 1 (comments on conceptual confusion):

The “ability to achieve goals in a wide range of environments” as per Shane Legg and Marcus Hutter (https://arxiv.org/abs/0706.3639) is often used as a definition of general intelligence, but in my opinion it better captures the notion of general competence.

How might the two concepts differ?

Most agents can achieve significantly fewer goals in many environments if their embodiments are changed (say, if their legs were removed), even when retaining the same cognitive capabilities. Is it therefore right to say that an agent has become less intelligent as a consequence of losing its legs?

I believe that including the embodiment/interface of an agent within our measure/concept of its intelligence inflates the concept and leads to confusion.

An example of such confusion would be the notion that companies are already superintelligent agents/entities and thus serve as useful analogies/frameworks for thinking about how to align superintelligent agents in general. While I agree that companies can be more competent than individuals in many domains, they interface with the world very differently, which needs to be accounted for before characterising this difference in competence as purely/mostly a matter of intelligence.

Competence is contextual and compositional; many factors can contribute to it. The sensors that an agent/system possesses, its actuators, processing speed, data storage capacity, and so on, are things that I believe should be understood separately from intelligence. If we study the factors that make up the construct of competence, both as individual factors and in their interaction/relation with each other, we become capable of generating more precise explanations and predictions, like “What happens when we increase the intelligence of agent x by y% in environment z, while keeping all the other factors the same? What about if we just double the processing speed, but change nothing else?”

There is perhaps no ground truth to what intelligence is. If competence can be cleanly divided into, say, eight different contributing factors, we are free to refer to one of them (or a group of them) as intelligence, or to say that we have explained the mystery away. The term is only important insofar as it guides intuition and association, and allows for clear communication. And exactly this is what gets muddied by the conflation: if people think of intelligence as a complex phenomenon that basically fully predicts competence and can hardly be split into smaller pieces, they are inhibited from considering more compositional and therefore more understandable notions of competence.

Here is the way I like thinking about cognition and, in that context, intelligence:

Simply speaking, cognition can be understood as the computational translation process from an agent’s observations to its actions, and sophisticated agents usually include observations about their cognitive architecture, and actions within their cognitive architecture, in that process.

There are three basic levels to how this translation may occur.

  • Level 1 is akin to a lookup table, where there is just a direct connection from observation to action, without any conditional processing in between.

  • Level 2 is an algorithm, a model or composition thereof, that takes in observations and perhaps some extra data like memories or current preferences, and outputs the action. For embedded agents one can argue that, because of the good regulator theorem, the most useful models will often be partial and compositional simulations of the environment. One can create level 1 structures this way, by simply storing the pair of observation and computed action.

  • Level 3 is any algorithm that makes changes to level 2 or level 3 according to some metric of improvement; it is basically the process by which the agent creates and updates models. This constitutes an agent’s ability to learn novel things, primarily by reducing prediction errors.
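
To make the three levels a bit more concrete, here is a minimal toy sketch in Python. The names and the toy domain are my own illustration, not a proposed formalism; the point is only the structural difference between the levels.

```python
# Level 1: a lookup table, a direct observation -> action mapping.
LOOKUP = {"light_is_red": "stop", "light_is_green": "go"}

def level1_act(observation):
    return LOOKUP.get(observation, "wait")

# Level 2: a model that predicts the value of each candidate action for an
# observation, plus a policy that picks the action with the best prediction.
model = {}  # (observation, action) -> predicted value

def level2_act(observation, actions=("stop", "go", "wait")):
    return max(actions, key=lambda a: model.get((observation, a), 0.0))

# Level 3: an algorithm that changes the level 2 structure according to some
# metric of improvement, here by reducing prediction error on observed outcomes.
def level3_update(observation, action, observed_value, lr=0.1):
    predicted = model.get((observation, action), 0.0)
    model[(observation, action)] = predicted + lr * (observed_value - predicted)

# A level 1 structure can be created from level 2 by caching a computed action:
LOOKUP["light_is_yellow"] = level2_act("light_is_yellow")
```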

An agent’s competence in interacting with its environment, given a particular interface (/embodiment) to the environment, is largely determined by level 2. Only by putting agents into novel circumstances can their level 3 capabilities be inferred from their behavior/competence. Most contemporary AIs only have access to levels 1 and 2 during deployment, since they lack the ability to update their weights after training. This is mostly because an inadequate (relative to the environment) level 3 capability can mess up a well-developed level 2 capability.

It should be noted, however, that level 3 is usually just a particular version of level 2, with the main difference being that the output action refers to a structural change in the system’s cognition. Without going into too much detail here, there can be functional equivalence between competence growth through changing one’s environment vs. changing one’s models, meaning that an LLM can functionally possess some limited level 3 capabilities during deployment by generating the context for its future outputs, even though the weights are frozen.

I like to think of level 2 as Understanding and level 3 as Intelligence, though terming them “crystallized intelligence” and “fluid intelligence” may be more intuitive to some readers. Intelligence is the means of acquiring Understanding that an agent does not already have, and it is therefore only indirectly related to competence.
I am not sure how to discuss level 4, which is simply the algorithmic layer updating the level 3 algorithms. This sort of meta-learning is crucial for efficiency and cognitive domain adaptation, but it is also quite exfohazardous.

Part 2 (closer to formalization):

I would measure/define Competence as the ability to achieve goals in an environment. General Competence denotes this for a wide range of environments, in order to make the notion of Competence less contextual to a specific environment. One could have comparative or absolute notions of (General) Competence.

“Achieving goals in an environment” translates to achieving a subset of possible states of a given environment. We usually think of these states as not being the default ones, so that the agent has to take an active role in bringing about the subset.

Taking an active role in bringing about a subset of states of an environment (/a system) means that the agent needs to interface with the environment and introduce changes away from the “default causal trajectory”.
This “interfacing” is done through actuation, meaning that the agent makes changes to the environment through the boundary of its embodiment. One could imagine this actuation interface for every possible state of the environment (including the agent), offering different access to the sub-states of the environment, depending on many factors such as the agent’s location and height.
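
As a rough sketch of how one might start to operationalize this (my own toy framing, with made-up dynamics): an environment has a state, a goal is a subset of states, and the agent counts as competent with respect to the goal if its actuation steers the environment into that subset when the default causal trajectory would not get there.

```python
# Toy sketch: a goal as a subset of environment states, competence as steering
# the environment into that subset when its default trajectory would not reach it.

def env_step(state, action):
    """Made-up dynamics: the state drifts by +1 on its own, actuation adds to that."""
    return state + 1 + action

def rollout(policy, state, steps):
    for _ in range(steps):
        state = env_step(state, policy(state))
        yield state

def achieves_goal(policy, initial_state, goal_states, steps=20):
    reached = any(s in goal_states for s in rollout(policy, initial_state, steps))
    # The "default causal trajectory": what happens if the agent does nothing.
    by_default = any(s in goal_states for s in rollout(lambda s: 0, initial_state, steps))
    return reached and not by_default

# Example: state 100 is never reached in 20 default steps from 0, but a policy
# that actuates with +9 per step (so the state advances by 10) reaches it.
print(achieves_goal(lambda s: 9, initial_state=0, goal_states={100}))  # True
```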

In principle, this actuation interface could be so complete that the agent has direct control over every sub-component(/parameter) of the environment, making the issue of achieving goals trivial: the agent could just copy-paste a desired state through the interface. This is a special case in which no perception is required to be maximally competent with respect to the environment.
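
A toy illustration of that special case (hypothetical classes, just to make the point sharp): with a complete actuation interface, the agent simply writes the goal state, and perception never enters the picture.

```python
# Toy illustration: a "complete" actuation interface with direct write access
# to every parameter of the environment makes goal achievement trivial.

class Environment:
    def __init__(self, state):
        self.state = dict(state)

class CompleteInterface:
    """Direct write access to every sub-component of the environment state."""
    def __init__(self, env):
        self._env = env

    def write(self, desired_state):
        self._env.state = dict(desired_state)  # copy-paste the desired state

env = Environment({"door": "closed", "light": "off"})
goal = {"door": "open", "light": "on"}

CompleteInterface(env).write(goal)  # no perception was needed at any point
assert env.state == goal
```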

Usually, however, perception is required, and it can be seen as part of the interface on the boundary between agent and environment. It provides evidence about the current state of the environment and can potentially inform the agent about which parts of the environment can be altered through actuation, and in what way.

This is relevant for all non-trivial implementations of cognition. I conceive of cognition as the translation between perception and actuation, but with the optional addition of perception about the agent’s internal state and actuation on it—as part of the environment, the agent may also be interfacing with itself. The performance of the cognition is dependent on the “translation algorithm” and the computational substrate that it runs on.

I like to think of “efficient translation software” as Understanding, and I think that, in the embedded agency setting, it generally cashes out as a composition of computational models that capture predictions about relevant sub-states of the environment and map them onto actions.
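
Here is a toy sketch of what such a composition could look like (the domain and the split into models are made up purely for illustration): Understanding as a bundle of small predictive models, each covering a relevant sub-state of the environment, jointly translating an observation into actions.

```python
# Toy sketch: Understanding as a composition of small predictive models,
# each covering one sub-state of the environment and mapping it onto an action.

def door_model(observation):
    """Predicts what will happen at the door and maps that onto an action."""
    if observation.get("door") == "unlocked":
        return {"prediction": "pushing opens the door", "action": "push_door"}
    return {"prediction": "the door stays shut", "action": "find_key"}

def light_model(observation):
    """Predicts the effect of the switch and maps that onto an action."""
    if observation.get("light") == "off":
        return {"prediction": "flipping turns the light on", "action": "flip_switch"}
    return {"prediction": "no change", "action": "do_nothing"}

understanding = [door_model, light_model]  # the composed "translation software"

def cognition(observation):
    """Translate perception into actuation via the composed models."""
    return [m(observation)["action"] for m in understanding]

print(cognition({"door": "unlocked", "light": "off"}))  # ['push_door', 'flip_switch']
```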

Now, Intelligence is the algorithm/mechanism that creates and updates Understanding. It potentially makes the “translation software” more efficient or more effective, or adjusts it to new contexts (which I felt was worth mentioning separately, though it can be mapped onto increased efficiency/effectiveness).
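
Continuing the sketch above (hypothetical names and a deliberately crude update rule): Intelligence would then be an operator that takes the current Understanding plus some experience, proposes a changed translation algorithm, and keeps the change only if it improves whatever metric of improvement is in play.

```python
# Toy sketch: Intelligence as an operator that updates Understanding,
# keeping a proposed change only if it scores better on some metric.

def intelligence(understanding, experiences, score, propose_change):
    """understanding:   callable observation -> action(s)
    experiences:     data to evaluate candidate translation algorithms against
    score:           callable (understanding, experiences) -> float, higher is better
    propose_change:  callable understanding -> candidate understanding"""
    candidate = propose_change(understanding)
    if score(candidate, experiences) > score(understanding, experiences):
        return candidate   # Understanding has been updated
    return understanding   # the change did not help; keep the old models
```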

Taking a step back, it should be apparent that multiple factors interplay to determine an agent’s Competence. If we lower the actuation capability of the agent, it potentially needs to find a “cleverer” way to achieve its goal, or maybe the goal has become unreachable in principle through the new interface. If we turn any of these dials of perception, actuation, cognitive hardware and cognitive software, or the way in which these all connect, we can alter the agent’s (General) Competence.

Intelligence is the mechanism by which a functional translation algorithm is developed over time. If the hardware capacity suddenly doubles, the agent will need Intelligence to make use of that. The same goes for any other turn of the dial. If one eye goes blind, Intelligence will need to adjust the Understanding to account for that.

And yet, Intelligence can be superfluous if all these components are already correctly calibrated and connected, and the environment does not receive any unusual disturbance. In this case, Intelligence would not even register as a factor in the agent’s Competence. It is an indirect contributor.

An agent could self-improve in the absence of Intelligence, if its Understanding already encodes behavior that would lead to self-improvements. This is also why the boundary between Intelligence and Understanding is somewhat ambiguous.

Intelligence is also just a sort of translation algorithm from observation to action, but the environment that it chooses actions within is the cognition of the agent. And in this environment, it can be extremely competent—not least due to its potent interface.