Adaptation Executors and the Telos Margin

Thank you to Justis Mills for feedback on a draft of this post.

You’ll often hear this bit of wisdom: “Humans are not utility optimizers, but rather adaptation executors.” At first glance, it seems to be pretty self-explanatory. Humans are not effectively described by the optimization of some particular utility function—to the contrary, human behavior is the product of a slew of hot-fix adaptations, most easily understood in terms of how they function.

On a second look, though, there’s a little more here. What’s the difference between these two representations? For any given pattern of behavior, a utility function can be selected that values precise adherence to that exact pattern. On some level, then, an adaptation executor is a utility maximizer—at minimum, we can retrospectively say that its utility function was based on how well it did what the adaptations drove it toward. That’s not very satisfying, though, as there does seem to be a substantial difference. Looking for what sets them apart, the only real candidate is the operative qualifier: that the behavior is effectively described.

To effectively describe some behavior, it’s necessary to describe that behavior as simply and directly as possible. Notice how convoluted that boilerplate utility function is. “Actions that would be produced by behavioral model X” fits the form of a utility function, but it’s certainly not the form that comes to mind when thinking about utility functions in general. Taking this route to transform a pattern of behavior into a utility function is always going to increase the complexity. So while you can theoretically construct a utility function to explain human choices, there very well may not be a utility function which expresses human decisions with substantially less complexity than the behavior produced by a taxonomy of adaptations.
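
To see just how boilerplate that construction is, here’s a minimal sketch in Python. The names and types (`Transcript`, `BehaviorForm`, and so on) are mine, purely for illustration: the utility function awards full marks to a transcript exactly when its actions match what a given behavioral model would have done.

```python
from typing import Callable, List, Tuple

# A transcript records the exchange so far as (observation, action) pairs.
Transcript = List[Tuple[str, str]]

# A behavioral model maps the memory of past exchanges plus the current input to an output.
BehaviorForm = Callable[[Transcript, str], str]


def boilerplate_utility(behavior: BehaviorForm) -> Callable[[Transcript], float]:
    """Build the 'actions that would be produced by behavioral model X' utility."""

    def utility(transcript: Transcript) -> float:
        for step, (observation, action) in enumerate(transcript):
            # Replay the history and check the agent did what `behavior` would have done.
            if action != behavior(transcript[:step], observation):
                return 0.0  # any deviation from the scripted behavior scores nothing
        return 1.0  # perfect adherence to the behavioral pattern

    return utility
```

The returned program is short only because it smuggles the entire behavioral model inside itself; its length is the behavior’s length plus a constant wrapper, which is exactly the complexity problem described above.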

What I propose to call an agent’s telos is the difference between how complex that agent’s behavior is and how complex the simplest utility function corresponding to that behavior is. It’s the degree to which an agent is better expressed by purpose than procedural action. I’m going to formalize that in the next few sections, but if you just want to see how it looks in practice, you can skip to “Back to Reality.”

The Behavior Form

That’s a good start, but it’s not worth much to wave around algorithmic complexity without having phrased the problem using algorithms. So, we’re going to be looking at a computable agent, and a computable environment. Since they’re both programs, they’ll need to interact in discrete steps—the agent sends an output to the environment, and the environment gives it new input, and so on ad infinitum. They’ll both have memory of all past exchanges to draw on, allowing things to get a little more complicated than function composition. In this paradigm, the agent’s behavior form is just the program that links its input and its memory to its output—it’s the way it decides what action to take given its circumstances up to present. The behavior complexity, then, is simply the algorithmic complexity of this procedure—how simply can the way the agent acts be described, moving from stimulus to reaction?
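
As a rough sketch of that setup (all names here are mine, and the exact bookkeeping of who moves first doesn’t matter), the behavior form is the `act` program below, and the interaction is just a loop threading the shared memory through both programs:

```python
from typing import Callable, List, Tuple

# One exchange per step: the input the environment hands over, then the agent's reply.
Transcript = List[Tuple[str, str]]  # (observation, action) pairs, oldest first

# The behavior form: the agent's memory of past exchanges plus its newest input -> its output.
BehaviorForm = Callable[[Transcript, str], str]

# The environment is also just a program with memory of the whole exchange.
Environment = Callable[[Transcript], str]


def run_interaction(act: BehaviorForm, env: Environment, steps: int) -> Transcript:
    """Alternate the two programs (here: for `steps` rounds instead of forever)."""
    history: Transcript = []
    for _ in range(steps):
        observation = env(history)          # the environment produces the agent's next input
        action = act(history, observation)  # the behavior form decides what to do with it
        history.append((observation, action))
    return history


# Toy example: an environment that announces the round number, an agent that echoes it back.
if __name__ == "__main__":
    print(run_interaction(lambda history, obs: obs,
                          lambda history: str(len(history)),
                          steps=3))
```

The behavior complexity is then just the algorithmic complexity of `act`; the loop and the environment don’t count against the agent.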

The Utility Form

What I’ll call the utility form is a shift from looking at the behavior to the preference. What’s the simplest computable utility function that the behavior is already optimizing?

Starting from basic structure, a program filling the role of a utility function would simply take a history of inputs and outputs from an agent and assign it a weight. Basically, it would grade an agent’s choices, as a utility function tends to.

Meanwhile, it’s true that the environment is unknown to the agent, at least beyond the information it gets from its history of interactions. However, by virtue of it being a computable environment, we can make certain assumptions. In particular, from the lab of the mad computer scientist Ray Solomonoff, the concept of algorithmic probability lets us set expectations for any given output on the premise that the environment is a program with random code. We can narrow this down using Bayes’ theorem, in a (non-computable) algorithm called Solomonoff induction. With that approach, an environment with a particular history comes with quite definite expectations about what it will do next, which in turn fixes an optimal course of action for any given utility function.
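
In symbols, using the standard formulation of algorithmic probability rather than anything specific to this post: the prior weight of a history is the total weight of random programs that would produce it on a fixed universal machine $\mathcal{M}$, and Bayes’ theorem turns that into an expectation for what comes next.

$$M(x) \;=\; \sum_{p \,:\, \mathcal{M}(p) \,=\, x\ast} 2^{-\ell(p)}, \qquad\qquad M(y \mid x) \;=\; \frac{M(xy)}{M(x)}$$

Here $x\ast$ means “anything beginning with $x$,” and $\ell(p)$ is the length of the program $p$ in bits.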

So, smashing these two rocks together, we can describe the utility form of an agent as the minimal program which, when used to weight the expectations of various outcomes according to conditional algorithmic probability, makes the actions of the target behavioral program optimal. The utility complexity can then be just the length of this program. It comes down to finding the simplest way of representing an agent as optimizing a utility function.
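
The algorithmic prior itself isn’t computable, but the shape of this definition can be sketched with a small explicit prior standing in for it. Everything below is my own illustrative scaffolding, not machinery from the post: the prior is a finite dictionary of candidate environments, and “optimal” is approximated by comparing the agent against every fixed plan of a given length.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Transcript = List[Tuple[str, str]]               # (observation, action) pairs
BehaviorForm = Callable[[Transcript, str], str]  # memory + newest input -> output
Environment = Callable[[Transcript], str]        # memory -> the agent's next input
Utility = Callable[[Transcript], float]          # grades a finished transcript


def rollout(act: BehaviorForm, env: Environment, horizon: int) -> Transcript:
    """Run a behavior form against one particular environment for `horizon` steps."""
    history: Transcript = []
    for _ in range(horizon):
        observation = env(history)
        history.append((observation, act(history, observation)))
    return history


def plan_rollout(plan: List[str], env: Environment) -> Transcript:
    """Run a fixed, pre-committed sequence of actions against one environment."""
    history: Transcript = []
    for action in plan:
        history.append((env(history), action))
    return history


def expected_utility_of_plan(plan: List[str], prior: Dict[Environment, float],
                             utility: Utility) -> float:
    """Average utility of a fixed plan under the prior over candidate environments."""
    return sum(weight * utility(plan_rollout(plan, env))
               for env, weight in prior.items())


def is_optimizer(act: BehaviorForm, utility: Utility,
                 prior: Dict[Environment, float],
                 alphabet: List[str], horizon: int) -> bool:
    """Check that `act` does at least as well, in expectation, as every fixed plan.

    Both simplifications here (a finite prior instead of the algorithmic prior,
    fixed plans instead of arbitrary policies) are just to keep the check
    brute-forceable; the real condition uses conditional algorithmic probability.
    """
    agent_score = sum(weight * utility(rollout(act, env, horizon))
                      for env, weight in prior.items())
    best_plan_score = max(expected_utility_of_plan(list(plan), prior, utility)
                          for plan in product(alphabet, repeat=horizon))
    return agent_score >= best_plan_score - 1e-9
```

In this sketch, the agent’s utility form would be the shortest `utility` that passes such a check, and its length the utility complexity.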

In reality, most utility forms have more than one behavioral optimizer. A particularly concerning case is the function which always returns 0: extremely simple, and optimized for by all behaviors. This issue can be addressed by including the burden of specification in the utility complexity. That’s the last type of complexity I’ll introduce, and I’ll call it specification complexity—the number of bits, on average, needed to distinguish the behavior form in question from the space of other behavior forms that satisfy a utility form. This is just $-\log_2 P_U(B)$, where $P_U(B)$ is the algorithmic probability of the behavior form $B$ out of only those algorithms that optimize $U$. For example, the always-0 utility function doesn’t constrain algorithmic probability at all, so nearly the agent’s full behavior complexity must be added. By that token, the utility complexity is actually going to be the length of the minimal program + the specification complexity.[1]
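
Putting the pieces together in symbols (the notation is mine, and this is just one way to read that combination):

$$\text{utility complexity} \;=\; \min_{U}\Big[\, \ell(U) \;-\; \log_2 P_U(B) \,\Big],$$

where the minimum runs over computable utility functions $U$ that the behavior form $B$ does optimize, $\ell(U)$ is the length of $U$, and $-\log_2 P_U(B)$ is the specification complexity just defined.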

Properly Introducing Telos

Okay, so we have two ways of representing a given agent: we start by directly coding its behavior, but from there we can also represent it with a computable utility function for which it’s a true optimizer. In both of these forms, we can describe its complexity. However, the trick from before—choosing a utility function that just rewards the behavior established—means that the utility complexity can only exceed the behavior complexity by a constant amount. After all, we can represent this baseline utility function with any particular behavior form hardcoded. There’s no bound on how simple the utility function can be, though; a menagerie of complex conditionals can boil down to the pursuit of a goal that can be expressed in a couple lines.

What I’ve labeled telos is the difference between this baseline, the behavior complexity + C, and the actual utility complexity. As mentioned before, this is a measurement for the extent to which the behavior of an agent is better described by a categorical purpose than a procedure—in other words, how teleological its design is.
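
In symbols, with $C$ the constant cost of the hardcode-the-behavior trick from the previous paragraph:

$$\text{telos} \;=\; \big(\text{behavior complexity} + C\big) \;-\; \text{utility complexity}$$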

Back to Reality

To contrast “high-telos” and “low-telos” agents, let’s run through a couple scenarios.

Suppose we have a hypothetical AI which is constructed to maximize the production of paperclips, as an old tale describes. This is the spitting image of a high-telos agent: its behavior complexity is tremendous, as evidenced by its ability to respond dynamically to whatever challenges it encounters in paperclip-optimization, but its utility complexity is roughly (though not quite) as tiny as a program that checks known paperclips. The difference between these, its telos, is unthinkable.

On the other extreme, a pocket calculator is about as low-telos as they come. If you had to represent it as having a goal, it would be to spit out the appropriate calculation for its input, which is just as complex as its behavior (namely, spitting out the appropriate calculation for its input).
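
As a toy sketch of why (the code is purely illustrative), the most natural goal you could write down for the calculator has to carry around the very same evaluation routine that its behavior consists of, so describing it by its purpose saves nothing:

```python
def calculator_behavior(expression: str) -> str:
    """The behavior form: read an arithmetic expression, return the result."""
    return str(eval(expression))  # eval is a stand-in for a real parser/evaluator


def calculator_utility(expression: str, answer: str) -> float:
    """The 'goal': reward producing the appropriate calculation for the input.

    Specifying this goal means re-implementing essentially the whole behavior,
    so the utility form is no simpler than the behavior form: telos near zero.
    """
    return 1.0 if answer == str(eval(expression)) else 0.0
```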

Somewhere in the middle is a human, and I suspect that’s the significance of the contrast between people as “utility maximizers” and “adaptation executors.” The goals of humans are not categorical, and so even the simplest utility function is not massively better than a collection of evolutionary incidentals. This makes it much more plausible to understand people in terms of behavior, as opposed to goals.

What’s exciting to me is that it’s very likely intelligence and telos are mostly orthogonal properties. Humans are the quintessential example—abstract thinking happens primarily in the neocortex, which is devoted to sensory processing. Both motor control and reward-optimization, the latter being where any telos we have certainly comes from, are completely separate from this center. In other words, it’s feasible for us to imitate only the reasoning portion of our neurology and build low-telos intelligent systems—pure processors of information which don’t meaningfully “want” anything.

  1. ^

    It makes nearly no difference, but this coding of “expected bits needed” is actually also appropriate for behavior complexity and the unspecified utility complexity. Both of these actually correspond to the same form. Notice that this is usually going to be almost exactly the length of the shortest program, though, since adding length decreases the probabilities exponentially. I found it easier to communicate what this operation actually means through the idea of a minimum.

    Tacking the specification complexity on to each utility form, we get:

    $$\text{utility complexity} \;=\; -\log_2\!\left(\sum_{U} 2^{-\ell(U)}\, P_U(B)\right),$$

    where $\ell(U)$ is the length of the utility program $U$. All sums here are just over all individual behavior/utility forms that generate the agent’s actions.

    At a slightly deeper level, that’s what telos is: the difference between the average number of bits needed for behavior + the cost of conversion, and the average number of bits needed for both utility and specification within that utility.