What’s an accidental optimizer?
Imagine a pool of tenure-track professors. They all want to get tenure and they all have some random strategies. If we look at the ones who do get tenure, they happen to have strategies that are good for getting tenure. For instance, they happen to have strategies for producing lots of citations for their publications, or for collaborating with well-known professors in the field. So if we select for professors who are tenured now, we are going to get a set of professors who are good at accumulating citations or collaborating with well-established professors. This can be true even if the professors aren’t explicitly trying to maximize citations or their chances of collaborating with well-known professors.
This is accidental optimizer selection: agents are selected based on some criterion. For instance, professors are selected based on their tenure success, college graduates are selected based on their job market success, etc. When we look at those selected agents, they happen to have strategies that are good for succeeding on that criterion, even if succeeding on it is not the explicit goal of the agent. This phenomenon shows up in AI training, among successful CEOs, among straight-A college students, etc.
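The selection effect described above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the trait scale, the noise level, and the rule that an outcome equals trait plus luck are hypothetical, and no agent in the model is trying to do anything.

```python
import random
from statistics import mean

def simulate_selection(n_agents=1000, n_selected=100, seed=0):
    """Select the top scorers from a pool of agents with random traits.

    Hypothetical model: each agent has a random 'citation' trait in [0, 1],
    and its outcome on the criterion is that trait plus Gaussian luck.
    Agents never choose or adjust their trait.
    """
    rng = random.Random(seed)
    agents = []
    for _ in range(n_agents):
        trait = rng.random()                   # randomly assigned strategy
        outcome = trait + rng.gauss(0, 0.3)    # criterion score = trait + luck
        agents.append((trait, outcome))

    # Selection step: keep only the agents that scored best on the criterion.
    selected = sorted(agents, key=lambda a: a[1], reverse=True)[:n_selected]

    population_mean = mean(trait for trait, _ in agents)
    selected_mean = mean(trait for trait, _ in selected)
    return population_mean, selected_mean

population_mean, selected_mean = simulate_selection()
print(f"mean trait, whole pool: {population_mean:.2f}")
print(f"mean trait, selected:   {selected_mean:.2f}")
```

In this sketch the selected group ends up with a higher average trait than the pool, even though the only thing that "optimizes" is the selection step itself.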
In this post, I will discuss what accidental optimizers are, how to spot one in the wild, and why it matters to identify them.
Accidental optimizers vs. not accidental optimizers
Accidental optimizers can be separated from non-accidental optimizers by answering two questions:
Do agents intend to optimize?
Do agents in fact optimize?
Not all agents who intend to optimize will in fact optimize; on the other hand, not all agents who in fact optimize deliberately intend to optimize. From the answers to the two questions, we get a 2 by 2 matrix:
| In fact optimize? \ Intend to optimize? | No: Accidental | Yes: Deliberate |
|---|---|---|
| Optimize | Accidental optimizer | Deliberate optimizer |
| Not optimize | Accidental non-optimizer | Deliberate non-optimizer |
Let’s break down each scenario with specific examples:
Accidental optimizers. They happen to have strategies that optimize some criterion. For instance, we select for college graduates who got good jobs after their graduation. We may end up having graduates who happen to have strategies for getting good jobs, like majoring in computer science. They might have picked the major without the job market in mind.
Deliberate optimizers. They know precisely what strategies optimize the criterion. For instance, Elon Musk and his companies have been very successful in several different industries/fields, which suggests he deliberately trained or developed strategies that would generalize well.
Accidental non-optimizers. They neither intend to optimize nor in fact optimize. For instance, a pair of headphones that is turned off, or a rock sitting still. In general, do-nothing cases fall into this category.
Deliberate non-optimizers. They intend to optimize but do so ineffectively. For instance, some college students pursue a business degree in order to make more money after graduation, but end up making far less money than the people who got a CS degree.
Why does it matter?
Why do we care about accidental optimizers? We might, for example, be looking for a CEO to learn from or to hire for a new company; if a CEO has accidentally stumbled into the right strategies, those strategies are less likely to generalize.
Deliberately optimizing a criterion is more likely to produce generalizable strategies than accidentally optimizing it. “No human being with the deliberate goal of maximizing their alleles’ inclusive genetic fitness, would ever eat a cookie unless they were starving.” To obtain such generalizable strategies, we prefer deliberate optimizers, who have a clear intention to optimize something and know precisely what strategies optimize that thing.
In the meantime, we want to be able to separate accidental optimizers from deliberate optimizers in order to help identify generalizable strategies.
How to spot an accidental optimizer?
Imagine that we have a whole bunch of AIs with different random parameters. They are all playing the video game Fortnite in the training process. If we look at the ones that survive the training process, much as in natural selection, they happen to have a set of strategies for winning the game. Now let’s have the AIs that came out of Fortnite play a different video game, for instance World of Warcraft. Some of them might fail to survive. When that happens, we argue that they accidentally landed on a strategy that was adaptive within the Fortnite ruleset. So they are “accidental optimizers”, not “deliberate optimizers”.
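The transfer test in this thought experiment can be sketched as a toy model. All names and rules here are hypothetical: each agent is a fixed policy with one randomly chosen move per game, a game is "survived" only when that move matches the game's winning move, and `game_a` and `game_b` merely stand in for two different games.

```python
import random

N_MOVES = 5  # hypothetical number of possible moves in each game

def random_agent(rng):
    # A fixed policy: one random move per game, never adapted to the rules.
    return {"game_a": rng.randrange(N_MOVES), "game_b": rng.randrange(N_MOVES)}

def survives(agent, game, winning_move):
    # Toy survival rule: the agent's fixed move must equal the winning move.
    return agent[game] == winning_move

def transfer_rates(n_agents=10_000, seed=0):
    rng = random.Random(seed)
    winning = {"game_a": rng.randrange(N_MOVES), "game_b": rng.randrange(N_MOVES)}
    agents = [random_agent(rng) for _ in range(n_agents)]

    # Select the agents that survive the first game.
    survivors_a = [a for a in agents if survives(a, "game_a", winning["game_a"])]
    # Test those survivors in a second game with different rules.
    survivors_both = [a for a in survivors_a
                      if survives(a, "game_b", winning["game_b"])]

    rate_a = len(survivors_a) / n_agents
    rate_b_given_a = len(survivors_both) / len(survivors_a)
    return rate_a, rate_b_given_a

rate_a, rate_b_given_a = transfer_rates()
print(f"survive game A:                 {rate_a:.2f}")
print(f"survive game B, given A passed: {rate_b_given_a:.2f}")
```

Under these assumptions, survivors of the first game do no better than chance in the second one, since their success came from a lucky match rather than from adapting to the rules. An agent that worked out each game's winning move would pass both, which is the kind of evidence of generalization the test looks for.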
Another example: Suppose we have a pool of CEOs who have managed one or more companies successfully. We randomly select a set of CEOs who have (or happen to have) strategies to succeed in their own company. If we want to know whether those selected CEOs are accidental optimizers or deliberate optimizers, we could look at whether they are also successful at managing other companies, ideally in different contexts, for instance companies of a different size or in a different industry. If they were successful in managing company A but failed in managing companies B and C, it’s likely that they are accidental optimizers who happened to have strategies for managing company A successfully, and that those strategies failed to generalize.
Not all agents who in fact optimize intended to optimize in the first place. Just because an agent behaves as if it is optimizing for something does not mean that it is explicitly trying to optimize for that thing. This post suggests calling such agents accidental optimizers. Accidental optimizers exist in various systems, for instance AI training, professors getting tenure, successful CEO selection, etc. We expect that accidental optimizers do not generalize well. Therefore, we can look for evidence of generalization to test whether an optimizer is an accidental one or not.
Thank you Miranda Dixon-Luinenburg / lesswrong for editing help!