Selection Theorems

A Selection Theorem tells us something about what agent type signatures will be selected for in some broad class of environments. Two important points:

  • The theorem need not directly talk about selection—e.g. it could state some general property of optima, of “broad” optima, of “most” optima, or of optima under a particular kind of selection pressure (like natural selection or financial profitability).

  • Any given theorem need not address every question about agent type signatures; it just needs to tell us something about agent type signatures.

For instance, the subagents argument says that, when our “agents” have internal state in a coherence-theorem-like setup, the “goals” will be Pareto optimality over multiple utilities, rather than optimality of a single utility function. This says very little about embeddedness or world models or internal architecture; it addresses only one narrow aspect of agent type signatures. And, like the coherence theorems, it doesn’t directly talk about selection; it just says that any strategy which doesn’t fit the Pareto-optimal form is strictly dominated by some other strategy (and therefore we’d expect that other strategy to be selected, all else equal).

From: Selection Theorems: A Program For Understanding Agents
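
As a rough illustration of the dominance idea in the subagents example above, the sketch below filters a small set of strategies down to the ones that are not Pareto-dominated. The strategy names, the payoff numbers, and the helper functions `dominates` and `pareto_front` are all hypothetical, made up for this example; they are not taken from the subagents argument itself, which is a statement about agents with internal state rather than about a fixed table of payoffs.

```python
from typing import Dict, Tuple

# One payoff per utility function (hypothetical: two utilities per strategy).
Payoffs = Tuple[float, ...]


def dominates(a: Payoffs, b: Payoffs) -> bool:
    """Return True if `a` is at least as good as `b` on every utility
    and strictly better on at least one (Pareto dominance)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(strategies: Dict[str, Payoffs]) -> Dict[str, Payoffs]:
    """Keep only the strategies that no other strategy dominates."""
    return {
        name: payoffs
        for name, payoffs in strategies.items()
        if not any(dominates(other, payoffs) for other in strategies.values())
    }


# Hypothetical payoffs under two utilities (u1, u2).
strategies = {
    "A": (3.0, 1.0),
    "B": (1.0, 3.0),
    "C": (2.0, 2.0),
    "D": (1.0, 1.0),  # no better than A on either utility, worse on u1
}

front = pareto_front(strategies)
print(sorted(front))
```

In this toy setup, strategy D is dominated (for example by A), so under the "all else equal" reasoning we would expect it to be selected against. A, B and C are mutually incomparable: each trades one utility off against the other, which is the sense in which the surviving "goals" look like Pareto optimality over several utilities rather than a single ranking.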

Selection Theorems: A Program For Understanding Agents
johnswentworth, 28 Sep 2021 5:03 UTC

Epistemic Strategies of Selection Theorems
adamShimi, 18 Oct 2021 8:57 UTC

Some Existing Selection Theorems
johnswentworth, 30 Sep 2021 16:13 UTC

Understanding Selection Theorems
adamk, 28 May 2022 1:49 UTC

What Selection Theorems Do We Expect/Want?
johnswentworth, 1 Oct 2021 16:03 UTC

Lessons from Convergent Evolution for AI Alignment
27 Mar 2023 16:25 UTC

[Question] Why The Focus on Expected Utility Maximisers?
DragonGod, 27 Dec 2022 15:49 UTC

Selection processes for subagents
Ryan Kidd, 30 Jun 2022 23:57 UTC

How Do Selection Theorems Relate To Interpretability?
johnswentworth, 9 Jun 2022 19:39 UTC

Clarifying the Agent-Like Structure Problem
johnswentworth, 29 Sep 2022 21:28 UTC

Riffing on the agent type
Quinn, 8 Dec 2022 0:19 UTC

Project Intro: Selection Theorems for Modularity
4 Apr 2022 12:59 UTC

AXRP Episode 15 - Natural Abstractions with John Wentworth
DanielFilan, 23 May 2022 5:40 UTC

Select Agent Specifications as Natural Abstractions
lukemarks, 7 Apr 2023 23:16 UTC
