
Embedded Agency

Tag. Last edit: 4 Jan 2023 2:57 UTC by Daniel_Eth

Embedded Agency is the problem that a theory of rational agents must account for the fact that the agents we create (and we ourselves) are inside the world or universe we are trying to affect, not separate from it. This contrasts with much current basic theory of AI or rationality (such as Solomonoff induction or Bayesianism), which implicitly assumes a separation between the agent and the things the agent has beliefs about. In other words, agents in this universe do not have the Cartesian or dualistic boundaries that much of philosophy assumes. Instead they call for a reductionist account: agents are made up of non-agent parts like bits and atoms.

Embedded Agency is not a fully formalized research agenda, but Scott Garrabrant and Abram Demski have written the canonical explanation of the idea in their sequence Embedded Agency. The sequence points to many of the core confusions we have about rational agency and attempts to tie them into a single picture.

Embedded Agency (full-text version)

15 Nov 2018 19:49 UTC
154 points
16 comments, 54 min read, LW link

Embedded Agents

29 Oct 2018 19:53 UTC
202 points
41 comments, 1 min read, LW link, 2 reviews

Introduction to Cartesian Frames

Scott Garrabrant, 22 Oct 2020 13:00 UTC
146 points
29 comments, 22 min read, LW link, 1 review

Humans Are Embedded Agents Too

johnswentworth, 23 Dec 2019 19:21 UTC
80 points
19 comments, 5 min read, LW link

Draft papers for REALab and Decoupled Approval on tampering

28 Oct 2020 16:01 UTC
47 points
2 comments, 1 min read, LW link

Embedded World-Models

2 Nov 2018 16:07 UTC
88 points
16 comments, 1 min read, LW link

Decision Theory

31 Oct 2018 18:41 UTC
114 points
46 comments, 1 min read, LW link

Subsystem Alignment

6 Nov 2018 16:16 UTC
100 points
12 comments, 1 min read, LW link

Robust Delegation

4 Nov 2018 16:38 UTC
111 points
10 comments, 1 min read, LW link

Embedded Curiosities

8 Nov 2018 14:19 UTC
89 points
1 comment, 2 min read, LW link

You Only Get One Shot: an Intuition Pump for Embedded Agency

Oliver Sourbut, 9 Jun 2022 21:38 UTC
22 points
4 comments, 2 min read, LW link

Embedded Agency via Abstraction

johnswentworth, 26 Aug 2019 23:03 UTC
40 points
20 comments, 11 min read, LW link

“embedded self-justification,” or something like that

nostalgebraist, 3 Nov 2019 3:20 UTC
37 points
14 comments, 5 min read, LW link
(nostalgebraist.tumblr.com)

(Double-)Inverse Embedded Agency Problem

shminux, 8 Jan 2020 4:30 UTC
27 points
8 comments, 2 min read, LW link

Embedded Agency: Not Just an AI Problem

johnswentworth, 27 Jun 2019 0:35 UTC
15 points
10 comments, 2 min read, LW link

(A → B) → A

Scott Garrabrant, 11 Sep 2018 22:38 UTC
64 points
11 comments, 2 min read, LW link

Botworld: a cellular automaton for studying self-modifying agents embedded in their environment

So8res, 12 Apr 2014 0:56 UTC
78 points
55 comments, 7 min read, LW link

When does rationality-as-search have nontrivial implications?

nostalgebraist, 4 Nov 2018 22:42 UTC
66 points
11 comments, 3 min read, LW link

Logical Updatelessness as a Robust Delegation Problem

Scott Garrabrant, 27 Oct 2017 21:16 UTC
30 points
2 comments, 2 min read, LW link

Updates and additions to “Embedded Agency”

29 Aug 2020 4:22 UTC
73 points
1 comment, 3 min read, LW link

The whirlpool of reality

Gordon Seidoh Worley, 27 Sep 2020 2:36 UTC
9 points
2 comments, 2 min read, LW link

Additive Operations on Cartesian Frames

Scott Garrabrant, 26 Oct 2020 15:12 UTC
62 points
6 comments, 11 min read, LW link

Biextensional Equivalence

Scott Garrabrant, 28 Oct 2020 14:07 UTC
43 points
13 comments, 10 min read, LW link

Controllables and Observables, Revisited

Scott Garrabrant, 29 Oct 2020 16:38 UTC
35 points
5 comments, 8 min read, LW link

Functors and Coarse Worlds

Scott Garrabrant, 30 Oct 2020 15:19 UTC
51 points
4 comments, 8 min read, LW link

Sub-Sums and Sub-Tensors

Scott Garrabrant, 5 Nov 2020 18:06 UTC
34 points
4 comments, 8 min read, LW link

Multiplicative Operations on Cartesian Frames

Scott Garrabrant, 3 Nov 2020 19:27 UTC
34 points
23 comments, 12 min read, LW link

Subagents of Cartesian Frames

Scott Garrabrant, 2 Nov 2020 22:02 UTC
48 points
5 comments, 8 min read, LW link

Cartesian Frames Definitions

Rob Bensinger, 8 Nov 2020 12:44 UTC
25 points
0 comments, 4 min read, LW link

Committing, Assuming, Externalizing, and Internalizing

Scott Garrabrant, 9 Nov 2020 16:59 UTC
31 points
25 comments, 10 min read, LW link

Eight Definitions of Observability

Scott Garrabrant, 10 Nov 2020 23:37 UTC
34 points
26 comments, 12 min read, LW link

Time in Cartesian Frames

Scott Garrabrant, 11 Nov 2020 20:25 UTC
48 points
16 comments, 7 min read, LW link

AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant

DanielFilan, 24 Jun 2021 22:10 UTC
56 points
2 comments, 58 min read, LW link

MIRI/OP exchange about decision theory

Rob Bensinger, 25 Aug 2021 22:44 UTC
49 points
7 comments, 10 min read, LW link

Infra-Bayesianism Distillation: Realizability and Decision Theory

Thomas Larsen, 26 May 2022 21:57 UTC
34 points
9 comments, 18 min read, LW link

General alignment properties

TurnTrout, 8 Aug 2022 23:40 UTC
49 points
2 comments, 1 min read, LW link

Consequentialists: One-Way Pattern Traps

David Udell, 16 Jan 2023 20:48 UTC
47 points
3 comments, 14 min read, LW link

What Program Are You?

RobinHanson, 12 Oct 2009 0:29 UTC
36 points
43 comments, 2 min read, LW link

Timeless Decision Theory and Meta-Circular Decision Theory

Eliezer Yudkowsky, 20 Aug 2009 22:07 UTC
40 points
37 comments, 10 min read, LW link

Minds: An Introduction

Rob Bensinger, 11 Mar 2015 19:00 UTC
41 points
2 comments, 6 min read, LW link

Are pre-specified utility functions about the real world possible in principle?

mlogan, 11 Jul 2018 18:46 UTC
24 points
7 comments, 4 min read, LW link

Additive and Multiplicative Subagents

Scott Garrabrant, 6 Nov 2020 14:26 UTC
20 points
7 comments, 12 min read, LW link

Troll Bridge

abramdemski, 23 Aug 2019 18:36 UTC
77 points
59 comments, 12 min read, LW link

Counterfactual Planning in AGI Systems

Koen.Holtman, 3 Feb 2021 13:54 UTC
8 points
0 comments, 5 min read, LW link

Phylactery Decision Theory

Bunthut, 2 Apr 2021 20:55 UTC
14 points
6 comments, 2 min read, LW link

Identifiability Problem for Superrational Decision Theories

Bunthut, 9 Apr 2021 20:33 UTC
17 points
16 comments, 2 min read, LW link

Escaping the Löbian Obstacle

Morgan_Rogers, 16 Jun 2021 0:02 UTC
12 points
10 comments, 7 min read, LW link

Anthropics and Embedded Agency

dadadarren, 26 Jun 2021 1:45 UTC
7 points
2 comments, 2 min read, LW link

Optimization Concepts in the Game of Life

16 Oct 2021 20:51 UTC
74 points
16 comments, 11 min read, LW link

A Possible Resolution To Spurious Counterfactuals

JoshuaOSHickman, 6 Dec 2021 18:26 UTC
15 points
5 comments, 4 min read, LW link

Exploring Decision Theories With Counterfactuals and Dynamic Agent Self-Pointers

JoshuaOSHickman, 18 Dec 2021 21:50 UTC
2 points
0 comments, 4 min read, LW link

Formalizing Two Problems of Realistic World Models

So8res, 22 Jan 2015 23:12 UTC
32 points
5 comments, 2 min read, LW link

A Rephrasing Of and Footnote To An Embedded Agency Proposal

JoshuaOSHickman, 9 Mar 2022 18:13 UTC
5 points
0 comments, 5 min read, LW link

[Question] Choice := Anthropics uncertainty? And potential implications for agency

Antoine de Scorraille, 21 Apr 2022 16:38 UTC
6 points
1 comment, 1 min read, LW link

Exploring Mild Behaviour in Embedded Agents

Megan Kinniment, 27 Jun 2022 18:56 UTC
21 points
4 comments, 18 min read, LW link

Deliberation, Reactions, and Control: Tentative Definitions and a Restatement of Instrumental Convergence

Oliver Sourbut, 27 Jun 2022 17:25 UTC
10 points
0 comments, 11 min read, LW link

Strange Loops—Self-Refer­ence from Num­ber The­ory to AI

ojorgensen, 28 Sep 2022 14:10 UTC
9 points
5 comments, 18 min read, LW link

Optimization at a Distance

johnswentworth, 16 May 2022 17:58 UTC
79 points
16 comments, 4 min read, LW link

LLMs may capture key components of human agency

catubc, 17 Nov 2022 20:14 UTC
25 points
0 comments, 4 min read, LW link

Riffing on the agent type

Quinn, 8 Dec 2022 0:19 UTC
16 points
0 comments, 4 min read, LW link

Beyond Rewards and Values: A Non-dualistic Approach to Universal Intelligence

Akira Pyinya, 30 Dec 2022 19:05 UTC
20 points
4 comments, 14 min read, LW link

Causal representation learning as a technique to prevent goal misgeneralization

PabloAMC, 4 Jan 2023 0:07 UTC
18 points
0 comments, 8 min read, LW link

Normative vs Descriptive Models of Agency

mattmacdermott, 2 Feb 2023 20:28 UTC
22 points
5 comments, 4 min read, LW link

Performance guarantees in classical learning theory and infra-Bayesianism

matolcsid, 28 Feb 2023 18:37 UTC
8 points
4 comments, 31 min read, LW link

Could Roko’s basilisk acausally bargain with a paperclip maximizer?

Christopher King, 13 Mar 2023 18:21 UTC
1 point
7 comments, 1 min read, LW link