Subagents

Why Subagents?

johnswentworth1 Aug 2019 22:17 UTC

179 points

50 comments7 min readLW link 1 review

Multi-agent predictive minds and AI alignment

Jan_Kulveit12 Dec 2018 23:48 UTC

63 points

18 comments10 min readLW link

Building up to an Internal Family Systems model

Kaj_Sotala26 Jan 2019 12:25 UTC

295 points

86 comments28 min readLW link 2 reviews

A non-mystical explanation of insight meditation and the three characteristics of existence: introduction and preamble

Kaj_Sotala5 May 2020 19:09 UTC

136 points

40 comments12 min readLW link

Mental Mountains

Scott Alexander27 Nov 2019 5:30 UTC

163 points

14 comments15 min readLW link 1 review

(slatestarcodex.com)

Book Summary: Consciousness and the Brain

Kaj_Sotala16 Jan 2019 14:43 UTC

183 points

20 comments26 min readLW link 1 review

Forcing yourself to keep your identity small is self-harm

Gordon Seidoh Worley3 Apr 2021 14:03 UTC

40 points

10 comments2 min readLW link

Resolving internal conflicts requires listening to what parts want

Richard_Ngo19 May 2023 0:04 UTC

73 points

0 comments4 min readLW link

My current take on Internal Family Systems “parts”

Kaj_Sotala26 Jun 2022 17:40 UTC

99 points

11 comments3 min readLW link

(kajsotala.fi)

The hostile telepaths problem

Valentine27 Oct 2024 15:26 UTC

436 points

107 comments15 min readLW link 6 reviews

Quick thoughts on the implications of multi-agent views of mind on AI takeover

Kaj_Sotala11 Dec 2023 6:34 UTC

48 points

14 comments4 min readLW link

Simulate and Defer To More Rational Selves

LoganStrohl17 Sep 2014 18:11 UTC

221 points

114 comments5 min readLW link

[Question] How effective are tulpas?

Evenflair9 Mar 2020 17:35 UTC

42 points

62 comments2 min readLW link

Subagents, trauma and rationality

Kaj_Sotala14 Aug 2019 13:14 UTC

113 points

4 comments19 min readLW link

Subagents, akrasia, and coherence in humans

Kaj_Sotala25 Mar 2019 14:24 UTC

143 points

31 comments16 min readLW link

Subagents, neural Turing machines, thought selection, and blindspots

Kaj_Sotala6 Aug 2019 21:15 UTC

87 points

3 comments12 min readLW link

Integrating disagreeing subagents

Kaj_Sotala14 May 2019 14:06 UTC

152 points

15 comments21 min readLW link

Subagents, introspective awareness, and blending

Kaj_Sotala2 Mar 2019 12:53 UTC

114 points

19 comments9 min readLW link

[Question] How to select a long-term goal and align my mind towards it?

Alexander24 Dec 2021 11:40 UTC

19 points

8 comments2 min readLW link

Book summary: Unlocking the Emotional Brain

Kaj_Sotala8 Oct 2019 19:11 UTC

340 points

46 comments21 min readLW link 3 reviews

Complex Behavior from Simple (Sub)Agents

moridinamael10 May 2019 21:44 UTC

113 points

14 comments9 min readLW link 1 review

Shoulder Advisors 101

Duncan Sabien (Inactive)9 Oct 2021 5:30 UTC

208 points

124 comments14 min readLW link 2 reviews

Consistently Inconsistent

Kaj_Sotala4 Aug 2011 22:33 UTC

81 points

25 comments5 min readLW link

City of Lights

Alicorn31 Mar 2010 23:30 UTC

56 points

43 comments4 min readLW link

On the construction of the self

Kaj_Sotala29 May 2020 13:04 UTC

80 points

18 comments17 min readLW link

Two Explorations

alkjash16 Dec 2020 21:27 UTC

63 points

8 comments9 min readLW link

(radimentary.wordpress.com)

Remarks 1–18 on GPT (compressed)

Cleo Nardo20 Mar 2023 22:27 UTC

148 points

35 comments31 min readLW link

Internalizing Internal Double Crux

TurnTrout30 Apr 2018 18:23 UTC

39 points

13 comments4 min readLW link

Announcing the Alignment of Complex Systems Research Group

Jan_Kulveit and technicalities

4 Jun 2022 4:10 UTC

92 points

20 comments5 min readLW link

Neural Basis for Global Workspace Theory

Hazard22 Jun 2020 4:19 UTC

31 points

9 comments8 min readLW link

A mechanistic model of meditation

Kaj_Sotala6 Nov 2019 21:37 UTC

139 points

12 comments21 min readLW link

Intrapersonal negotiation

datadataeverywhere23 Jan 2011 23:02 UTC

34 points

42 comments4 min readLW link

Reward Is Not Enough

Steven Byrnes16 Jun 2021 13:52 UTC

137 points

19 comments10 min readLW link 1 review

Wildfire of strategicness

TsviBT5 Jun 2023 13:59 UTC

40 points

19 comments1 min readLW link

Embedded Agency via Abstraction

johnswentworth26 Aug 2019 23:03 UTC

42 points

20 comments11 min readLW link

The Game of Masks

Slimepriestess27 Apr 2022 18:03 UTC

50 points

18 comments11 min readLW link

(hivewired.wordpress.com)

What Value Subagents?

Gordon Seidoh Worley20 Jul 2017 19:19 UTC

7 points

1 comment4 min readLW link

(mapandterritory.org)

Mental subagent implications for AI Safety

moridinamael3 Jan 2021 18:59 UTC

11 points

0 comments3 min readLW link

A Master-Slave Model of Human Preferences

Wei Dai29 Dec 2009 1:02 UTC

106 points

94 comments3 min readLW link

Sequence introduction: non-agent and multiagent models of mind

Kaj_Sotala7 Jan 2019 14:12 UTC

126 points

16 comments7 min readLW link 1 review

Game Theory without Argmax [Part 2]

Cleo Nardo11 Nov 2023 16:02 UTC

31 points

14 comments13 min readLW link

The Friendly Telepath Problems

Gunnar_Zarncke15 Feb 2026 15:08 UTC

32 points

9 comments7 min readLW link

Goodhart’s Law inside the human mind

Kaj_Sotala17 Apr 2023 13:48 UTC

129 points

13 comments16 min readLW link

Indecision and internalized authority figures

Kaj_Sotala6 Jul 2024 10:10 UTC

69 points

1 comment2 min readLW link

(kajsotala.fi)

Game Theory without Argmax [Part 1]

Cleo Nardo11 Nov 2023 15:59 UTC

78 points

18 comments19 min readLW link

Self-empathy as a source of “willpower”

Academian26 Oct 2010 14:20 UTC

83 points

32 comments2 min readLW link

Many therapy schools work with inner multiplicity (not just IFS)

David Althaus and Ewelina Tur

17 Sep 2022 10:27 UTC

53 points

16 comments18 min readLW link

The horror of what must, yet cannot, be true

Kaj_Sotala2 Jun 2022 10:20 UTC

56 points

18 comments2 min readLW link

(kajsotala.fi)

A non-mystical explanation of “no-self” (three characteristics series)

Kaj_Sotala8 May 2020 10:37 UTC

123 points

65 comments20 min readLW link 1 review

Conditions under which misaligned subagents can (not) arise in classifiers

anon111 Jul 2018 1:52 UTC

12 points

2 comments2 min readLW link

System 2 as working-memory augmented System 1 reasoning

Kaj_Sotala25 Sep 2019 8:39 UTC

117 points

23 comments16 min readLW link

Seven Shiny Stories

Alicorn1 Jun 2010 0:43 UTC

148 points

34 comments7 min readLW link

Eight Definitions of Observability

Scott Garrabrant10 Nov 2020 23:37 UTC

34 points

26 comments12 min readLW link

Tentatively considering emotional stories (IFS and “getting into Self”)

Kaj_Sotala30 Nov 2018 7:40 UTC

40 points

31 comments4 min readLW link

(kajsotala.fi)

Slack matters more than any outcome

Valentine31 Dec 2022 20:11 UTC

177 points

59 comments19 min readLW link 1 review

One: a story

Richard_Ngo10 Oct 2023 0:18 UTC

36 points

0 comments4 min readLW link

(www.narrativeark.xyz)

Robust Agency for People and Organizations

Raemon19 Jul 2019 1:18 UTC

65 points

10 comments12 min readLW link

Conflicts Between Mental Subagents: Expanding Wei Dai’s Master-Slave Model

Scott Alexander4 Aug 2010 9:16 UTC

71 points

81 comments10 min readLW link

Hierarchical Agency: A Missing Piece in AI Alignment

Jan_Kulveit27 Nov 2024 5:49 UTC

123 points

23 comments11 min readLW link 1 review

Why Productivity Systems Don’t Stick

Trinley Goldenberg16 Jan 2021 17:45 UTC

62 points

22 comments3 min readLW link

Embedded Agency (full-text version)

Scott Garrabrant and abramdemski

15 Nov 2018 19:49 UTC

225 points

17 comments54 min readLW link

Shard Theory: An Overview

David Udell11 Aug 2022 5:44 UTC

168 points

34 comments10 min readLW link

Three characteristics: impermanence

Kaj_Sotala5 Jun 2020 7:48 UTC

73 points

4 comments18 min readLW link

Craving, suffering, and predictive processing (three characteristics series)

Kaj_Sotala15 May 2020 13:21 UTC

99 points

56 comments19 min readLW link

Internal communication framework

rosehadshar and Nora_Ammann

15 Nov 2022 12:41 UTC

38 points

13 comments12 min readLW link

Resolving von Neumann-Morgenstern Inconsistent Preferences

niplav22 Oct 2024 11:45 UTC

39 points

5 comments58 min readLW link

Strategic ignorance and plausible deniability

Kaj_Sotala10 Aug 2011 9:30 UTC

62 points

59 comments4 min readLW link

Ayn Rand’s model of “living money”; and an upside of burnout

AnnaSalamon16 Nov 2024 2:59 UTC

246 points

64 comments5 min readLW link 2 reviews

[Question] Anyone been through IFS or coherence therapy?

warrenjordan15 Mar 2021 18:35 UTC

5 points

3 comments1 min readLW link

Non-Coercive Perfectionism

Trinley Goldenberg26 Jan 2021 16:53 UTC

25 points

25 comments3 min readLW link

Synthesis of subagents: exercise

Julija Kobrinovich20 Sep 2019 17:24 UTC

10 points

2 comments14 min readLW link

On Internal Family Systems and multi-agent minds: a reply to PJ Eby

Kaj_Sotala29 Oct 2019 14:56 UTC

41 points

31 comments25 min readLW link

Actually updating

SaraHax23 Aug 2019 17:46 UTC

56 points

10 comments4 min readLW link

The self-unalignment problem

Jan_Kulveit and rosehadshar

14 Apr 2023 12:10 UTC

159 points

24 comments10 min readLW link

Two Coordination Styles

abramdemski7 Feb 2018 9:00 UTC

42 points

14 comments7 min readLW link

From self to craving (three characteristics series)

Kaj_Sotala22 May 2020 12:16 UTC

63 points

21 comments11 min readLW link

A Framework for Internal Debugging

Trinley Goldenberg16 Jan 2019 16:04 UTC

45 points

3 comments5 min readLW link

Should rationalists be spiritual / Spirituality as overcoming delusion

Kaj_Sotala and romeostevensit

25 Mar 2024 16:48 UTC

51 points

58 comments29 min readLW link

Committing, Assuming, Externalizing, and Internalizing

Scott Garrabrant9 Nov 2020 16:59 UTC

31 points

25 comments10 min readLW link

Subagents of Cartesian Frames

Scott Garrabrant2 Nov 2020 22:02 UTC

53 points

6 comments8 min readLW link

Additive and Multiplicative Subagents

Scott Garrabrant6 Nov 2020 14:26 UTC

20 points

7 comments12 min readLW link

Silence

alkjash18 Mar 2018 4:10 UTC

61 points

17 comments4 min readLW link

(radimentary.wordpress.com)

A Clearer Thinking tool that teaches you to use Internal Family Systems concepts

spencerg28 Apr 2023 13:42 UTC

31 points

1 comment1 min readLW link

(programs.clearerthinking.org)

Integrating Three Models of (Human) Cognition

jbkjr23 Nov 2021 1:06 UTC

40 points

4 comments32 min readLW link

The Solitaire Principle: Game Theory for One

alkjash17 Jan 2018 0:14 UTC

26 points

8 comments9 min readLW link

(radimentary.wordpress.com)

Beware Social Coping Strategies

Lulie5 Feb 2018 4:48 UTC

58 points

24 comments7 min readLW link

Make an appointment with your saner self

MalcolmOcean8 Feb 2019 5:05 UTC

28 points

0 comments4 min readLW link

Which Parts Are “Me”?

Eliezer Yudkowsky22 Oct 2008 18:15 UTC

74 points

117 comments5 min readLW link

Prosaic misalignment from the Solomonoff Predictor

Cleo Nardo9 Dec 2022 17:53 UTC

43 points

3 comments5 min readLW link

TDT for Humans

alkjash28 Feb 2018 5:40 UTC

27 points

7 comments5 min readLW link

(radimentary.wordpress.com)

Reflection of Hierarchical Relationship via Nuanced Conditioning of Game Theory Approach for AI Development and Utilization

Kyoung-cheol Kim4 Jun 2021 7:20 UTC

2 points

2 comments7 min readLW link

Selection processes for subagents

Ryan Kidd30 Jun 2022 23:57 UTC

37 points

2 comments9 min readLW link

A Cautionary Note on Unlocking the Emotional Brain

eapache8 Feb 2020 17:21 UTC

56 points

20 comments2 min readLW link

Restricted Antinatalism on Subagents

Josephine13 May 2021 1:48 UTC

3 points

1 comment2 min readLW link

Alien parasite technical guy

PhilGoetz27 Jul 2010 16:51 UTC

69 points

55 comments3 min readLW link

Self and No-Self

Vaniver29 Dec 2019 6:15 UTC

48 points

3 comments2 min readLW link

Prune

alkjash12 Jan 2018 22:50 UTC

82 points

11 comments4 min readLW link

(radimentary.wordpress.com)
