Internal Alignment (Human)

Tag. Last edit: 13 Jan 2021 22:54 UTC by plex

Internal Alignment is a broadly desirable state. By default, humans sometimes experience internal conflict. You might frame that as conflict between subagents or subprocesses within the human, or instead as a single agent making complicated decisions. The “internal alignment” hypothesis is that you can become much more productive, happier, and more fulfilled by getting yourself into alignment with yourself.

The shard theory of human values

4 Sep 2022 4:28 UTC
225 points
59 comments, 24 min read

Tidying One’s Room

Zvi, 16 Aug 2018 13:50 UTC
35 points
3 comments, 4 min read

Non-Coercive Perfectionism

Matt Goldenberg, 26 Jan 2021 16:53 UTC
22 points
25 comments, 3 min read

Notes on Integrity

David Gross, 3 Dec 2020 23:42 UTC
18 points
1 comment, 7 min read

Announcing the Alignment of Complex Systems Research Group

4 Jun 2022 4:10 UTC
83 points
18 comments, 5 min read

Artificial Moral Advisors: A New Perspective from Moral Psychology

David Gross, 28 Aug 2022 16:37 UTC
25 points
1 comment, 1 min read

Internal communication framework

15 Nov 2022 12:41 UTC
37 points
14 comments, 12 min read

Please don’t throw your mind away

TsviBT, 15 Feb 2023 21:41 UTC
308 points
41 comments, 18 min read

Integrating disagreeing subagents

Kaj_Sotala, 14 May 2019 14:06 UTC
130 points
15 comments, 21 min read

My Model Of EA Burnout

LoganStrohl, 25 Jan 2023 17:52 UTC
224 points
48 comments, 5 min read