Internal Alignment (Human)

Tag. Last edit: 16 Aug 2020 19:26 UTC by Raemon

Internal Alignment. Humans sometimes experience internal conflict. You might frame this as a conflict between subagents or subprocesses within a person, or instead as a single agent making complicated decisions. The “internal alignment” hypothesis is that you can become much more productive, happier, or more fulfilled by getting yourself into alignment with yourself.

Notes on Integrity

David Gross · 3 Dec 2020 23:42 UTC
18 points
1 comment · 7 min read · LW link

The shard theory of human values

4 Sep 2022 4:28 UTC
235 points
66 comments · 24 min read · LW link · 2 reviews

If you are too stressed, walk away from the front lines

Neil · 12 Jun 2023 14:26 UTC
42 points
14 comments · 5 min read · LW link

Non-Coercive Perfectionism

Matt Goldenberg · 26 Jan 2021 16:53 UTC
24 points
25 comments · 3 min read · LW link

Announcing the Alignment of Complex Systems Research Group

4 Jun 2022 4:10 UTC
91 points
20 comments · 5 min read · LW link

Trust develops gradually via making bids and setting boundaries

Richard_Ngo · 19 May 2023 22:16 UTC
125 points
12 comments · 4 min read · LW link

Internal communication framework

15 Nov 2022 12:41 UTC
38 points
14 comments · 12 min read · LW link

Please don’t throw your mind away

TsviBT · 15 Feb 2023 21:41 UTC
336 points
44 comments · 18 min read · LW link

Artificial Moral Advisors: A New Perspective from Moral Psychology

David Gross · 28 Aug 2022 16:37 UTC
25 points
1 comment · 1 min read · LW link
(dl.acm.org)

Tidying One’s Room

Zvi · 16 Aug 2018 13:50 UTC
35 points
3 comments · 4 min read · LW link
(thezvi.wordpress.com)

My Model Of EA Burnout

LoganStrohl · 25 Jan 2023 17:52 UTC
237 points
49 comments · 5 min read · LW link

Integrating disagreeing subagents

Kaj_Sotala · 14 May 2019 14:06 UTC
141 points
15 comments · 21 min read · LW link