The Pointers Problem

Tag. Last edit: 6 Aug 2022 2:22 UTC by Noosphere89

The pointers problem refers to the fact that most humans would prefer an AI that acts on real-world human values rather than on humans' estimates of their own values, and that the two differ in many situations because humans are neither all-seeing nor all-knowing. It was introduced in a post of the same name.

The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables

johnswentworth, 18 Nov 2020 17:47 UTC
111 points
44 comments, 11 min read, 2 reviews

Don’t design agents which exploit adversarial inputs

18 Nov 2022 1:48 UTC
62 points
62 comments, 12 min read

Stable Pointers to Value II: Environmental Goals

abramdemski, 9 Feb 2018 6:03 UTC
18 points
2 comments, 4 min read

Stable Pointers to Value III: Recursive Quantilization

abramdemski, 21 Jul 2018 8:06 UTC
19 points
4 comments, 4 min read

Stable Pointers to Value: An Agent Embedded in Its Own Utility Function

abramdemski, 17 Aug 2017 0:22 UTC
15 points
9 comments, 5 min read

Robust Delegation

4 Nov 2018 16:38 UTC
111 points
10 comments, 1 min read

[Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation

Steven Byrnes, 23 Mar 2022 12:48 UTC
38 points
6 comments, 21 min read

People care about each other even though they have imperfect motivational pointers?

TurnTrout, 8 Nov 2022 18:15 UTC
32 points
25 comments, 7 min read

Don’t align agents to evaluations of plans

TurnTrout, 26 Nov 2022 21:16 UTC
43 points
48 comments, 18 min read

Alignment allows “nonrobust” decision-influences and doesn’t require robust grading

TurnTrout, 29 Nov 2022 6:23 UTC
57 points
41 comments, 15 min read

Updating Utility Functions

9 May 2022 9:44 UTC
36 points
6 comments, 8 min read

The Pointers Problem - Distilled

NinaR, 26 May 2022 22:44 UTC
9 points
0 comments, 2 min read

The Pointers Problem: Clarifications/Variations

abramdemski, 5 Jan 2021 17:29 UTC
51 points
14 comments, 18 min read

Human sexuality as an interesting case study of alignment

beren, 30 Dec 2022 13:37 UTC
37 points
26 comments, 3 min read