Alignment & Agency

An Orthodox Case Against Utility Functions

The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables

Alignment By Default

An overview of 11 proposals for building safe advanced AI

The ground of optimization

Search versus design

Inner Alignment: Explain like I’m 12 Edition

Inaccessible information

AGI safety from first principles: Introduction

Is Success the Enemy of Freedom? (Full)