
Wireheading is in the eye of the beholder

Stuart_Armstrong
30 Jan 2019 18:23 UTC
25 points
9 comments · 1 min read · LW link

Reality-Revealing and Reality-Masking Puzzles

AnnaSalamon
16 Jan 2020 16:15 UTC
212 points
45 comments · 13 min read · LW link

The Zettelkasten Method

abramdemski
20 Sep 2019 13:15 UTC
127 points
59 comments · 40 min read · LW link

Why Do You Keep Having This Problem?

Davis_Kingsley
20 Jan 2020 8:33 UTC
41 points
12 comments · 1 min read · LW link

Re-introducing Selection vs Control for Optimization (Optimizing and Goodhart Effects—Clarifying Thoughts, Part 1)

Davidmanheim
2 Jul 2019 15:36 UTC
29 points
5 comments · 4 min read · LW link

[Question] Use-cases for computations, other than running them?

johnswentworth
19 Jan 2020 20:52 UTC
28 points
4 comments · 2 min read · LW link

ACDT: a hack-y acausal decision theory

Stuart_Armstrong
15 Jan 2020 17:22 UTC
46 points
13 comments · 7 min read · LW link

[Question] What types of compute/processing could we distinguish?

MoritzG
18 Jan 2020 10:04 UTC
2 points
7 comments · 1 min read · LW link

[Question] What are beliefs you wouldn’t want (or would feel apprehensive about being) public if you had (or have) them?

Mati_Roy
15 Jan 2020 5:30 UTC
5 points
9 comments · 1 min read · LW link

Definitions of Causal Abstraction: Reviewing Beckers & Halpern

johnswentworth
7 Jan 2020 0:03 UTC
17 points
1 comment · 4 min read · LW link

Risk and uncertainty: A false dichotomy?

MichaelA
18 Jan 2020 3:09 UTC
3 points
9 comments · 20 min read · LW link

Being a Robust, Coherent Agent (V2)

Raemon
11 Jan 2020 2:06 UTC
115 points
26 comments · 7 min read · LW link · 2 nominations · 2 reviews

Inner alignment requires making assumptions about human values

Matthew Barnett
20 Jan 2020 18:38 UTC
23 points
6 comments · 4 min read · LW link

The Road to Mazedom

Zvi
18 Jan 2020 14:10 UTC
69 points
17 comments · 7 min read · LW link
(thezvi.wordpress.com)

Becoming Unusually Truth-Oriented

abramdemski
3 Jan 2020 1:27 UTC
93 points
4 comments · 10 min read · LW link

Go F*** Someone

Jacobian
15 Jan 2020 18:39 UTC
12 points
21 comments · 8 min read · LW link

Making decisions when both morally and empirically uncertain

MichaelA
2 Jan 2020 7:20 UTC
14 points
14 comments · 20 min read · LW link

AI Alignment Open Thread October 2019

habryka
4 Oct 2019 1:28 UTC
28 points
56 comments · 1 min read · LW link

The Alignment-Competence Trade-Off, Part 1: Coalition Size and Signaling Costs

Gentzel
15 Jan 2020 23:10 UTC
29 points
4 comments · 3 min read · LW link
(theconsequentialist.wordpress.com)

The Rocket Alignment Problem

Eliezer Yudkowsky
4 Oct 2018 0:38 UTC
162 points
41 comments · 15 min read · LW link · 6 nominations · 2 reviews