How the MtG Color Wheel Explains AI Safety

Scott Garrabrant
15 Feb 2019 23:42 UTC
51 points
4 comments, 6 min read, LW link

Alignment Newsletter #45

rohinmshah
14 Feb 2019 2:10 UTC
25 points
2 comments, LW link

Humans interpreting humans

Stuart_Armstrong
13 Feb 2019 19:03 UTC
10 points
1 comment, LW link

Anchoring vs Taste: a model

Stuart_Armstrong
13 Feb 2019 19:03 UTC
11 points
0 comments, LW link

Nuances with ascription universality

evhub
12 Feb 2019 23:38 UTC
16 points
1 comment, LW link

Learning preferences by looking at the world

rohinmshah
12 Feb 2019 22:25 UTC
45 points
10 comments, 7 min read, LW link
(bair.berkeley.edu)

Would I think for ten thousand years?

Stuart_Armstrong
11 Feb 2019 19:37 UTC
25 points
12 comments, LW link

“Normative assumptions” need not be complex

Stuart_Armstrong
11 Feb 2019 19:03 UTC
11 points
0 comments, LW link

Coherent behaviour in the real world is an incoherent concept

ricraz
11 Feb 2019 17:00 UTC
29 points
8 comments, 8 min read, LW link

Some Thoughts on Metaphilosophy

Wei_Dai
10 Feb 2019 0:28 UTC
53 points
20 comments, LW link

The Argument from Philosophical Difficulty

Wei_Dai
10 Feb 2019 0:28 UTC
47 points
18 comments, LW link

Reinforcement Learning in the Iterated Amplification Framework

William_S
9 Feb 2019 0:56 UTC
24 points
7 comments, LW link

HCH is not just Mechanical Turk

William_S
9 Feb 2019 0:46 UTC
36 points
4 comments, LW link

Test Cases for Impact Regularisation Methods

DanielFilan
6 Feb 2019 21:50 UTC
51 points
3 comments, LW link

Security amplification

paulfchristiano
6 Feb 2019 17:28 UTC
20 points
0 comments, LW link

Alignment Newsletter #44

rohinmshah
6 Feb 2019 8:30 UTC
20 points
0 comments, LW link

When to use quantilization

RyanCarey
5 Feb 2019 17:17 UTC
52 points
5 comments, LW link

Conclusion to the sequence on value learning

rohinmshah
3 Feb 2019 21:05 UTC
44 points
13 comments, LW link

[Question] How does Gradient Descent Interact with Goodhart?

Scott Garrabrant
2 Feb 2019 0:14 UTC
65 points
14 comments, LW link

Reliability amplification

paulfchristiano
31 Jan 2019 21:12 UTC
21 points
3 comments, LW link