What is Abstraction?

johnswentworth
6 Dec 2019 20:30 UTC
20 points
5 comments
5 min read
LW link

Understanding “Deep Double Descent”

evhub
6 Dec 2019 0:00 UTC
99 points
23 comments
5 min read
LW link

Values, Valence, and Alignment

G Gordon Worley III
5 Dec 2019 21:06 UTC
12 points
2 comments
13 min read
LW link

Oracles: reject all deals—break superrationality, with superrationality

Stuart_Armstrong
5 Dec 2019 13:51 UTC
20 points
1 comment
8 min read
LW link

Seeking Power is Provably Instrumentally Convergent in MDPs

TurnTrout
5 Dec 2019 2:33 UTC
101 points
21 comments
11 min read
LW link
(arxiv.org)

[Question] What are some non-purely-sampling ways to do deep RL?

evhub
5 Dec 2019 0:09 UTC
15 points
8 comments
2 min read
LW link

Recent Progress in the Theory of Neural Networks

interstice
4 Dec 2019 23:11 UTC
60 points
9 comments
9 min read
LW link

[AN #76]: How dataset size affects robustness, and benchmarking safe exploration by measuring constraint violations

rohinmshah
4 Dec 2019 18:10 UTC
13 points
6 comments
9 min read
LW link
(mailchi.mp)

“Fully” acausal trade

Stuart_Armstrong
4 Dec 2019 16:39 UTC
16 points
2 comments
1 min read
LW link

A list of good heuristics that the case for AI x-risk fails

capybaralet
2 Dec 2019 19:26 UTC
16 points
11 comments
2 min read
LW link

What I talk about when I talk about AI x-risk: 3 core claims I want machine learning researchers to address.

capybaralet
2 Dec 2019 18:20 UTC
25 points
11 comments
3 min read
LW link

Counterfactuals as a matter of Social Convention

Chris_Leong
30 Nov 2019 10:35 UTC
11 points
4 comments
2 min read
LW link

Useful Does Not Mean Secure

Ben Pace
30 Nov 2019 2:05 UTC
48 points
12 comments
11 min read
LW link

Open-Box Newcomb’s Problem and the limitations of the Erasure framing

Chris_Leong
28 Nov 2019 11:32 UTC
6 points
25 comments
3 min read
LW link

[AN #75]: Solving Atari and Go with learned game models, and thoughts from a MIRI employee

rohinmshah
27 Nov 2019 18:10 UTC
38 points
1 comment
10 min read
LW link
(mailchi.mp)

A test for symbol grounding methods: true zero-sum games

Stuart_Armstrong
26 Nov 2019 14:15 UTC
23 points
2 comments
2 min read
LW link

Thoughts on implementing corrigible robust alignment

steve2152
26 Nov 2019 14:06 UTC
16 points
1 comment
6 min read
LW link

Breaking Oracles: superrationality and acausal trade

Stuart_Armstrong
25 Nov 2019 10:40 UTC
23 points
15 comments
1 min read
LW link

Ultra-simplified research agenda

Stuart_Armstrong
22 Nov 2019 14:29 UTC
36 points
4 comments
1 min read
LW link

Analysing: Dangerous messages from future UFAI via Oracles

Stuart_Armstrong
22 Nov 2019 14:17 UTC
24 points
15 comments
4 min read
LW link