rohinmshah (Rohin Shah)

Karma: 8,263

PhD student at the Center for Human-Compatible AI. Creator of the Alignment Newsletter. http://rohinshah.com/

AI Alignment 2018-19 Review

rohinmshah · 28 Jan 2020 2:19 UTC
115 points
6 comments · 35 min read · LW link

The Alignment Problem: Machine Learning and Human Values

rohinmshah · 6 Oct 2020 17:41 UTC
109 points
5 comments · 6 min read · LW link
(www.amazon.com)

Reframing Superintelligence: Comprehensive AI Services as General Intelligence

rohinmshah · 8 Jan 2019 7:12 UTC
93 points
74 comments · 5 min read · LW link · 2 nominations · 2 reviews
(www.fhi.ox.ac.uk)

Alignment Newsletter One Year Retrospective

rohinmshah · 10 Apr 2019 6:58 UTC
90 points
31 comments · 21 min read · LW link

Coherence arguments do not imply goal-directed behavior

rohinmshah · 3 Dec 2018 3:26 UTC
75 points
65 comments · 7 min read · LW link

Alignment Newsletter #13: 07/02/18

rohinmshah · 2 Jul 2018 16:10 UTC
70 points
12 comments · 8 min read · LW link
(mailchi.mp)

Preface to the sequence on value learning

rohinmshah · 30 Oct 2018 22:04 UTC
65 points
6 comments · 3 min read · LW link

AI safety without goal-directed behavior

rohinmshah · 7 Jan 2019 7:48 UTC
59 points
15 comments · 4 min read · LW link

[AN #69] Stuart Russell's new book on why we need to replace the standard model of AI

rohinmshah · 19 Oct 2019 0:30 UTC
56 points
12 comments · 15 min read · LW link
(mailchi.mp)

Alignment Newsletter Three Year Retrospective

rohinmshah · 7 Apr 2021 14:39 UTC
54 points
0 comments · 5 min read · LW link

[AN #58] Mesa optimization: what it is, and why we should care

rohinmshah · 24 Jun 2019 16:10 UTC
50 points
9 comments · 8 min read · LW link
(mailchi.mp)

Intuitions about goal-directed behavior

rohinmshah · 1 Dec 2018 4:25 UTC
46 points
15 comments · 6 min read · LW link

Will humans build goal-directed agents?

rohinmshah · 5 Jan 2019 1:33 UTC
46 points
43 comments · 5 min read · LW link

[AN #127]: Rethinking agency: Cartesian frames as a formalization of ways to carve up the world into an agent and its environment

rohinmshah · 2 Dec 2020 18:20 UTC
46 points
0 comments · 13 min read · LW link
(mailchi.mp)

Conclusion to the sequence on value learning

rohinmshah · 3 Feb 2019 21:05 UTC
45 points
20 comments · 5 min read · LW link

[AN #140]: Theoretical models that predict scaling laws

rohinmshah · 4 Mar 2021 18:10 UTC
45 points
0 comments · 10 min read · LW link
(mailchi.mp)

Future directions for ambitious value learning

rohinmshah · 11 Nov 2018 15:53 UTC
43 points
9 comments · 4 min read · LW link

Learning preferences by looking at the world

rohinmshah · 12 Feb 2019 22:25 UTC
43 points
10 comments · 7 min read · LW link
(bair.berkeley.edu)