rohinmshah (Rohin Shah)

Karma: 8,263

PhD student at the Center for Human-Compatible AI. Creator of the Alignment Newsletter. http://rohinshah.com/

[AN #146]: Plausible stories of how we might fail to avert an existential catastrophe

rohinmshah · 14 Apr 2021 17:30 UTC
15 points
1 comment · 8 min read · LW link
(mailchi.mp)

[AN #145]: Our three year anniversary!

rohinmshah · 9 Apr 2021 17:48 UTC
19 points
0 comments · 8 min read · LW link
(mailchi.mp)

Alignment Newsletter Three Year Retrospective

rohinmshah · 7 Apr 2021 14:39 UTC
54 points
0 comments · 5 min read · LW link

[AN #144]: How language models can also be finetuned for non-language tasks

rohinmshah · 2 Apr 2021 17:20 UTC
19 points
0 comments · 6 min read · LW link
(mailchi.mp)

[AN #143]: How to make embedded agents that reason probabilistically about their environments

rohinmshah · 24 Mar 2021 17:20 UTC
13 points
3 comments · 8 min read · LW link
(mailchi.mp)

[AN #142]: The quest to understand a network well enough to reimplement it by hand

rohinmshah · 17 Mar 2021 17:10 UTC
34 points
4 comments · 8 min read · LW link
(mailchi.mp)

[AN #141]: The case for practicing alignment work on GPT-3 and other large models

rohinmshah · 10 Mar 2021 18:30 UTC
26 points
4 comments · 8 min read · LW link
(mailchi.mp)

[AN #140]: Theoretical models that predict scaling laws

rohinmshah · 4 Mar 2021 18:10 UTC
45 points
0 comments · 10 min read · LW link
(mailchi.mp)

[AN #139]: How the simplicity of reality explains the success of neural nets

rohinmshah · 24 Feb 2021 18:30 UTC
26 points
3 comments · 12 min read · LW link
(mailchi.mp)

[AN #138]: Why AI governance should find problems rather than just solving them

rohinmshah · 17 Feb 2021 18:50 UTC
12 points
0 comments · 9 min read · LW link
(mailchi.mp)

[AN #137]: Quantifying the benefits of pretraining on downstream task performance

rohinmshah · 10 Feb 2021 18:10 UTC
18 points
0 comments · 8 min read · LW link
(mailchi.mp)

[AN #136]: How well will GPT-N perform on downstream tasks?

rohinmshah · 3 Feb 2021 18:10 UTC
21 points
2 comments · 9 min read · LW link
(mailchi.mp)

[AN #135]: Five properties of goal-directed systems

rohinmshah · 27 Jan 2021 18:10 UTC
33 points
0 comments · 8 min read · LW link
(mailchi.mp)

[AN #134]: Underspecification as a cause of fragility to distribution shift

rohinmshah · 21 Jan 2021 18:10 UTC
13 points
0 comments · 7 min read · LW link
(mailchi.mp)

[AN #133]: Building machines that can cooperate (with humans, institutions, or other machines)

rohinmshah · 13 Jan 2021 18:10 UTC
14 points
0 comments · 9 min read · LW link
(mailchi.mp)

[AN #132]: Complex and subtly incorrect arguments as an obstacle to debate

rohinmshah · 6 Jan 2021 18:20 UTC
18 points
1 comment · 19 min read · LW link
(mailchi.mp)

[AN #131]: Formalizing the argument of ignored attributes in a utility function

rohinmshah · 31 Dec 2020 18:20 UTC
9 points
2 comments · 19 min read · LW link
(mailchi.mp)

[AN #130]: A new AI x-risk podcast, and reviews of the field

rohinmshah · 24 Dec 2020 18:20 UTC
8 points
0 comments · 7 min read · LW link
(mailchi.mp)

[AN #129]: Explaining double descent by measuring bias and variance

rohinmshah · 16 Dec 2020 18:10 UTC
14 points
1 comment · 7 min read · LW link
(mailchi.mp)

[AN #128]: Prioritizing research on AI existential safety based on its application to governance demands

rohinmshah · 9 Dec 2020 18:20 UTC
16 points
2 comments · 10 min read · LW link
(mailchi.mp)