Newsletters

Newsletters are collected summaries of recent events, posts, and academic papers.

Less Wrong’s most prolific newsletter is Rohin Shah’s weekly Alignment Newsletter.

Fore­cast­ing Newslet­ter. June 2020.

NunoSempere
1 Jul 2020 9:46 UTC
26 points
0 comments, 8 min read, LW link

[AN #102]: Meta learn­ing by GPT-3, and a list of full pro­pos­als for AI alignment

rohinmshah
3 Jun 2020 17:20 UTC
38 points
6 comments, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #36

rohinmshah
12 Dec 2018 1:10 UTC
22 points
0 comments, 11 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #47

rohinmshah
4 Mar 2019 4:30 UTC
21 points
0 comments, 8 min read, LW link
(mailchi.mp)

[AN #112]: Eng­ineer­ing a Safer World

rohinmshah
13 Aug 2020 17:20 UTC
22 points
1 comment, 12 min read, LW link
(mailchi.mp)

Fore­cast­ing Newslet­ter: April 2020

NunoSempere
30 Apr 2020 16:41 UTC
21 points
3 comments, 6 min read, LW link

Fore­cast­ing Newslet­ter: May 2020.

NunoSempere
31 May 2020 12:35 UTC
8 points
1 comment, 20 min read, LW link

May gw­ern.net newsletter

gwern
1 Jun 2018 14:47 UTC
73 points
3 comments, 1 min read, LW link
(www.gwern.net)

Ra­tion­al­ity Feed: Last Month’s Best Posts

deluks917
12 Feb 2018 13:18 UTC
64 points
1 comment, 3 min read, LW link

Align­ment Newslet­ter #13: 07/​02/​18

rohinmshah
2 Jul 2018 16:10 UTC
74 points
12 comments, 8 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #16: 07/​23/​18

rohinmshah
23 Jul 2018 16:20 UTC
44 points
0 comments, 12 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #15: 07/​16/​18

rohinmshah
16 Jul 2018 16:10 UTC
42 points
0 comments, 15 min read, LW link
(mailchi.mp)

[AN #58] Mesa op­ti­miza­tion: what it is, and why we should care

rohinmshah
24 Jun 2019 16:10 UTC
51 points
9 comments, 8 min read, LW link
(mailchi.mp)

Ra­tion­al­ity Feed: Last Month’s Best Posts

deluks917
21 Mar 2018 14:12 UTC
48 points
2 comments, 2 min read, LW link

[AN #59] How ar­gu­ments for AI risk have changed over time

rohinmshah
8 Jul 2019 17:20 UTC
43 points
4 comments, 7 min read, LW link
(mailchi.mp)

The Align­ment Newslet­ter #1: 04/​09/​18

rohinmshah
9 Apr 2018 16:00 UTC
10 points
3 comments, 4 min read, LW link

The Align­ment Newslet­ter #2: 04/​16/​18

rohinmshah
16 Apr 2018 16:00 UTC
8 points
0 comments, 5 min read, LW link

The Align­ment Newslet­ter #3: 04/​23/​18

rohinmshah
23 Apr 2018 16:00 UTC
9 points
0 comments, 6 min read, LW link

The Align­ment Newslet­ter #4: 04/​30/​18

rohinmshah
30 Apr 2018 16:00 UTC
8 points
0 comments, 3 min read, LW link

The Align­ment Newslet­ter #5: 05/​07/​18

rohinmshah
7 May 2018 16:00 UTC
8 points
0 comments, 7 min read, LW link

The Align­ment Newslet­ter #6: 05/​14/​18

rohinmshah
14 May 2018 16:00 UTC
8 points
0 comments, 2 min read, LW link

The Align­ment Newslet­ter #7: 05/​21/​18

rohinmshah
21 May 2018 16:00 UTC
8 points
0 comments, 5 min read, LW link

The Align­ment Newslet­ter #8: 05/​28/​18

rohinmshah
28 May 2018 16:00 UTC
8 points
0 comments, 6 min read, LW link

The Align­ment Newslet­ter #9: 06/​04/​18

rohinmshah
4 Jun 2018 16:00 UTC
8 points
0 comments, 2 min read, LW link

The Align­ment Newslet­ter #10: 06/​11/​18

rohinmshah
11 Jun 2018 16:00 UTC
16 points
0 comments, 9 min read, LW link

The Align­ment Newslet­ter #11: 06/​18/​18

rohinmshah
18 Jun 2018 16:00 UTC
8 points
0 comments, 10 min read, LW link

The Align­ment Newslet­ter #12: 06/​25/​18

rohinmshah
25 Jun 2018 16:00 UTC
15 points
0 comments, 3 min read, LW link

Align­ment Newslet­ter #14

rohinmshah
9 Jul 2018 16:20 UTC
15 points
0 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #17

rohinmshah
30 Jul 2018 16:10 UTC
35 points
0 comments, 13 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #18

rohinmshah
6 Aug 2018 16:00 UTC
19 points
0 comments, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #19

rohinmshah
14 Aug 2018 2:10 UTC
19 points
0 comments, 13 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #20

rohinmshah
20 Aug 2018 16:00 UTC
13 points
2 comments, 6 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #21

rohinmshah
27 Aug 2018 16:20 UTC
26 points
0 comments, 7 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #22

rohinmshah
3 Sep 2018 16:10 UTC
15 points
0 comments, 6 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #23

rohinmshah
10 Sep 2018 17:10 UTC
17 points
0 comments, 7 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #24

rohinmshah
17 Sep 2018 16:20 UTC
10 points
4 comments, 12 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #25

rohinmshah
24 Sep 2018 16:10 UTC
22 points
3 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #26

rohinmshah
2 Oct 2018 16:10 UTC
14 points
0 comments, 7 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #27

rohinmshah
9 Oct 2018 1:10 UTC
16 points
0 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #28

rohinmshah
15 Oct 2018 21:20 UTC
11 points
0 comments, 8 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #29

rohinmshah
22 Oct 2018 16:20 UTC
16 points
0 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #30

rohinmshah
29 Oct 2018 16:10 UTC
31 points
2 comments, 6 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #31

rohinmshah
5 Nov 2018 23:50 UTC
19 points
0 comments, 12 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #32

rohinmshah
12 Nov 2018 17:20 UTC
20 points
0 comments, 12 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #33

rohinmshah
19 Nov 2018 17:20 UTC
25 points
0 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #34

rohinmshah
26 Nov 2018 23:10 UTC
26 points
0 comments, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #35

rohinmshah
4 Dec 2018 1:10 UTC
15 points
0 comments, 6 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #37

rohinmshah
17 Dec 2018 19:10 UTC
26 points
4 comments, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #38

rohinmshah
25 Dec 2018 16:10 UTC
9 points
0 comments, 8 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #39

rohinmshah
1 Jan 2019 8:10 UTC
33 points
2 comments, 5 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #40

rohinmshah
8 Jan 2019 20:10 UTC
21 points
2 comments, 5 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #41

rohinmshah
17 Jan 2019 8:10 UTC
23 points
6 comments, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #42

rohinmshah
22 Jan 2019 2:00 UTC
21 points
1 comment, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #43

rohinmshah
29 Jan 2019 21:10 UTC
15 points
2 comments, 13 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #44

rohinmshah
6 Feb 2019 8:30 UTC
20 points
0 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #45

rohinmshah
14 Feb 2019 2:10 UTC
27 points
2 comments, 8 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #46

rohinmshah
22 Feb 2019 0:10 UTC
18 points
0 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #48

rohinmshah
11 Mar 2019 21:10 UTC
31 points
14 comments, 9 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #49

rohinmshah
20 Mar 2019 4:20 UTC
26 points
1 comment, 11 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #50

rohinmshah
28 Mar 2019 18:10 UTC
16 points
2 comments, 10 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #51

rohinmshah
3 Apr 2019 4:10 UTC
28 points
2 comments, 15 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter #52

rohinmshah
6 Apr 2019 1:20 UTC
20 points
1 comment, 8 min read, LW link
(mailchi.mp)

Align­ment Newslet­ter One Year Retrospective

rohinmshah
10 Apr 2019 6:58 UTC
93 points
31 comments, 21 min read, LW link

Align­ment Newslet­ter #53

rohinmshah
18 Apr 2019 17:20 UTC
22 points
0 comments, 8 min read, LW link
(mailchi.mp)

[AN #54] Box­ing a finite-hori­zon AI sys­tem to keep it unambitious

rohinmshah
28 Apr 2019 5:20 UTC
21 points
0 comments, 8 min read, LW link
(mailchi.mp)

[AN #55] Reg­u­la­tory mar­kets and in­ter­na­tional stan­dards as a means of en­sur­ing benefi­cial AI

rohinmshah
5 May 2019 2:20 UTC
18 points
2 comments, 8 min read, LW link
(mailchi.mp)

[AN #56] Should ML re­searchers stop run­ning ex­per­i­ments be­fore mak­ing hy­pothe­ses?

rohinmshah
21 May 2019 2:20 UTC
22 points
8 comments, 9 min read, LW link
(mailchi.mp)

[AN #57] Why we should fo­cus on ro­bust­ness in AI safety, and the analo­gous prob­lems in programming

rohinmshah
5 Jun 2019 23:20 UTC
28 points
15 comments, 7 min read, LW link
(mailchi.mp)

[AN #60] A new AI challenge: Minecraft agents that as­sist hu­man play­ers in cre­ative mode

rohinmshah
22 Jul 2019 17:00 UTC
25 points
6 comments, 9 min read, LW link
(mailchi.mp)

[AN #61] AI policy and gov­er­nance, from two peo­ple in the field

rohinmshah
5 Aug 2019 17:00 UTC
11 points
0 comments, 9 min read, LW link
(mailchi.mp)

[AN #62] Are ad­ver­sar­ial ex­am­ples caused by real but im­per­cep­ti­ble fea­tures?

rohinmshah
22 Aug 2019 17:10 UTC
28 points
10 comments, 9 min read, LW link
(mailchi.mp)

[AN #63] How ar­chi­tec­ture search, meta learn­ing, and en­vi­ron­ment de­sign could lead to gen­eral intelligence

rohinmshah
10 Sep 2019 19:10 UTC
24 points
12 comments, 8 min read, LW link
(mailchi.mp)

[AN #64]: Us­ing Deep RL and Re­ward Uncer­tainty to In­cen­tivize Prefer­ence Learning

rohinmshah
16 Sep 2019 17:10 UTC
11 points
8 comments, 7 min read, LW link
(mailchi.mp)

[AN #65]: Learn­ing use­ful skills by watch­ing hu­mans “play”

rohinmshah
23 Sep 2019 17:30 UTC
12 points
0 comments, 9 min read, LW link
(mailchi.mp)

[AN #66]: De­com­pos­ing ro­bust­ness into ca­pa­bil­ity ro­bust­ness and al­ign­ment robustness

rohinmshah
30 Sep 2019 18:00 UTC
12 points
1 comment, 7 min read, LW link
(mailchi.mp)

[AN #67]: Creat­ing en­vi­ron­ments in which to study in­ner al­ign­ment failures

rohinmshah
7 Oct 2019 17:10 UTC
17 points
0 comments, 8 min read, LW link
(mailchi.mp)

[AN #68]: The at­tain­able util­ity the­ory of impact

rohinmshah
14 Oct 2019 17:00 UTC
19 points
0 comments, 8 min read, LW link
(mailchi.mp)

[AN #69] Stu­art Rus­sell’s new book on why we need to re­place the stan­dard model of AI

rohinmshah
19 Oct 2019 0:30 UTC
64 points
12 comments, 15 min read, LW link
(mailchi.mp)

[AN #70]: Agents that help hu­mans who are still learn­ing about their own preferences

rohinmshah
23 Oct 2019 17:10 UTC
18 points
0 comments, 9 min read, LW link
(mailchi.mp)

[AN #71]: Avoid­ing re­ward tam­per­ing through cur­rent-RF optimization

rohinmshah
30 Oct 2019 17:10 UTC
13 points
0 comments, 7 min read, LW link
(mailchi.mp)

[AN #72]: Align­ment, ro­bust­ness, method­ol­ogy, and sys­tem build­ing as re­search pri­ori­ties for AI safety

rohinmshah
6 Nov 2019 18:10 UTC
28 points
4 comments, 10 min read, LW link
(mailchi.mp)

[AN #73]: De­tect­ing catas­trophic failures by learn­ing how agents tend to break

rohinmshah
13 Nov 2019 18:10 UTC
11 points
0 comments, 7 min read, LW link
(mailchi.mp)

[AN #74]: Separat­ing benefi­cial AI into com­pe­tence, al­ign­ment, and cop­ing with impacts

rohinmshah
20 Nov 2019 18:20 UTC
19 points
0 comments, 7 min read, LW link
(mailchi.mp)

[AN #75]: Solv­ing Atari and Go with learned game mod­els, and thoughts from a MIRI employee

rohinmshah
27 Nov 2019 18:10 UTC
39 points
1 comment, 10 min read, LW link
(mailchi.mp)

[AN #76]: How dataset size af­fects ro­bust­ness, and bench­mark­ing safe ex­plo­ra­tion by mea­sur­ing con­straint violations

rohinmshah
4 Dec 2019 18:10 UTC
14 points
6 comments, 9 min read, LW link
(mailchi.mp)

[AN #77]: Dou­ble de­scent: a unifi­ca­tion of statis­ti­cal the­ory and mod­ern ML practice

rohinmshah
18 Dec 2019 18:30 UTC
21 points
4 comments, 14 min read, LW link
(mailchi.mp)

[AN #78] For­mal­iz­ing power and in­stru­men­tal con­ver­gence, and the end-of-year AI safety char­ity comparison

rohinmshah
26 Dec 2019 1:10 UTC
26 points
10 comments, 9 min read, LW link
(mailchi.mp)

[AN #79]: Re­cur­sive re­ward mod­el­ing as an al­ign­ment tech­nique in­te­grated with deep RL

rohinmshah
1 Jan 2020 18:00 UTC
12 points
0 comments, 12 min read, LW link
(mailchi.mp)

[AN #80]: Why AI risk might be solved with­out ad­di­tional in­ter­ven­tion from longtermists

rohinmshah
2 Jan 2020 18:20 UTC
34 points
93 comments, 10 min read, LW link
(mailchi.mp)

[AN #81]: Univer­sal­ity as a po­ten­tial solu­tion to con­cep­tual difficul­ties in in­tent alignment

rohinmshah
8 Jan 2020 18:00 UTC
22 points
4 comments, 11 min read, LW link
(mailchi.mp)

[AN #82]: How OpenAI Five dis­tributed their train­ing computation

rohinmshah
15 Jan 2020 18:20 UTC
20 points
0 comments, 8 min read, LW link
(mailchi.mp)

[AN #83]: Sam­ple-effi­cient deep learn­ing with ReMixMatch

rohinmshah
22 Jan 2020 18:10 UTC
16 points
4 comments, 11 min read, LW link
(mailchi.mp)

[AN #84] Re­view­ing AI al­ign­ment work in 2018-19

rohinmshah
29 Jan 2020 18:30 UTC
24 points
0 comments, 6 min read, LW link
(mailchi.mp)

[AN #85]: The nor­ma­tive ques­tions we should be ask­ing for AI al­ign­ment, and a sur­pris­ingly good chatbot

rohinmshah
5 Feb 2020 18:20 UTC
16 points
2 comments, 7 min read, LW link
(mailchi.mp)

[AN #86]: Im­prov­ing de­bate and fac­tored cog­ni­tion through hu­man experiments

rohinmshah
12 Feb 2020 18:10 UTC
16 points
0 comments, 9 min read, LW link
(mailchi.mp)

[AN #87]: What might hap­pen as deep learn­ing scales even fur­ther?

rohinmshah
19 Feb 2020 18:20 UTC
30 points
0 comments, 4 min read, LW link
(mailchi.mp)

[AN #88]: How the prin­ci­pal-agent liter­a­ture re­lates to AI risk

rohinmshah
27 Feb 2020 9:10 UTC
20 points
0 comments, 9 min read, LW link
(mailchi.mp)

[AN #89]: A unify­ing for­mal­ism for prefer­ence learn­ing algorithms

rohinmshah
4 Mar 2020 18:20 UTC
17 points
0 comments, 9 min read, LW link
(mailchi.mp)

[AN #90]: How search land­scapes can con­tain self-re­in­forc­ing feed­back loops

rohinmshah
11 Mar 2020 17:30 UTC
12 points
6 comments, 8 min read, LW link
(mailchi.mp)

[AN #91]: Con­cepts, im­ple­men­ta­tions, prob­lems, and a bench­mark for im­pact measurement

rohinmshah
18 Mar 2020 17:10 UTC
16 points
10 comments, 13 min read, LW link
(mailchi.mp)

[AN #92]: Learn­ing good rep­re­sen­ta­tions with con­trastive pre­dic­tive coding

rohinmshah
25 Mar 2020 17:20 UTC
19 points
1 comment, 10 min read, LW link
(mailchi.mp)

[AN #93]: The Precipice we’re stand­ing at, and how we can back away from it

rohinmshah
1 Apr 2020 17:10 UTC
25 points
0 comments, 7 min read, LW link
(mailchi.mp)

[AN #94]: AI al­ign­ment as trans­la­tion be­tween hu­mans and machines

rohinmshah
8 Apr 2020 17:10 UTC
11 points
0 comments, 7 min read, LW link
(mailchi.mp)

[AN #95]: A frame­work for think­ing about how to make AI go well

rohinmshah
15 Apr 2020 17:10 UTC
20 points
2 comments, 10 min read, LW link
(mailchi.mp)

[AN #96]: Buck and I dis­cuss/​ar­gue about AI Alignment

rohinmshah
22 Apr 2020 17:20 UTC
17 points
4 comments, 10 min read, LW link
(mailchi.mp)

[AN #97]: Are there his­tor­i­cal ex­am­ples of large, ro­bust dis­con­ti­nu­ities?

rohinmshah
29 Apr 2020 17:30 UTC
15 points
0 comments, 10 min read, LW link
(mailchi.mp)

[AN #98]: Un­der­stand­ing neu­ral net train­ing by see­ing which gra­di­ents were helpful

rohinmshah
6 May 2020 17:10 UTC
20 points
3 comments, 9 min read, LW link
(mailchi.mp)

[AN #99]: Dou­bling times for the effi­ciency of AI algorithms

rohinmshah
13 May 2020 17:20 UTC
30 points
0 comments, 10 min read, LW link
(mailchi.mp)

[AN #100]: What might go wrong if you learn a re­ward func­tion while acting

rohinmshah
20 May 2020 17:30 UTC
33 points
2 comments, 12 min read, LW link
(mailchi.mp)

[AN #101]: Why we should rigor­ously mea­sure and fore­cast AI progress

rohinmshah
27 May 2020 17:20 UTC
15 points
0 comments, 10 min read, LW link
(mailchi.mp)

[AN #103]: ARCHES: an agenda for ex­is­ten­tial safety, and com­bin­ing nat­u­ral lan­guage with deep RL

rohinmshah
10 Jun 2020 17:20 UTC
26 points
1 comment, 10 min read, LW link
(mailchi.mp)

[AN #104]: The per­ils of in­ac­cessible in­for­ma­tion, and what we can learn about AI al­ign­ment from COVID

rohinmshah
18 Jun 2020 17:10 UTC
19 points
5 comments, 8 min read, LW link
(mailchi.mp)

[AN #105]: The eco­nomic tra­jec­tory of hu­man­ity, and what we might mean by optimization

rohinmshah
24 Jun 2020 17:30 UTC
24 points
3 comments, 11 min read, LW link
(mailchi.mp)

[AN #106]: Eval­u­at­ing gen­er­al­iza­tion abil­ity of learned re­ward models

rohinmshah
1 Jul 2020 17:20 UTC
14 points
2 comments, 11 min read, LW link
(mailchi.mp)

[AN #107]: The con­ver­gent in­stru­men­tal sub­goals of goal-di­rected agents

rohinmshah
16 Jul 2020 6:47 UTC
13 points
1 comment, 8 min read, LW link
(mailchi.mp)

[AN #108]: Why we should scru­ti­nize ar­gu­ments for AI risk

rohinmshah
16 Jul 2020 6:47 UTC
19 points
6 comments, 12 min read, LW link
(mailchi.mp)

[AN #109]: Teach­ing neu­ral nets to gen­er­al­ize the way hu­mans would

rohinmshah
22 Jul 2020 17:10 UTC
17 points
3 comments, 9 min read, LW link
(mailchi.mp)

[AN #110]: Learn­ing fea­tures from hu­man feed­back to en­able re­ward learning

rohinmshah
29 Jul 2020 17:20 UTC
13 points
2 comments, 10 min read, LW link
(mailchi.mp)

Rus­sian x-risks newslet­ter, sum­mer 2019

avturchin
7 Sep 2019 9:50 UTC
41 points
5 comments, 4 min read, LW link

Ra­tional Feed: Last Month’s Best Posts

deluks917
2 May 2018 18:19 UTC
43 points
0 comments, 2 min read, LW link

Fore­cast­ing Newslet­ter: July 2020.

NunoSempere
1 Aug 2020 17:08 UTC
22 points
4 comments, 22 min read, LW link

June gw­ern.net newsletter

gwern
4 Jul 2018 22:59 UTC
36 points
0 comments, 1 min read, LW link
(www.gwern.net)

[AN #111]: The Cir­cuits hy­pothe­ses for deep learning

rohinmshah
5 Aug 2020 17:40 UTC
23 points
0 comments, 9 min read, LW link
(mailchi.mp)

Call for con­trib­u­tors to the Align­ment Newsletter

rohinmshah
21 Aug 2019 18:21 UTC
39 points
0 comments, 4 min read, LW link

Novem­ber 2018 gw­ern.net newsletter

gwern
1 Dec 2018 13:57 UTC
35 points
0 comments, 1 min read, LW link
(www.gwern.net)

June 2019 gw­ern.net newsletter

gwern
1 Jul 2019 14:35 UTC
30 points
0 comments, 1 min read, LW link
(www.gwern.net)