What failure looks like

paulfchristiano · Mar 17, 2019, 8:18 PM
434 points
55 comments · 8 min read · 2 reviews

Alignment Research Field Guide

abramdemski · Mar 8, 2019, 7:57 PM
278 points
9 comments · 17 min read · 2 reviews

You Get About Five Words

Raemon · Mar 12, 2019, 8:30 PM
247 points
81 comments · 1 min read · 6 reviews

Rest Days vs Recovery Days

Unreal · Mar 19, 2019, 10:37 PM
222 points
36 comments · 6 min read · 1 review

Personalized Medicine For Real

sarahconstantin · Mar 4, 2019, 10:40 PM
214 points
16 comments · 5 min read
(srconstantin.wordpress.com)

Subagents, akrasia, and coherence in humans

Kaj_Sotala · Mar 25, 2019, 2:24 PM
141 points
31 comments · 16 min read

The Amish, and Strategic Norms around Technology

Raemon · Mar 24, 2019, 10:16 PM
138 points
18 comments · 3 min read · 2 reviews

The Main Sources of AI Risk?

Mar 21, 2019, 6:28 PM
126 points
26 comments · 2 min read

Subagents, introspective awareness, and blending

Kaj_Sotala · Mar 2, 2019, 12:53 PM
110 points
19 comments · 9 min read

Karma-Change Notifications

jimrandomh · Mar 2, 2019, 2:52 AM
92 points
44 comments · 1 min read

What I’ve Learned From My Parents’ Arranged Marriage

squidious · Mar 26, 2019, 6:40 AM
91 points
16 comments · 5 min read
(opalsandbonobos.blogspot.com)

mAIry’s room: AI reasoning to solve philosophical problems

Stuart_Armstrong · Mar 5, 2019, 8:24 PM
87 points
41 comments · 6 min read · 2 reviews

Plans are Recursive & Why This is Important

Ruby · Mar 10, 2019, 1:58 AM
81 points
11 comments · 10 min read

Privacy

Zvi · Mar 15, 2019, 8:20 PM
79 points
78 comments · 6 min read
(thezvi.wordpress.com)

Comparison of decision theories (with a focus on logical-counterfactual decision theories)

riceissa · Mar 16, 2019, 9:15 PM
78 points
11 comments · 10 min read

Active Curiosity vs Open Curiosity

Unreal · Mar 15, 2019, 4:54 PM
76 points
24 comments · 3 min read

Dependability

Unreal · Mar 26, 2019, 10:49 PM
75 points
39 comments · 8 min read

In My Culture

Duncan Sabien (Deactivated) · Mar 7, 2019, 7:22 AM
68 points
59 comments · 24 min read · 2 reviews
(medium.com)

Three ways that “Sufficiently optimized agents appear coherent” can be false

Wei Dai · Mar 5, 2019, 9:52 PM
65 points
3 comments · 3 min read

Boeing 737 MAX MCAS as an agent corrigibility failure

Shmi · Mar 16, 2019, 1:46 AM
60 points
3 comments · 1 min read

Declarative Mathematics

johnswentworth · Mar 21, 2019, 7:05 PM
59 points
10 comments · 3 min read

How to Understand and Mitigate Risk

Matt Goldenberg · Mar 12, 2019, 10:14 AM
55 points
30 comments · 16 min read

Do you like bullet points?

Raemon · Mar 26, 2019, 4:30 AM
52 points
38 comments · 2 min read

Motivation: You Have to Win in the Moment

Ruby · Mar 1, 2019, 12:26 AM
50 points
20 comments · 6 min read

[Question] Understanding information cascades

Mar 13, 2019, 10:55 AM
50 points
42 comments · 3 min read

[Question] How much funding and researchers were in AI, and AI Safety, in 2018?

Raemon · Mar 3, 2019, 9:46 PM
41 points
11 comments · 1 min read

Renaming “Frontpage”

Raemon · Mar 9, 2019, 1:23 AM
41 points
16 comments · 4 min read

[Fiction] IO.SYS

DataPacRat · Mar 10, 2019, 9:23 PM
40 points
4 comments · 22 min read

Parfit’s Escape (Filk)

Gordon Seidoh Worley · Mar 29, 2019, 2:31 AM
39 points
0 comments · 1 min read

‘This Waifu Does Not Exist’: 100,000 StyleGAN & GPT-2 samples

gwern · Mar 1, 2019, 4:29 AM
39 points
6 comments
(www.thiswaifudoesnotexist.net)

Please use real names, especially for Alignment Forum?

Wei Dai · Mar 29, 2019, 2:54 AM
39 points
14 comments · 1 min read

[Question] What would you need to be motivated to answer “hard” LW questions?

Raemon · Mar 28, 2019, 8:07 PM
38 points
37 comments · 3 min read

Some thoughts after reading Artificial Intelligence: A Modern Approach

swift_spiral · Mar 19, 2019, 11:39 PM
38 points
4 comments · 2 min read

[Question] Did the recent blackmail discussion change your beliefs?

Dagon · Mar 24, 2019, 4:06 PM
36 points
7 comments · 1 min read

[Question] What’s wrong with these analogies for understanding Informed Oversight and IDA?

Wei Dai · Mar 20, 2019, 9:11 AM
35 points
3 comments · 1 min read

How dangerous is it to ride a bicycle without a helmet?

habryka · Mar 9, 2019, 2:58 AM
34 points
30 comments · 4 min read

Simplified preferences needed; simplified preferences sufficient

Stuart_Armstrong · Mar 5, 2019, 7:39 PM
33 points
6 comments · 3 min read

[Question] What societies have ever had legal or accepted blackmail?

clone of saturn · Mar 17, 2019, 9:16 AM
33 points
23 comments · 1 min read

Has “politics is the mind-killer” been a mind-killer?

SonnieBailey · Mar 17, 2019, 3:05 AM
31 points
26 comments · 3 min read

[Question] What are CAIS’ boldest near/medium-term predictions?

Bird Concept · Mar 28, 2019, 1:14 PM
31 points
17 comments · 1 min read

Insights from Munkres’ Topology

Rafael Harth · Mar 17, 2019, 4:52 PM
30 points
0 comments · 14 min read

Finding the variables

Stuart_Armstrong · Mar 4, 2019, 7:37 PM
30 points
1 comment · 4 min read

Designing agent incentives to avoid side effects

Mar 11, 2019, 8:55 PM
29 points
0 comments · 2 min read
(medium.com)

Alignment Newsletter #48

Rohin Shah · Mar 11, 2019, 9:10 PM
29 points
14 comments · 9 min read
(mailchi.mp)

[Question] Willing to share some words that changed your beliefs/behavior?

Duncan Sabien (Deactivated) · Mar 23, 2019, 2:08 AM
28 points
4 comments · 1 min read

Book review: My Hidden Chimp

Bucky · Mar 4, 2019, 9:55 AM
28 points
0 comments · 8 min read

AI Safety Prerequisites Course: Basic abstract representations of computation

RAISE · Mar 13, 2019, 7:38 PM
28 points
2 comments · 1 min read

A theory of human values

Stuart_Armstrong · Mar 13, 2019, 3:22 PM
28 points
13 comments · 7 min read

A cognitive intervention for wrist pain

rmoehn · Mar 17, 2019, 5:26 AM
28 points
24 comments · 6 min read

Hu­mans aren’t agents—what then for value learn­ing?

Charlie Steiner · Mar 15, 2019, 10:01 PM
28 points
16 comments · 3 min read