What failure looks like

paulfchristiano · 17 Mar 2019 20:18 UTC

466 points

56 comments · 8 min read · LW link · 2 reviews

Alignment Research Field Guide

abramdemski · 8 Mar 2019 19:57 UTC

290 points

11 comments · 17 min read · LW link · 2 reviews

You Get About Five Words

Raemon · 12 Mar 2019 20:30 UTC

286 points

82 comments · 1 min read · LW link · 6 reviews

Rest Days vs Recovery Days

Unreal · 19 Mar 2019 22:37 UTC

243 points

36 comments · 6 min read · LW link · 1 review

Personalized Medicine For Real

sarahconstantin · 4 Mar 2019 22:40 UTC

218 points

16 comments · 5 min read · LW link

(srconstantin.wordpress.com)

Subagents, akrasia, and coherence in humans

Kaj_Sotala · 25 Mar 2019 14:24 UTC

143 points

31 comments · 16 min read · LW link

The Amish, and Strategic Norms around Technology

Raemon · 24 Mar 2019 22:16 UTC

142 points

18 comments · 3 min read · LW link · 2 reviews

The Main Sources of AI Risk?

Daniel Kokotajlo and Wei Dai

21 Mar 2019 18:28 UTC

136 points

29 comments · 2 min read · LW link

Subagents, introspective awareness, and blending

Kaj_Sotala · 2 Mar 2019 12:53 UTC

114 points

19 comments · 9 min read · LW link

What I’ve Learned From My Parents’ Arranged Marriage

squidious · 26 Mar 2019 6:40 UTC

100 points

16 comments · 5 min read · LW link

(opalsandbonobos.blogspot.com)

Karma-Change Notifications

jimrandomh · 2 Mar 2019 2:52 UTC

92 points

44 comments · 1 min read · LW link

mAIry’s room: AI reasoning to solve philosophical problems

Stuart_Armstrong · 5 Mar 2019 20:24 UTC

87 points

41 comments · 6 min read · LW link · 2 reviews

Plans are Recursive & Why This is Important

Ruby · 10 Mar 2019 1:58 UTC

86 points

11 comments · 10 min read · LW link

Comparison of decision theories (with a focus on logical-counterfactual decision theories)

riceissa · 16 Mar 2019 21:15 UTC

82 points

20 comments · 10 min read · LW link

Privacy

Zvi · 15 Mar 2019 20:20 UTC

79 points

78 comments · 6 min read · LW link

(thezvi.wordpress.com)

In My Culture

Duncan Sabien (Inactive) · 7 Mar 2019 7:22 UTC

78 points

60 comments · 24 min read · LW link · 2 reviews

(medium.com)

Active Curiosity vs Open Curiosity

Unreal · 15 Mar 2019 16:54 UTC

76 points

24 comments · 3 min read · LW link

Dependability

Unreal · 26 Mar 2019 22:49 UTC

75 points

39 comments · 8 min read · LW link

Three ways that “Sufficiently optimized agents appear coherent” can be false

Wei Dai · 5 Mar 2019 21:52 UTC

65 points

3 comments · 3 min read · LW link

Boeing 737 MAX MCAS as an agent corrigibility failure

Shmi · 16 Mar 2019 1:46 UTC

60 points

3 comments · 1 min read · LW link

Declarative Mathematics

johnswentworth · 21 Mar 2019 19:05 UTC

59 points

10 comments · 3 min read · LW link

How to Understand and Mitigate Risk

Trinley Goldenberg · 12 Mar 2019 10:14 UTC

55 points

30 comments · 16 min read · LW link

Do you like bullet points?

Raemon · 26 Mar 2019 4:30 UTC

52 points

38 comments · 2 min read · LW link

Motivation: You Have to Win in the Moment

Ruby · 1 Mar 2019 0:26 UTC

50 points

20 comments · 6 min read · LW link

[Question] Understanding information cascades

Bird Concept and Ben Pace

13 Mar 2019 10:55 UTC

50 points

42 comments · 3 min read · LW link

Renaming “Frontpage”

Raemon · 9 Mar 2019 1:23 UTC

42 points

16 comments · 4 min read · LW link

Parfit’s Escape (Filk)

Gordon Seidoh Worley · 29 Mar 2019 2:31 UTC

41 points

1 comment · 1 min read · LW link

[Question] How much funding and researchers were in AI, and AI Safety, in 2018?

Raemon · 3 Mar 2019 21:46 UTC

41 points

11 comments · 1 min read · LW link

[Fiction] IO.SYS

DataPacRat · 10 Mar 2019 21:23 UTC

40 points

4 comments · 22 min read · LW link

‘This Waifu Does Not Exist’: 100,000 StyleGAN & GPT-2 samples

gwern · 1 Mar 2019 4:29 UTC

39 points

6 comments · 1 min read · LW link

(www.thiswaifudoesnotexist.net)

Please use real names, especially for Alignment Forum?

Wei Dai · 29 Mar 2019 2:54 UTC

39 points

14 comments · 1 min read · LW link

[Question] What would you need to be motivated to answer “hard” LW questions?

Raemon · 28 Mar 2019 20:07 UTC

38 points

37 comments · 3 min read · LW link

Some thoughts after reading Artificial Intelligence: A Modern Approach

swift_spiral · 19 Mar 2019 23:39 UTC

38 points

4 comments · 2 min read · LW link

[Question] Did the recent blackmail discussion change your beliefs?

Dagon · 24 Mar 2019 16:06 UTC

36 points

7 comments · 1 min read · LW link

[Question] What’s wrong with these analogies for understanding Informed Oversight and IDA?

Wei Dai · 20 Mar 2019 9:11 UTC

35 points

3 comments · 1 min read · LW link

How dangerous is it to ride a bicycle without a helmet?

habryka · 9 Mar 2019 2:58 UTC

34 points

30 comments · 4 min read · LW link

Simplified preferences needed; simplified preferences sufficient

Stuart_Armstrong · 5 Mar 2019 19:39 UTC

33 points

6 comments · 3 min read · LW link

[Question] What societies have ever had legal or accepted blackmail?

clone of saturn · 17 Mar 2019 9:16 UTC

33 points

23 comments · 1 min read · LW link

Insights from Munkres’ Topology

Rafael Harth · 17 Mar 2019 16:52 UTC

31 points

0 comments · 14 min read · LW link

Has “politics is the mind-killer” been a mind-killer?

SonnieBailey · 17 Mar 2019 3:05 UTC

31 points

26 comments · 3 min read · LW link

A cognitive intervention for wrist pain

rmoehn · 17 Mar 2019 5:26 UTC

31 points

24 comments · 6 min read · LW link

[Question] What are CAIS’ boldest near/medium-term predictions?

Bird Concept · 28 Mar 2019 13:14 UTC

31 points

17 comments · 1 min read · LW link

Finding the variables

Stuart_Armstrong · 4 Mar 2019 19:37 UTC

30 points

1 comment · 4 min read · LW link

Designing agent incentives to avoid side effects

Vika and TurnTrout

11 Mar 2019 20:55 UTC

29 points

0 comments · 2 min read · LW link

(medium.com)

Alignment Newsletter #48

Rohin Shah · 11 Mar 2019 21:10 UTC

29 points

14 comments · 9 min read · LW link

(mailchi.mp)

[Question] Willing to share some words that changed your beliefs/behavior?

Duncan Sabien (Inactive) · 23 Mar 2019 2:08 UTC

28 points

4 comments · 1 min read · LW link

Book review: My Hidden Chimp

Bucky · 4 Mar 2019 9:55 UTC

28 points

0 comments · 8 min read · LW link

AI Safety Prerequisites Course: Basic abstract representations of computation

RAISE · 13 Mar 2019 19:38 UTC

28 points

2 comments · 1 min read · LW link

A theory of human values

Stuart_Armstrong · 13 Mar 2019 15:22 UTC

28 points

13 comments · 7 min read · LW link

Humans aren’t agents—what then for value learning?

Charlie Steiner · 15 Mar 2019 22:01 UTC

28 points

16 comments · 3 min read · LW link