LessWrong Archive: June 2022
AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · Jun 5, 2022, 10:05 PM · 940 points · 708 comments · 30 min read · 3 reviews
Where I agree and disagree with Eliezer · paulfchristiano · Jun 19, 2022, 7:15 PM · 900 points · 223 comments · 18 min read · 2 reviews
It’s Probably Not Lithium · Natália · Jun 28, 2022, 9:24 PM · 442 points · 187 comments · 28 min read · 1 review
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment · elspood · Jun 21, 2022, 11:55 PM · 362 points · 42 comments · 7 min read · 1 review
What Are You Tracking In Your Head? · johnswentworth · Jun 28, 2022, 7:30 PM · 289 points · 83 comments · 4 min read · 1 review
A central AI alignment problem: capabilities generalization, and the sharp left turn · So8res · Jun 15, 2022, 1:10 PM · 272 points · 55 comments · 10 min read · 1 review
Humans are very reliable agents · alyssavance · Jun 16, 2022, 10:02 PM · 269 points · 35 comments · 3 min read
Comment reply: my low-quality thoughts on why CFAR didn’t get farther with a “real/efficacious art of rationality” · AnnaSalamon · Jun 9, 2022, 2:12 AM · 263 points · 63 comments · 17 min read · 1 review
Slow motion videos as AI risk intuition pumps · Andrew_Critch · Jun 14, 2022, 7:31 PM · 241 points · 41 comments · 2 min read · 1 review
Contra Hofstadter on GPT-3 Nonsense · rictic · Jun 15, 2022, 9:53 PM · 237 points · 24 comments · 2 min read
AGI Safety FAQ / all-dumb-questions-allowed thread · Aryeh Englander · Jun 7, 2022, 5:47 AM · 227 points · 526 comments · 4 min read
The prototypical catastrophic AI action is getting root access to its datacenter · Buck · Jun 2, 2022, 11:46 PM · 180 points · 13 comments · 2 min read · 1 review
The inordinately slow spread of good AGI conversations in ML · Rob Bensinger · Jun 21, 2022, 4:09 PM · 173 points · 62 comments · 8 min read
Announcing the Inverse Scaling Prize ($250k Prize Pool) · Ethan Perez, Ian McKenzie and Sam Bowman · Jun 27, 2022, 3:58 PM · 171 points · 14 comments · 7 min read
AI Could Defeat All Of Us Combined · HoldenKarnofsky · Jun 9, 2022, 3:50 PM · 170 points · 42 comments · 17 min read · (www.cold-takes.com)
On A List of Lethalities · Zvi · Jun 13, 2022, 12:30 PM · 165 points · 50 comments · 54 min read · 1 review · (thezvi.wordpress.com)
A transparency and interpretability tech tree · evhub · Jun 16, 2022, 11:44 PM · 163 points · 11 comments · 18 min read · 1 review
Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · johnswentworth · Jun 4, 2022, 5:41 AM · 160 points · 55 comments · 2 min read · 1 review
Godzilla Strategies · johnswentworth · Jun 11, 2022, 3:44 PM · 159 points · 72 comments · 3 min read
Why all the fuss about recursive self-improvement? · So8res · Jun 12, 2022, 8:53 PM · 158 points · 62 comments · 7 min read · 1 review
Limits to Legibility · Jan_Kulveit · Jun 29, 2022, 5:42 PM · 157 points · 11 comments · 5 min read · 1 review
Nonprofit Boards are Weird · HoldenKarnofsky · Jun 23, 2022, 2:40 PM · 156 points · 26 comments · 20 min read · 1 review · (www.cold-takes.com)
LessWrong Has Agree/Disagree Voting On All New Comment Threads · Ben Pace · Jun 24, 2022, 12:43 AM · 154 points · 219 comments · 2 min read · 1 review
Staying Split: Sabatini and Social Justice · Duncan Sabien (Inactive) · Jun 8, 2022, 8:32 AM · 153 points · 28 comments · 21 min read
Steam · abramdemski · Jun 20, 2022, 5:38 PM · 149 points · 13 comments · 5 min read · 1 review
[Question] why assume AGIs will optimize for fixed goals? · nostalgebraist · Jun 10, 2022, 1:28 AM · 147 points · 60 comments · 4 min read · 2 reviews
Public beliefs vs. Private beliefs · Eli Tyre · Jun 1, 2022, 9:33 PM · 146 points · 30 comments · 5 min read
A descriptive, not prescriptive, overview of current AI Alignment Research · Jan, Logan Riggs, jacquesthibs and janus · Jun 6, 2022, 9:59 PM · 139 points · 21 comments · 7 min read
Contra EY: Can AGI destroy us without trial & error? · nsokolsky · Jun 13, 2022, 6:26 PM · 137 points · 72 comments · 15 min read
Announcing the LessWrong Curated Podcast · Ben Pace and Solenoid_Entity · Jun 22, 2022, 10:16 PM · 137 points · 27 comments · 1 min read
AI-Written Critiques Help Humans Notice Flaws · paulfchristiano · Jun 25, 2022, 5:22 PM · 137 points · 5 comments · 3 min read · (openai.com)
Will Capabilities Generalise More? · Ramana Kumar · Jun 29, 2022, 5:12 PM · 133 points · 39 comments · 4 min read
Confused why a “capabilities research is good for alignment progress” position isn’t discussed more · Kaj_Sotala · Jun 2, 2022, 9:41 PM · 130 points · 27 comments · 4 min read
Intergenerational trauma impeding cooperative existential safety efforts · Andrew_Critch · Jun 3, 2022, 8:13 AM · 129 points · 29 comments · 3 min read
“Pivotal Acts” means something specific · Raemon · Jun 7, 2022, 9:56 PM · 127 points · 23 comments · 2 min read
Let’s See You Write That Corrigibility Tag · Eliezer Yudkowsky · Jun 19, 2022, 9:11 PM · 125 points · 70 comments · 1 min read
Scott Aaronson is joining OpenAI to work on AI safety · peterbarnett · Jun 18, 2022, 4:06 AM · 117 points · 31 comments · 1 min read · (scottaaronson.blog)
CFAR Handbook: Introduction · CFAR!Duncan · Jun 28, 2022, 4:53 PM · 116 points · 12 comments · 1 min read
Leaving Google, Joining the Nucleic Acid Observatory · jefftk · Jun 10, 2022, 5:00 PM · 114 points · 4 comments · 3 min read · (www.jefftk.com)
Conversation with Eliezer: What do you want the system to do? · Orpheus16 · Jun 25, 2022, 5:36 PM · 114 points · 38 comments · 2 min read
Who models the models that model models? An exploration of GPT-3’s in-context model fitting ability · Lovre · Jun 7, 2022, 7:37 PM · 112 points · 16 comments · 9 min read
Relationship Advice Repository · Ruby · Jun 20, 2022, 2:39 PM · 109 points · 36 comments · 38 min read
wrapper-minds are the enemy · nostalgebraist · Jun 17, 2022, 1:58 AM · 104 points · 43 comments · 8 min read
Yes, AI research will be substantially curtailed if a lab causes a major disaster · lc · Jun 14, 2022, 10:17 PM · 103 points · 31 comments · 2 min read
The Mountain Troll · lsusr · Jun 11, 2022, 9:14 AM · 103 points · 26 comments · 2 min read
Units of Exchange · CFAR!Duncan · Jun 28, 2022, 4:53 PM · 99 points · 28 comments · 11 min read
Pivotal outcomes and pivotal processes · Andrew_Critch · Jun 17, 2022, 11:43 PM · 97 points · 31 comments · 4 min read
Announcing Epoch: A research organization investigating the road to Transformative AI · Jsevillamol, Pablo Villalobos, Tamay, lennart, Marius Hobbhahn and anson.ho · Jun 27, 2022, 1:55 PM · 97 points · 2 comments · 2 min read · (epochai.org)
My current take on Internal Family Systems “parts” · Kaj_Sotala · Jun 26, 2022, 5:40 PM UTC · 96 points · 11 comments · 3 min read · (kajsotala.fi)
Contest: An Alien Message · DaemonicSigil · Jun 27, 2022, 5:54 AM UTC · 95 points · 100 comments · 1 min read