Social Dark Matter

Duncan Sabien (Inactive)

16 Nov 2023 20:00 UTC

388 points

131 comments 34 min read LW link 2 reviews

Shallow review of live agendas in alignment & safety

technicalities and Stag

27 Nov 2023 11:10 UTC

351 points

73 comments 29 min read LW link 1 review

AI Timelines

habryka, Daniel Kokotajlo, Ajeya Cotra and Ege Erdil

10 Nov 2023 5:28 UTC

302 points

144 comments 51 min read LW link 2 reviews

The 101 Space You Will Always Have With You

Screwtape

29 Nov 2023 4:56 UTC

296 points

23 comments 6 min read LW link 1 review

The 6D effect: When companies take risks, one email can be very powerful.

scasper

4 Nov 2023 20:08 UTC

289 points

42 comments 3 min read LW link

OpenAI: The Battle of the Board

Zvi

22 Nov 2023 17:30 UTC

281 points

83 comments 11 min read LW link

(thezvi.wordpress.com)

OpenAI: Facts from a Weekend

Zvi

20 Nov 2023 15:30 UTC

272 points

166 comments 9 min read LW link

(thezvi.wordpress.com)

What are the results of more parental supervision and less outdoor play?

juliawise

25 Nov 2023 12:52 UTC

235 points

31 comments 5 min read LW link

Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense

So8res

24 Nov 2023 17:37 UTC

213 points

85 comments 5 min read LW link 1 review

The other side of the tidal wave

KatjaGrace

3 Nov 2023 5:40 UTC

206 points

88 comments 1 min read LW link

(worldspiritsockpuppet.com)

Thinking By The Clock

Screwtape

8 Nov 2023 7:40 UTC

200 points

29 comments 8 min read LW link 1 review

Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk

1a3orn

2 Nov 2023 18:20 UTC

194 points

79 comments 23 min read LW link

Loudly Give Up, Don’t Quietly Fade

Screwtape

13 Nov 2023 23:30 UTC

194 points

13 comments 6 min read LW link 1 review

Sam Altman fired from OpenAI

LawrenceC

17 Nov 2023 20:42 UTC

192 points

75 comments 1 min read LW link

(openai.com)

You can just spontaneously call people you haven’t met in years

lc

13 Nov 2023 5:21 UTC

183 points

22 comments 1 min read LW link

How to (hopefully ethically) make money off of AGI

habryka, Zvi, Cosmos and NoahK

6 Nov 2023 23:35 UTC

179 points

95 comments 32 min read LW link 1 review

Vote on Interesting Disagreements

Ben Pace

7 Nov 2023 21:35 UTC

159 points

131 comments 1 min read LW link

Moral Reality Check (a short story)

jessicata

26 Nov 2023 5:03 UTC

155 points

45 comments 21 min read LW link 1 review

(unstableontology.com)

Does davidad’s uploading moonshot work?

Bird Concept, lisathiergart, Anders_Sandberg, davidad and Arenamontanus

3 Nov 2023 2:21 UTC

146 points

35 comments 25 min read LW link

My thoughts on the social response to AI risk

Matthew Barnett

1 Nov 2023 21:17 UTC

146 points

37 comments 10 min read LW link

One Day Sooner

Screwtape

2 Nov 2023 19:00 UTC

138 points

8 comments 8 min read LW link 1 review

EA orgs’ legal structure inhibits risk taking and information sharing on the margin

Elizabeth

5 Nov 2023 19:13 UTC

136 points

17 comments 4 min read LW link

Apocalypse insurance, and the hardline libertarian take on AI risk

So8res

28 Nov 2023 2:09 UTC

136 points

40 comments 7 min read LW link 1 review

Integrity in AI Governance and Advocacy

habryka and Olive Branch

3 Nov 2023 19:52 UTC

135 points

57 comments 23 min read LW link

8 examples informing my pessimism on uploading without reverse engineering

Steven Byrnes

3 Nov 2023 20:03 UTC

127 points

12 comments 12 min read LW link

Deception Chess: Game #1

Zane, aphyer, Alex A and AdamYedidia

3 Nov 2023 21:13 UTC

118 points

22 comments 8 min read LW link 1 review

Never Drop A Ball

Screwtape

23 Nov 2023 4:15 UTC

116 points

8 comments 6 min read LW link 1 review

The Soul Key

Richard_Ngo

4 Nov 2023 17:51 UTC

114 points

10 comments 8 min read LW link 1 review

(www.narrativeark.xyz)

How much to update on recent AI governance moves?

habryka and So8res

16 Nov 2023 23:46 UTC

112 points

5 comments 29 min read LW link

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn

15 Nov 2023 15:40 UTC

111 points

4 comments 18 min read LW link

Learning-theoretic agenda reading list

Vanessa Kosoy

9 Nov 2023 17:25 UTC

108 points

1 comment 2 min read LW link 1 review

Stuxnet, not Skynet: Humanity’s disempowerment by AI

Roko

4 Nov 2023 22:23 UTC

107 points

24 comments 6 min read LW link

My techno-optimism [By Vitalik Buterin]

habryka

27 Nov 2023 23:53 UTC

107 points

17 comments 2 min read LW link

(www.lesswrong.com)

New LessWrong feature: Dialogue Matching

Bird Concept

16 Nov 2023 21:27 UTC

107 points

22 comments 3 min read LW link

Picking Mentors For Research Programmes

Raymond Douglas

10 Nov 2023 13:01 UTC

105 points

8 comments 4 min read LW link

Some Rules for an Algebra of Bayes Nets

johnswentworth and David Lorell

16 Nov 2023 23:53 UTC

101 points

48 comments 14 min read LW link 1 review

On the Executive Order

Zvi

1 Nov 2023 14:20 UTC

100 points

4 comments 30 min read LW link

(thezvi.wordpress.com)

Kids or No kids

Kids or no kids

14 Nov 2023 18:37 UTC

100 points

10 comments 13 min read LW link

Coup probes: Catching catastrophes with probes trained off-policy

Fabien Roger

17 Nov 2023 17:58 UTC

95 points

9 comments 11 min read LW link 1 review

Untrusted smart models and trusted dumb models

Buck

4 Nov 2023 3:06 UTC

92 points

17 comments 6 min read LW link 1 review

Growth and Form in a Toy Model of Superposition

Liam Carroll and Edmund Lau

8 Nov 2023 11:08 UTC

92 points

7 comments 14 min read LW link

Saying the quiet part out loud: trading off x-risk for personal immortality

disturbance

2 Nov 2023 17:43 UTC

92 points

89 comments 5 min read LW link

Public Call for Interest in Mathematical Alignment

Davidmanheim

22 Nov 2023 13:22 UTC

90 points

9 comments 1 min read LW link

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM

15 Nov 2023 16:36 UTC

90 points

9 comments 2 min read LW link 1 review

(arxiv.org)

Self-Referential Probabilistic Logic Admits the Payor’s Lemma

yudhister

28 Nov 2023 10:27 UTC

85 points

14 comments 6 min read LW link

Announcing New Beginner-friendly Book on AI Safety and Risk

Darren McKee

25 Nov 2023 15:57 UTC

85 points

3 comments 1 min read LW link

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe Carlsmith

15 Nov 2023 17:16 UTC

83 points

28 comments 30 min read LW link 1 review

Agent Boundaries Aren’t Markov Blankets. [Unless they’re non-causal; see comments.]

abramdemski

20 Nov 2023 18:23 UTC

83 points

11 comments 2 min read LW link

Interpretability with Sparse Autoencoders (Colab exercises)

CallumMcDougall

29 Nov 2023 12:56 UTC

83 points

9 comments 4 min read LW link

My Criticism of Singular Learning Theory

Joar Skalse

19 Nov 2023 15:19 UTC

83 points

56 comments 12 min read LW link