A case for courage, when speaking of AI danger

So8res

27 Jun 2025 2:15 UTC

532 points

130 comments, 6 min read, LW link

New Endorsements for “If Anyone Builds It, Everyone Dies”

Malo

18 Jun 2025 16:30 UTC

488 points

55 comments, 4 min read, LW link

(intelligence.org)

the void

nostalgebraist

11 Jun 2025 3:19 UTC

427 points

108 comments, 1 min read, LW link

(nostalgebraist.tumblr.com)

A deep critique of AI 2027’s bad timeline models

titotal

19 Jun 2025 13:29 UTC

378 points

40 comments, 39 min read, LW link

(titotal.substack.com)

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)

LawrenceC

11 Jun 2025 19:27 UTC

318 points

19 comments, 16 min read, LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes

23 Jun 2025 17:18 UTC

303 points

125 comments, 30 min read, LW link

Distillation Robustifies Unlearning

Bruce W. Lee, Addie Foote, alexinf, leni, Jacob G-W, Harish Kamath, Bryce Woodworth, cloud and TurnTrout

13 Jun 2025 13:45 UTC

239 points

43 comments, 8 min read, LW link

(arxiv.org)

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit

13 Jun 2025 12:45 UTC

236 points

27 comments, 5 min read, LW link

(boundedlyrational.substack.com)

Intelligence Is Not Magic, But Your Threshold For “Magic” Is Pretty Low

Expertium

15 Jun 2025 15:23 UTC

227 points

27 comments, 1 min read, LW link

“Flaky breakthroughs” pervade inner work — but almost no one tracks them

Chris Lakin

4 Jun 2025 19:02 UTC

218 points

45 comments, 2 min read, LW link

(chrislakin.blog)

Mech interp is not pre-paradigmatic

Lee Sharkey

10 Jun 2025 13:39 UTC

213 points

15 comments, 13 min read, LW link

AI companies’ eval reports mostly don’t support their claims

Zach Stein-Perlman

9 Jun 2025 13:00 UTC

212 points

13 comments, 4 min read, LW link

The Value Proposition of Romantic Relationships

johnswentworth

2 Jun 2025 13:51 UTC

210 points

43 comments, 13 min read, LW link

Consider chilling out in 2028

Valentine

21 Jun 2025 17:07 UTC

208 points

144 comments, 13 min read, LW link

Futarchy’s fundamental flaw

dynomight

13 Jun 2025 22:08 UTC

186 points

52 comments, 9 min read, LW link

(dynomight.net)

My pitch for the AI Village

Daniel Kokotajlo

24 Jun 2025 15:00 UTC

183 points

35 comments, 5 min read, LW link

Read the Pricing First

Max Niederman

10 Jun 2025 2:22 UTC

178 points

14 comments, 1 min read, LW link

Foom & Doom 2: Technical alignment is hard

Steven Byrnes

23 Jun 2025 17:19 UTC

175 points

68 comments, 28 min read, LW link

Estrogen: A trip report

cube_flipper

15 Jun 2025 13:15 UTC

167 points

42 comments, 27 min read, LW link

(smoothbrains.net)

Endometriosis is an incredibly interesting disease

Abhishaike Mahajan

14 Jun 2025 22:14 UTC

167 points

5 comments, 16 min read, LW link

(www.owlposting.com)

X explains Z% of the variance in Y

Leon Lang

20 Jun 2025 12:17 UTC

160 points

36 comments, 9 min read, LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck

23 Jun 2025 17:47 UTC

157 points

22 comments, 3 min read, LW link

Broad-Spectrum Cancer Treatments

sarahconstantin

3 Jun 2025 19:40 UTC

150 points

10 comments, 7 min read, LW link

(sarahconstantin.substack.com)

Making deals with early schemers

Julian Stastny, Olli Järviniemi and Buck

20 Jun 2025 18:21 UTC

133 points

42 comments, 15 min read, LW link

The Industrial Explosion

rosehadshar and Tom Davidson

26 Jun 2025 14:41 UTC

131 points

70 comments, 15 min read, LW link

(www.forethought.org)

Model Organisms for Emergent Misalignment

Anna Soligo, Edward Turner, Mia Taylor, Senthooran Rajamanoharan and Neel Nanda

16 Jun 2025 15:46 UTC

120 points

19 comments, 5 min read, LW link

Proposal for making credible commitments to AIs.

Cleo Nardo

27 Jun 2025 19:43 UTC

112 points

45 comments, 2 min read, LW link

METR’s Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo

9 Jun 2025 18:03 UTC

100 points

9 comments, 11 min read, LW link

(metr.org)

RTFB: The RAISE Act

Zvi

16 Jun 2025 12:50 UTC

99 points

8 comments, 8 min read, LW link

(thezvi.wordpress.com)

The Mirror Trap

Cameron Berg

6 Jun 2025 22:30 UTC

94 points

13 comments, 4 min read, LW link

“It isn’t magic”

Ben (Berlin)

23 Jun 2025 14:00 UTC

92 points

17 comments, 2 min read, LW link

Prover-Estimator Debate: A New Scalable Oversight Protocol

Jonah Brown-Cohen and Geoffrey Irving

17 Jun 2025 13:53 UTC

89 points

19 comments, 5 min read, LW link

On working 80%

adrische

7 Jun 2025 17:58 UTC

87 points

7 comments, 3 min read, LW link

(github.com)

Maybe Social Anxiety Is Just You Failing At Mind Control

25Hour

11 Jun 2025 23:49 UTC

85 points

21 comments, 16 min read, LW link

Why we’re still doing normal school

juliawise

14 Jun 2025 12:40 UTC

85 points

0 comments, 3 min read, LW link

Genomic emancipation

TsviBT

21 Jun 2025 8:15 UTC

83 points

14 comments, 26 min read, LW link

Help the AI 2027 team make an online AGI wargame

Jonas V

27 Jun 2025 1:02 UTC

82 points

10 comments, 1 min read, LW link

Situational Awareness: A One-Year Retrospective

Nathan Delisle

23 Jun 2025 19:15 UTC

82 points

5 comments, 12 min read, LW link

Some reprogenetics-related projects you could help with

TsviBT

15 Jun 2025 20:25 UTC

80 points

1 comment, 4 min read, LW link

When does training a model change its goals?

Vivek Hebbar and ryan_greenblatt

12 Jun 2025 18:43 UTC

79 points

3 comments, 15 min read, LW link

Unfaithful Reasoning Can Fool Chain-of-Thought Monitoring

Benjamin Arnav, Pablo Bernabeu-Pérez, Tim Kostolansky, HanneWhitt, Nathan Helm-Burger and Mary Phuong

2 Jun 2025 19:08 UTC

78 points

17 comments, 3 min read, LW link

Ghiblification for Privacy

jefftk

10 Jun 2025 0:30 UTC

78 points

47 comments, 1 min read, LW link

(www.jefftk.com)

Convergent Linear Representations of Emergent Misalignment

Anna Soligo, Edward Turner, Senthooran Rajamanoharan and Neel Nanda

16 Jun 2025 15:47 UTC

77 points

1 comment, 8 min read, LW link

Agentic Misalignment: How LLMs Could be Insider Threats

Aengus Lynch, Benjamin Wright, Ethan Perez and evhub

20 Jun 2025 22:34 UTC

77 points

13 comments, 6 min read, LW link

Busking with Kids

jefftk

9 Jun 2025 0:30 UTC

76 points

0 comments, 1 min read, LW link

(www.jefftk.com)

Analyzing A Critique Of The AI 2027 Timeline Forecasts

Zvi

24 Jun 2025 18:50 UTC

76 points

38 comments, 30 min read, LW link

(thezvi.wordpress.com)

Jankily controlling superintelligence

ryan_greenblatt

27 Jun 2025 14:05 UTC

70 points

4 comments, 7 min read, LW link

Thought Crime: Backdoors & Emergent Misalignment in Reasoning Models

James Chua and Owain_Evans

16 Jun 2025 16:43 UTC

69 points

2 comments, 8 min read, LW link

Why “training against scheming” is hard

Marius Hobbhahn

24 Jun 2025 19:08 UTC

66 points

2 comments, 12 min read, LW link

Musings on AI Companies of 2025-2026 (Jun 2025)

Vladimir_Nesov

20 Jun 2025 17:14 UTC

66 points

4 comments, 3 min read, LW link