My journey to the microwave alternate timeline

Malmesbury

10 Feb 2026 17:59 UTC

788 points

58 comments 10 min read LW link

Here’s to the Polypropylene Makers

jefftk

27 Feb 2026 4:00 UTC

564 points

19 comments 2 min read LW link

(www.jefftk.com)

Did Claude 3 Opus align itself via gradient hacking?

Fiora Starlight

21 Feb 2026 22:24 UTC

394 points

49 comments 20 min read LW link

Life at the Frontlines of Demographic Collapse

Martin Sustrik

14 Feb 2026 6:30 UTC

290 points

52 comments 8 min read LW link

(www.250bpm.com)

Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)

RobertM

4 Feb 2026 6:30 UTC

288 points

28 comments 6 min read LW link

models have some pretty funny attractor states

aryaj, Senthooran Rajamanoharan and Neel Nanda

12 Feb 2026 21:14 UTC

277 points

38 comments 18 min read LW link

Why You Don’t Believe in Xhosa Prophecies

Jan_Kulveit

13 Feb 2026 2:25 UTC

270 points

28 comments 4 min read LW link

Gyre

vgel

17 Feb 2026 0:38 UTC

264 points

24 comments 8 min read LW link

(vgel.me)

Post-AGI Economics As If Nothing Ever Happens

Jan_Kulveit

4 Feb 2026 17:39 UTC

258 points

43 comments 8 min read LW link

(boundedlyrational.substack.com)

The Spectre haunting the “AI Safety” Community

Gabriel Alfour

21 Feb 2026 11:14 UTC

236 points

28 comments 6 min read LW link

(cognition.cafe)

Open sourcing a browser extension that shows when people are wrong on the internet

lc

24 Feb 2026 16:36 UTC

228 points

34 comments 2 min read LW link

(github.com)

Near-Instantly Aborting the Worst Pain Imaginable with Psychedelics

eleweek

7 Feb 2026 16:11 UTC

218 points

13 comments 13 min read LW link

(psychotechnology.substack.com)

The World Keeps Getting Saved and You Don’t Notice

Bogoed

16 Feb 2026 1:01 UTC

216 points

21 comments 2 min read LW link

The optimal age to freeze eggs is 19

GeneSmith

8 Feb 2026 9:44 UTC

196 points

49 comments 6 min read LW link

The persona selection model

Sam Marks

23 Feb 2026 22:56 UTC

183 points

54 comments 43 min read LW link

(alignment.anthropic.com)

Responsible Scaling Policy v3

HoldenKarnofsky

24 Feb 2026 20:20 UTC

179 points

83 comments 36 min read LW link

Persona Parasitology

Raymond Douglas

16 Feb 2026 16:22 UTC

177 points

38 comments 11 min read LW link

What We Learned from Briefing 140+ Lawmakers on the Threat from AI

leticiagarcia

12 Feb 2026 19:53 UTC

174 points

7 comments 14 min read LW link

(substack.com)

You’re an AI Expert – Not an Influencer

Max Winga

17 Feb 2026 21:03 UTC

171 points

25 comments 6 min read LW link

(maxwinga.substack.com)

Stone Age Billionaire Can’t Words Good

Eneasz

9 Feb 2026 18:51 UTC

170 points

95 comments 12 min read LW link

(deathisbad.substack.com)

Conditional Kickstarter for the “Don’t Build It” March

Raemon

2 Feb 2026 22:58 UTC

165 points

35 comments 4 min read LW link

Are there lessons from high-reliability engineering for AGI safety?

Steven Byrnes

2 Feb 2026 15:26 UTC

162 points

16 comments 8 min read LW link

Why we should expect ruthless sociopath ASI

Steven Byrnes

18 Feb 2026 17:28 UTC

161 points

66 comments 8 min read LW link

Prompt injection in Google Translate reveals base model behaviors behind task-specific fine-tuning

megasilverfist

7 Feb 2026 13:56 UTC

160 points

27 comments 3 min read LW link

Anthropic: “Statement from Dario Amodei on our discussions with the Department of War”

Matrice Jacobine

26 Feb 2026 23:45 UTC

159 points

22 comments 3 min read LW link

(www.anthropic.com)

Weight-Sparse Circuits May Be Interpretable Yet Unfaithful

jacob_drori

9 Feb 2026 23:25 UTC

137 points

5 comments 8 min read LW link

Frontier AI companies probably can’t leave the US

Anders Cairns Woodruff

26 Feb 2026 18:18 UTC

137 points

19 comments 7 min read LW link

(blog.redwoodresearch.org)

On Goal-Models

Richard_Ngo

2 Feb 2026 18:44 UTC

136 points

15 comments 4 min read LW link

Changing the world for the worse

mingyuan

22 Feb 2026 23:55 UTC

132 points

17 comments 3 min read LW link

(mingyuan.substack.com)

Honey, I shrunk the brain

Andy_McKenzie

7 Feb 2026 0:01 UTC

128 points

1 comment 5 min read LW link

(neurobiology.substack.com)

Solemn Courage

aysja

4 Feb 2026 23:09 UTC

128 points

1 comment 6 min read LW link

The nature of LLM algorithmic progress (v2)

Steven Byrnes

5 Feb 2026 19:17 UTC

125 points

28 comments 13 min read LW link

You May Already Be Canadian

jefftk

19 Feb 2026 16:00 UTC

123 points

14 comments 1 min read LW link

(www.jefftk.com)

It Is Reasonable To Research How To Use Model Internals In Training

Neel Nanda

8 Feb 2026 3:44 UTC

122 points

15 comments 4 min read LW link

Irrationality is Socially Strategic

Valentine

18 Feb 2026 13:28 UTC

120 points

18 comments 13 min read LW link

Superintelligence Alignment Seminar (1 month focused upskilling)

Mateusz Bagiński

17 Feb 2026 17:03 UTC

118 points

13 comments 3 min read LW link

Opus 4.6 Reasoning Doesn’t Verbalize Alignment Faking, but Behavior Persists

Daan Henselmans, Arno Libert and LennardZ

9 Feb 2026 12:55 UTC

118 points

13 comments 8 min read LW link

The brain is a machine that runs an algorithm

Steven Byrnes

17 Feb 2026 19:36 UTC

116 points

18 comments 4 min read LW link

Claude Opus 4.6 is Driven

HunterJay

6 Feb 2026 4:15 UTC

113 points

1 comment 5 min read LW link

Smokey, This is not ’Nam Or: [Already] over the [red] line!

Davidmanheim

8 Feb 2026 12:24 UTC

110 points

22 comments 4 min read LW link

The ML ontology and the alignment ontology

Richard_Ngo

24 Feb 2026 4:39 UTC

110 points

9 comments 4 min read LW link

New ARENA material: 8 exercise sets on alignment science & interpretability

CallumMcDougall

27 Feb 2026 17:37 UTC

105 points

1 comment 7 min read LW link

Whack-a-Mole is Not a Winnable Game

Sable

26 Feb 2026 2:40 UTC

102 points

26 comments 18 min read LW link

(affablyevil.substack.com)

Gemini’s Hypothetical Present

jefftk

13 Feb 2026 2:20 UTC

101 points

9 comments 2 min read LW link

(www.jefftk.com)

If you don’t feel deeply confused about AGI risk, something’s wrong

Dave Banerjee

21 Feb 2026 15:34 UTC

99 points

18 comments 5 min read LW link

(open.substack.com)

Long-term risks from ideological fanaticism

David Althaus, Jamie_Harris, Vanessa Sarre, Clare and _will_

12 Feb 2026 23:26 UTC

99 points

12 comments 84 min read LW link

Voting Results for the 2024 Review

RobertM

7 Feb 2026 3:48 UTC

98 points

0 comments 1 min read LW link

Exclusive: Hegseth gives Anthropic until Friday to back down on AI safeguards

Matrice Jacobine

24 Feb 2026 19:19 UTC

95 points

9 comments 3 min read LW link

(www.axios.com)

What secret goals does Claude think it has?

loops

25 Feb 2026 19:22 UTC

94 points

11 comments 4 min read LW link

Aligning to Virtues

Richard_Ngo

16 Feb 2026 0:37 UTC

93 points

36 comments 4 min read LW link