How Go Players Disempower Themselves to AI

Ashe Vazquez Nuñez

1 May 2026 23:24 UTC

692 points

77 comments, 8 min read, LW link

The Owned Ones

Eliezer Yudkowsky

12 May 2026 17:56 UTC

367 points

51 comments, 6 min read, LW link

Irretrievability; or, Murphy’s Curse of Oneshotness upon ASI

Eliezer Yudkowsky

4 May 2026 22:11 UTC

367 points

132 comments, 22 min read, LW link

Women should be able to open things

KatjaGrace

21 May 2026 3:50 UTC

338 points

134 comments, 2 min read, LW link

(worldspiritsockpuppet.com)

Mnemonic portraits for 19,023 human genes

Brinedew

28 May 2026 22:16 UTC

336 points

27 comments, 15 min read, LW link

It’s nice of you to worry about me, but I really do have a life

Viliam

4 May 2026 21:14 UTC

331 points

61 comments, 4 min read, LW link

Models finding software vulnerabilities is not the primary source of cybersecurity risk

lc

14 May 2026 3:39 UTC

308 points

23 comments, 2 min read, LW link

Bad Problems Don’t Stop Being Bad Because Somebody’s Wrong About Fault Analysis

Linch

9 May 2026 1:30 UTC

264 points

74 comments, 3 min read, LW link

x-risk-themed

kave

6 May 2026 15:16 UTC

218 points

20 comments, 3 min read, LW link

(kaverennedy.substack.com)

Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations

Subhash Kantamneni, kitft, Euan Ong and Sam Marks

7 May 2026 20:21 UTC

213 points

35 comments, 8 min read, LW link

A relatively brief explanation of Boltzmann Brains

Eliezer Yudkowsky

16 May 2026 21:19 UTC

206 points

154 comments, 4 min read, LW link

MATS 9 Retrospective & Advice

beyarkay (Boyd Kane)

15 May 2026 12:30 UTC

198 points

11 comments, 18 min read, LW link

(boydkane.com)

Empowerment, corrigibility, etc. are simple abstractions (of a messed-up ontology)

Steven Byrnes

11 May 2026 17:48 UTC

188 points

70 comments, 16 min read, LW link

Trees are mostly made of air and a generalizable lesson for AI safety

Zephaniah Roe

29 May 2026 4:08 UTC

166 points

28 comments, 4 min read, LW link

A Year Late, Claude Finally Beats Pokémon

Julian Bradshaw

16 May 2026 7:05 UTC

162 points

12 comments, 9 min read, LW link

[Linkpost] Interpreting Language Model Parameters

Lucius Bushnaq, Dan Braun, Oliver Clive-Griffin, Bart Bussmann, Nathan Hu, mivanitskiy, Linda Linsefors and Lee Sharkey

5 May 2026 17:37 UTC

162 points

2 comments, 2 min read, LW link

(www.goodfire.ai)

Dairy cows make their misery expensive (but their calves can’t)

Elizabeth

3 May 2026 19:20 UTC

159 points

1 comment, 6 min read, LW link

(acesounderglass.com)

Cognitive Security as an AI Safety Cause Area

jsteinhardt

25 May 2026 18:30 UTC

155 points

16 comments, 2 min read, LW link

The Iliad Intensive Course Materials

Leon Lang, David Udell and Alexander Gietelink Oldenziel

11 May 2026 18:55 UTC

152 points

4 comments, 13 min read, LW link

(docs.google.com)

Automated Alignment is Harder Than You Think

Aleksandr Bowkis, Marie_DB, Jacob Pfau and Geoffrey Irving

14 May 2026 22:01 UTC

143 points

5 comments, 3 min read, LW link

(arxiv.org)

The Darwinian Honeymoon—Why I am not as impressed by human progress as I used to be

Elias Schmied

10 May 2026 15:55 UTC

138 points

23 comments, 4 min read, LW link

theory uplift differentially benefits safety & is underleveraged

yudhister

20 May 2026 21:43 UTC

132 points

14 comments, 1 min read, LW link

You Are Not Immune To Mode Collapse

J Bostock

2 May 2026 19:57 UTC

127 points

18 comments, 4 min read, LW link

(jbostock.substack.com)

Taking woo seriously but not literally

Kaj_Sotala

4 May 2026 13:36 UTC

123 points

27 comments, 23 min read, LW link

(kajsotala.substack.com)

Donating 80% While It Still Counts

jefftk

26 May 2026 1:30 UTC

123 points

8 comments, 6 min read, LW link

(www.jefftk.com)

Contra Wentworth on Physical Attractiveness for Men

Gretta Duleba

26 May 2026 23:20 UTC

122 points

25 comments, 8 min read, LW link

Convergent Abstraction Hypothesis

Jan_Kulveit

15 May 2026 0:04 UTC

122 points

20 comments, 6 min read, LW link

Negation Neglect: When models fail to learn negations in training

harrymayne, Lev McKinney and Owain_Evans

18 May 2026 18:37 UTC

119 points

37 comments, 8 min read, LW link

Claude, Author of the Humanitas

Linch

26 May 2026 16:05 UTC

118 points

41 comments, 16 min read, LW link

Optimisation: Selective versus Predictive

Raymond Douglas

12 May 2026 14:03 UTC

117 points

15 comments, 3 min read, LW link

Voters are surprisingly open to talking about AI risk

less_raichu

13 May 2026 14:08 UTC

116 points

11 comments, 3 min read, LW link

Incriminating misaligned AI models via distillation

Alek Westover, SebastianP, Alex Mallen, Jozdien, Alexa Pan, Julian Stastny and Vivek Hebbar

15 May 2026 21:43 UTC

115 points

12 comments, 5 min read, LW link

Many individual CEVs are probably quite bad

Viliam

6 May 2026 20:18 UTC

109 points

32 comments, 3 min read, LW link

Synthetic Persona Pretraining: Alignment from Token Zero

Julian Minder, Raghav Singhal, Viktor Moskvoretskii, Stefan Krsteski, ashtonanderson, rolandaydin and Robert West

20 May 2026 14:16 UTC

109 points

26 comments, 17 min read, LW link

Implications Of Predicting The Next Token

jdp

19 May 2026 22:17 UTC

108 points

6 comments, 31 min read, LW link

(minihf.com)

Risk from fitness-seeking AIs: mechanisms and mitigations

Alex Mallen

1 May 2026 17:42 UTC

107 points

0 comments, 32 min read, LW link

The AI Industrial Explosion — Part 1: Maximum growth rates with current production methods

djbinder

4 May 2026 15:32 UTC

106 points

11 comments, 12 min read, LW link

(defensesindepth.bio)

International Law Cannot Prevent Extinction Either

Sausage Vector Machine

9 May 2026 22:34 UTC

102 points

16 comments, 5 min read, LW link

Try, even if they have you cold

WalterL

7 May 2026 17:19 UTC

102 points

14 comments, 2 min read, LW link

Don’t be too Clever to Take Obvious Advice

Hide

15 May 2026 3:01 UTC

95 points

26 comments, 2 min read, LW link

(hidefromit.substack.com)

Who Got Breasts First and How We Got Them

rba

11 May 2026 13:11 UTC

94 points

28 comments, 10 min read, LW link

Your rights when flying to Europe

Yair Halberstadt

5 May 2026 19:17 UTC

92 points

14 comments, 5 min read, LW link

Taxing Small Cars To Improve MPG

jefftk

24 May 2026 21:50 UTC

91 points

11 comments, 2 min read, LW link

(www.jefftk.com)

Will we really put data centers in space?

Avi Parrack and fin

22 May 2026 23:51 UTC

91 points

23 comments, 5 min read, LW link

(www.forethought.org)

Claude is Now Alignment-Pretrained

RogerDearnaley

13 May 2026 23:19 UTC

87 points

9 comments, 1 min read, LW link

(www.anthropic.com)

Bringing More Expertise to Bear on Alignment

Edmund Lau, Geoffrey Irving, Cameron Holmes and David Africa

8 May 2026 10:29 UTC

87 points

1 comment, 8 min read, LW link

Mechanistic estimation for wide random MLPs

Jacob_Hilton

7 May 2026 16:20 UTC

85 points

5 comments, 5 min read, LW link

(www.alignment.org)

There is no evidence you should reapply sunscreen every 2 hours.

Hide

6 May 2026 9:19 UTC

85 points

14 comments, 9 min read, LW link

(hidefromit.substack.com)

What am I, if not an AI?

makiba

21 May 2026 13:14 UTC

84 points

14 comments, 7 min read, LW link

The ballad of TIGIT

Abhishaike Mahajan

27 May 2026 17:04 UTC

84 points

1 comment, 9 min read, LW link