Strategy of von Neumann and strategy of Rosenbergs

avturchin 6 Feb 2026 22:50 UTC
5 points
4 comments · 2 min read · LW link

Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning

6 Feb 2026 19:27 UTC
10 points
0 comments · 4 min read · LW link

Parks Aren’t Nature

Sable 6 Feb 2026 18:27 UTC
50 points
11 comments · 8 min read · LW link
(affablyevil.substack.com)

Claude Code #4: From The Before Times

Zvi 6 Feb 2026 18:01 UTC
42 points
1 comment · 23 min read · LW link
(thezvi.wordpress.com)

Robust Finite Policies are Nontrivially Structured

Winter Cross 6 Feb 2026 17:47 UTC
26 points
1 comment · 11 min read · LW link

In (highly contingent!) defense of interpretability-in-the-loop ML training

Steven Byrnes 6 Feb 2026 16:32 UTC
85 points
11 comments · 3 min read · LW link

Spectral Signatures of Gradual Disempowerment

Jonas Hallgren 6 Feb 2026 15:08 UTC
36 points
4 comments · 17 min read · LW link
(equilibria1.substack.com)

Demands Are All You Need: Prompt Imperativeness Drastically Reduces Hedging In LLMs (n=900, Cohen’s d = 2.67)

fluxxrider 6 Feb 2026 13:22 UTC
6 points
0 comments · 16 min read · LW link

[Question] If all humans were turned into high-fidelity mind uploads tomorrow, would we be self-sustaining?

Erich_Grunewald 6 Feb 2026 8:35 UTC
11 points
2 comments · 1 min read · LW link

AI benchmarking has a Y-axis problem

Lizka 6 Feb 2026 7:45 UTC
79 points
3 comments · 7 min read · LW link

Claude Opus 4.6 is Driven

HunterJay 6 Feb 2026 4:15 UTC
113 points
1 comment · 5 min read · LW link

Why ASI Might Preserve Its Progenitors

Luke J. Dawes 6 Feb 2026 2:54 UTC
2 points
0 comments · 12 min read · LW link

How Dario Amodei’s “The Adolescence of Technology” Delegitimizes AI X-Risk Concerns

6 Feb 2026 2:07 UTC
12 points
6 comments · 50 min read · LW link
(doomdebates.com)

[Question] Goodfire and Training on Interpretability

Satya Benson 6 Feb 2026 1:45 UTC
32 points
5 comments · 1 min read · LW link

Plan ’Straya

William the Kiwi 6 Feb 2026 0:14 UTC
16 points
5 comments · 5 min read · LW link

TT Self Study Journal # 6

TristanTrim 5 Feb 2026 23:41 UTC
5 points
3 comments · 3 min read · LW link

The Simplest Case for AI Catastrophe

Linch 5 Feb 2026 23:18 UTC
77 points
9 comments · 10 min read · LW link
(linch.substack.com)

TT’s Looking-for-Work Strategy

TristanTrim 5 Feb 2026 21:40 UTC
4 points
0 comments · 1 min read · LW link

Agent Economics: a BOTEC on feasibility

Margot 5 Feb 2026 20:15 UTC
28 points
0 comments · 6 min read · LW link
(forum.effectivealtruism.org)

Moltbook as a setting to analyze Power Seeking behaviour

Rahul N 5 Feb 2026 20:07 UTC
11 points
0 comments · 1 min read · LW link
(propensitylabs.substack.com)

The nature of LLM algorithmic progress (v2)

Steven Byrnes 5 Feb 2026 19:17 UTC
116 points
27 comments · 13 min read · LW link

Biotech Startup Stats

sarahconstantin 5 Feb 2026 18:40 UTC
21 points
0 comments · 4 min read · LW link
(sarahconstantin.substack.com)

On The Lies Depression Tells

sonicrocketman 5 Feb 2026 17:13 UTC
25 points
2 comments · 3 min read · LW link
(brianschrader.com)

Speedrunning a Mech Interp Research Setup (Remote GPU, Torch, TransformerLens, Cuda, SSH, VS Code)

J Rosser 5 Feb 2026 16:45 UTC
38 points
3 comments · 4 min read · LW link

What’s the concrete plan to become an incredibly agentic person?

Peter Berggren 5 Feb 2026 16:27 UTC
12 points
3 comments · 3 min read · LW link

AI #154: Claw Your Way To The Top

Zvi 5 Feb 2026 16:10 UTC
38 points
2 comments · 43 min read · LW link
(thezvi.wordpress.com)

Preparing for a Warning Shot

Noah Birnbaum 5 Feb 2026 15:10 UTC
43 points
5 comments · 4 min read · LW link

A Proposal for TruesightBench

David Africa 5 Feb 2026 14:33 UTC
14 points
0 comments · 4 min read · LW link

Scratching the sore: how pleasure relates to suffering

Vadim Golub 5 Feb 2026 12:05 UTC
−1 points
35 comments · 2 min read · LW link

What’s the Point of the Math?

Ashe Vazquez Nuñez 5 Feb 2026 11:30 UTC
46 points
3 comments · 5 min read · LW link

Short List of Public Rationalist Online Discussion Groups in 2026

Shoshannah Tekofsky 5 Feb 2026 10:33 UTC
14 points
2 comments · 1 min read · LW link

Idea: the intelligence explosion convention

wdmacaskill 5 Feb 2026 9:11 UTC
20 points
0 comments · 9 min read · LW link
(www.forethought.org)

Is Note-taking a favor or a burden to my future-self?

CstineSublime 5 Feb 2026 6:22 UTC
18 points
17 comments · 1 min read · LW link

Episodic memory in AI agents poses new safety risks

Chad DeChant 5 Feb 2026 5:28 UTC
13 points
1 comment · 10 min read · LW link

Finding Cruxes: Help Reality Punch You In the Face

Raemon 5 Feb 2026 2:11 UTC
71 points
0 comments · 8 min read · LW link

How to train any multiagent systems end-to-end from AI feedback

Ed Li 5 Feb 2026 2:00 UTC
1 point
0 comments · 1 min read · LW link

In Search of Lost Time—A Review

eniteris 5 Feb 2026 1:46 UTC
17 points
1 comment · 10 min read · LW link

Solemn Courage

aysja 4 Feb 2026 23:09 UTC
128 points
1 comment · 6 min read · LW link

p-values are good actually

speck1447 4 Feb 2026 22:04 UTC
9 points
8 comments · 3 min read · LW link

Chess bots do not have goals

zulupineapple 4 Feb 2026 21:11 UTC
2 points
10 comments · 1 min read · LW link

Preventing the apocalypse with power distribution theory

Rationalist11235 4 Feb 2026 18:44 UTC
2 points
0 comments · 4 min read · LW link

Post-AGI Economics As If Nothing Ever Happens

Jan_Kulveit 4 Feb 2026 17:39 UTC
254 points
43 comments · 8 min read · LW link
(boundedlyrational.substack.com)

Vibestemics

Gordon Seidoh Worley 4 Feb 2026 16:40 UTC
13 points
10 comments · 5 min read · LW link
(www.uncertainupdates.com)

Kimi K2.5

Zvi 4 Feb 2026 15:30 UTC
33 points
0 comments · 10 min read · LW link
(thezvi.wordpress.com)

Ralph-wiggum is Bad and Anthropic Should Fix It

d4hines 4 Feb 2026 15:26 UTC
27 points
11 comments · 1 min read · LW link

Who does a right to compute actually protect?

TFD 4 Feb 2026 15:09 UTC
25 points
0 comments · 5 min read · LW link
(www.thefloatingdroid.com)

Reconciling Shannon and Bayes.

Laureana Bonaparte 4 Feb 2026 14:33 UTC
−24 points
1 comment · 1 min read · LW link
(wallstreetweather.org)

Anthropic’s “Hot Mess” paper overstates its case (and the blog post is worse)

RobertM 4 Feb 2026 6:30 UTC
288 points
28 comments · 6 min read · LW link

A Black Box Made Less Opaque (part 2)

Matthew McDonnell 4 Feb 2026 4:12 UTC
6 points
0 comments · 15 min read · LW link

Thoughts on Toby Ords’ AI Scaling Series

Srdjan Miletic 4 Feb 2026 0:41 UTC
10 points
1 comment · 4 min read · LW link
(www.dissent.blog)