How AI Is Learning to Think in Secret

Nicholas Andresen · 6 Jan 2026 16:31 UTC
382 points
32 comments · 18 min read · LW link
(nickandresen.substack.com)

AI found 12 of 12 OpenSSL zero-days (while curl cancelled its bug bounty)

Stanislav Fort · 27 Jan 2026 20:21 UTC
359 points
24 comments · 8 min read · LW link

In My Misanthropy Era

jenn · 4 Jan 2026 18:34 UTC
336 points
153 comments · 8 min read · LW link
(jenn.site)

Canada Lost Its Measles Elimination Status Because We Don’t Have Enough Nurses Who Speak Low German

jenn · 25 Jan 2026 18:33 UTC
325 points
24 comments · 7 min read · LW link
(www.jenn.site)

Ada Palmer: Inventing the Renaissance

Martin Sustrik · 26 Jan 2026 4:40 UTC
301 points
20 comments · 13 min read · LW link
(www.250bpm.com)

2025 in AI predictions

jessicata · 2 Jan 2026 4:29 UTC
245 points
19 comments · 11 min read · LW link

“The first two weeks are the hardest”: my first digital declutter

mingyuan · 18 Jan 2026 22:04 UTC
219 points
11 comments · 2 min read · LW link
(mingyuan.substack.com)

How to Hire a Team

Gretta Duleba · 29 Jan 2026 22:39 UTC
206 points
13 comments · 5 min read · LW link

AlgZoo: uninterpreted models with fewer than 1,500 parameters

Jacob_Hilton · 26 Jan 2026 17:30 UTC
181 points
7 comments · 10 min read · LW link
(www.alignment.org)

Claude’s new constitution

21 Jan 2026 19:37 UTC
176 points
47 comments · 6 min read · LW link
(www.anthropic.com)

Backyard cat fight shows Schelling points preexist language

jchan · 14 Jan 2026 14:10 UTC
172 points
25 comments · 3 min read · LW link

Precedents for the Unprecedented: Historical Analogies for Thirteen Artificial Superintelligence Risks

James_Miller · 16 Jan 2026 18:43 UTC
165 points
15 comments · 63 min read · LW link

Why I Transitioned: A Response

quinoa marisa · 20 Jan 2026 2:06 UTC
156 points
47 comments · 10 min read · LW link

On Owning Galaxies

Simon Lermen · 6 Jan 2026 18:16 UTC
154 points
62 comments · 3 min read · LW link
(simonlermen.substack.com)

Deep learning as program synthesis

Zach Furman · 20 Jan 2026 15:35 UTC
150 points
33 comments · 41 min read · LW link

Dario Amodei – The Adolescence of Technology

habryka · 26 Jan 2026 19:10 UTC
147 points
62 comments · 73 min read · LW link
(www.darioamodei.com)

The inaugural Redwood Research podcast

4 Jan 2026 22:11 UTC
146 points
10 comments · 142 min read · LW link

Does Pentagon Pizza Theory Work?

rba · 22 Jan 2026 19:24 UTC
140 points
11 comments · 5 min read · LW link
(goflaw.substack.com)

Why we are excited about confession!

14 Jan 2026 20:37 UTC
138 points
32 comments · 9 min read · LW link
(alignment.openai.com)

What Washington Says About AGI

Zephaniah Roe · 17 Jan 2026 5:43 UTC
134 points
7 comments · 6 min read · LW link

Recent LLMs can do 2-hop and 3-hop latent (no-CoT) reasoning on natural facts

ryan_greenblatt · 1 Jan 2026 13:36 UTC
129 points
11 comments · 3 min read · LW link

The Possessed Machines (summary)

L Rudolf L · 25 Jan 2026 20:47 UTC
128 points
31 comments · 9 min read · LW link
(possessedmachines.com)

Lightcone is hiring a generalist, a designer, and a campus operations co-lead

habryka · 17 Jan 2026 1:47 UTC
118 points
0 comments · 5 min read · LW link

Bentham’s Bulldog is wrong about AI risk

Max Harms · 29 Jan 2026 16:33 UTC
109 points
37 comments · 33 min read · LW link

Taiwan war timelines might be shorter than AI timelines

Baram Sosis · 1 Jan 2026 22:30 UTC
108 points
21 comments · 5 min read · LW link

Pretraining on Aligned AI Data Dramatically Reduces Misalignment—Even After Post-Training

RogerDearnaley · 19 Jan 2026 21:24 UTC
106 points
12 comments · 11 min read · LW link
(arxiv.org)

Why AIs aren’t power-seeking yet

Eli Tyre · 11 Jan 2026 7:07 UTC
105 points
16 comments · 7 min read · LW link

Notable Progress Has Been Made in Whole Brain Emulation

Dom Polsinelli · 25 Jan 2026 19:07 UTC
103 points
15 comments · 6 min read · LW link
(open.substack.com)

Lies, Damned Lies, and Proofs: Formal Methods are not Slopless

12 Jan 2026 22:32 UTC
102 points
10 comments · 7 min read · LW link

To be well-calibrated is to be punctual

moridinamael · 25 Jan 2026 18:10 UTC
97 points
17 comments · 2 min read · LW link

Every Benchmark is Broken

Jonathan Gabor · 24 Jan 2026 2:42 UTC
95 points
0 comments · 4 min read · LW link
(jonathanpgabor.substack.com)

Fitness-Seekers: Generalizing the Reward-Seeking Threat Model

Alex Mallen · 29 Jan 2026 19:42 UTC
92 points
5 comments · 17 min read · LW link

Test your interpretability techniques by de-censoring Chinese models

15 Jan 2026 16:33 UTC
91 points
14 comments · 20 min read · LW link

IABIED Book Review: Core Arguments and Counterarguments

Stephen McAleese · 24 Jan 2026 14:25 UTC
90 points
39 comments · 25 min read · LW link

College Was Not That Terrible Now That I’m Not That Crazy

Zack_M_Davis · 1 Jan 2026 23:14 UTC
90 points
9 comments · 44 min read · LW link
(zackmdavis.net)

Split Personality Training: Revealing Latent Knowledge Through Alternate Personalities (Research Report)

Florian_Dietz · 12 Jan 2026 12:29 UTC
87 points
41 comments · 26 min read · LW link

Tensor-Transformer Variants are Surprisingly Performant

Logan Riggs · 12 Jan 2026 19:43 UTC
87 points
15 comments · 4 min read · LW link

We need a better way to evaluate emergent misalignment

11 Jan 2026 16:21 UTC
86 points
9 comments · 6 min read · LW link

36,000 AI Agents Are Now Speedrunning Civilization

Michaël Trazzi · 30 Jan 2026 21:21 UTC
86 points
27 comments · 1 min read · LW link

Oversight Assistants: Turning Compute into Understanding

jsteinhardt · 6 Jan 2026 0:50 UTC
85 points
7 comments · 9 min read · LW link
(bounded-regret.ghost.io)

Refusals that could become catastrophic

Fabien Roger · 30 Jan 2026 4:12 UTC
84 points
12 comments · 7 min read · LW link

Are We in a Continual Learning Overhang?

Samuel Knoche · 29 Jan 2026 17:09 UTC
83 points
5 comments · 14 min read · LW link

The truth behind the 2026 J.P. Morgan Healthcare Conference

Abhishaike Mahajan · 17 Jan 2026 17:28 UTC
83 points
35 comments · 9 min read · LW link
(www.owlposting.com)

Strong, bipartisan leadership for resistance to Trump.

Raemon · 11 Jan 2026 23:08 UTC
82 points
85 comments · 2 min read · LW link

When the LLM isn’t the one who’s wrong

Julian Bradshaw · 18 Jan 2026 21:37 UTC
81 points
9 comments · 2 min read · LW link

Overwhelming Superintelligence

Raemon · 1 Jan 2026 20:51 UTC
80 points
30 comments · 1 min read · LW link

Reflections on TA-ing Harvard’s first AI safety course

Roy Rinberg · 15 Jan 2026 16:28 UTC
79 points
4 comments · 9 min read · LW link

Public intellectuals need to say what they actually believe

Aaron Bergman · 7 Jan 2026 21:22 UTC
79 points
12 comments · 14 min read · LW link
(www.aaronbergman.net)

Why Motivated Reasoning?

johnswentworth · 14 Jan 2026 19:55 UTC
78 points
20 comments · 5 min read · LW link

Open Problems With Claude’s Constitution

Zvi · 28 Jan 2026 14:20 UTC
75 points
1 comment · 24 min read · LW link
(thezvi.wordpress.com)