A case for courage, when speaking of AI danger

So8res
27 Jun 2025 2:15 UTC
530 points
129 comments
6 min read
LW link

New Endorsements for “If Anyone Builds It, Everyone Dies”

Malo
18 Jun 2025 16:30 UTC
488 points
55 comments
4 min read
LW link
(intelligence.org)

the void

nostalgebraist
11 Jun 2025 3:19 UTC
396 points
107 comments
1 min read
LW link
(nostalgebraist.tumblr.com)

A deep critique of AI 2027’s bad timeline models

titotal
19 Jun 2025 13:29 UTC
372 points
40 comments
39 min read
LW link
(titotal.substack.com)

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)

LawrenceC
11 Jun 2025 19:27 UTC
297 points
19 comments
16 min read
LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes
23 Jun 2025 17:18 UTC
282 points
120 comments
29 min read
LW link

Distillation Robustifies Unlearning

13 Jun 2025 13:45 UTC
236 points
43 comments
8 min read
LW link
(arxiv.org)

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit
13 Jun 2025 12:45 UTC
229 points
27 comments
5 min read
LW link
(boundedlyrational.substack.com)

Intelligence Is Not Magic, But Your Threshold For “Magic” Is Pretty Low

Expertium
15 Jun 2025 15:23 UTC
215 points
27 comments
1 min read
LW link

Mech interp is not pre-paradigmatic

Lee Sharkey
10 Jun 2025 13:39 UTC
211 points
15 comments
13 min read
LW link

AI companies’ eval reports mostly don’t support their claims

Zach Stein-Perlman
9 Jun 2025 13:00 UTC
207 points
13 comments
4 min read
LW link

The Value Proposition of Romantic Relationships

johnswentworth
2 Jun 2025 13:51 UTC
204 points
43 comments
13 min read
LW link

“Flaky breakthroughs” pervade inner work — but almost no one tracks them

Chris Lakin
4 Jun 2025 19:02 UTC
203 points
44 comments
2 min read
LW link
(chrislakin.blog)

Consider chilling out in 2028

Valentine
21 Jun 2025 17:07 UTC
189 points
143 comments
13 min read
LW link

Futarchy’s fundamental flaw

dynomight
13 Jun 2025 22:08 UTC
178 points
49 comments
9 min read
LW link
(dynomight.net)

My pitch for the AI Village

Daniel Kokotajlo
24 Jun 2025 15:00 UTC
178 points
35 comments
5 min read
LW link

Read the Pricing First

Max Niederman
10 Jun 2025 2:22 UTC
174 points
14 comments
1 min read
LW link

Estrogen: A trip report

cube_flipper
15 Jun 2025 13:15 UTC
167 points
42 comments
27 min read
LW link
(smoothbrains.net)

Endometriosis is an incredibly interesting disease

Abhishaike Mahajan
14 Jun 2025 22:14 UTC
166 points
5 comments
16 min read
LW link
(www.owlposting.com)

Foom & Doom 2: Technical alignment is hard

Steven Byrnes
23 Jun 2025 17:19 UTC
165 points
65 comments
28 min read
LW link

X explains Z% of the variance in Y

Leon Lang
20 Jun 2025 12:17 UTC
160 points
36 comments
9 min read
LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck
23 Jun 2025 17:47 UTC
150 points
22 comments
3 min read
LW link

Broad-Spectrum Cancer Treatments

sarahconstantin
3 Jun 2025 19:40 UTC
146 points
10 comments
7 min read
LW link
(sarahconstantin.substack.com)

The Industrial Explosion

26 Jun 2025 14:41 UTC
128 points
70 comments
15 min read
LW link
(www.forethought.org)

Making deals with early schemers

20 Jun 2025 18:21 UTC
127 points
41 comments
15 min read
LW link

Model Organisms for Emergent Misalignment

16 Jun 2025 15:46 UTC
118 points
19 comments
5 min read
LW link

Proposal for making credible commitments to AIs.

Cleo Nardo
27 Jun 2025 19:43 UTC
107 points
45 comments
2 min read
LW link

METR’s Observations of Reward Hacking in Recent Frontier Models

Daniel Kokotajlo
9 Jun 2025 18:03 UTC
100 points
9 comments
11 min read
LW link
(metr.org)

RTFB: The RAISE Act

Zvi
16 Jun 2025 12:50 UTC
97 points
8 comments
8 min read
LW link
(thezvi.wordpress.com)

The Mirror Trap

Cameron Berg
6 Jun 2025 22:30 UTC
94 points
13 comments
4 min read
LW link

“It isn’t magic”

Ben (Berlin)
23 Jun 2025 14:00 UTC
92 points
17 comments
2 min read
LW link

Prover-Estimator Debate: A New Scalable Oversight Protocol

17 Jun 2025 13:53 UTC
89 points
19 comments
5 min read
LW link

On working 80%

adrische
7 Jun 2025 17:58 UTC
87 points
7 comments
3 min read
LW link
(github.com)

Why we’re still doing normal school

juliawise
14 Jun 2025 12:40 UTC
85 points
0 comments
3 min read
LW link

Genomic emancipation

TsviBT
21 Jun 2025 8:15 UTC
83 points
14 comments
26 min read
LW link

Agentic Misalignment: How LLMs Could be Insider Threats

20 Jun 2025 22:34 UTC
83 points
13 comments
6 min read
LW link

Situational Awareness: A One-Year Retrospective

Nathan Delisle
23 Jun 2025 19:15 UTC
82 points
4 comments
12 min read
LW link

Maybe Social Anxiety Is Just You Failing At Mind Control

25Hour
11 Jun 2025 23:49 UTC
81 points
21 comments
16 min read
LW link

Help the AI 2027 team make an online AGI wargame

Jonas V
27 Jun 2025 1:02 UTC
81 points
10 comments
1 min read
LW link

Some reprogenetics-related projects you could help with

TsviBT
15 Jun 2025 20:25 UTC
80 points
1 comment
4 min read
LW link

When does training a model change its goals?

12 Jun 2025 18:43 UTC
78 points
3 comments
15 min read
LW link

Unfaithful Reasoning Can Fool Chain-of-Thought Monitoring

2 Jun 2025 19:08 UTC
78 points
17 comments
3 min read
LW link

Convergent Linear Representations of Emergent Misalignment

16 Jun 2025 15:47 UTC
76 points
1 comment
8 min read
LW link

Busking with Kids

jefftk
9 Jun 2025 0:30 UTC
76 points
0 comments
1 min read
LW link
(www.jefftk.com)

Analyzing A Critique Of The AI 2027 Timeline Forecasts

Zvi
24 Jun 2025 18:50 UTC
76 points
38 comments
30 min read
LW link
(thezvi.wordpress.com)

Ghiblification for Privacy

jefftk
10 Jun 2025 0:30 UTC
75 points
47 comments
1 min read
LW link
(www.jefftk.com)

Jankily controlling superintelligence

ryan_greenblatt
27 Jun 2025 14:05 UTC
70 points
4 comments
7 min read
LW link

Thought Crime: Backdoors & Emergent Misalignment in Reasoning Models

16 Jun 2025 16:43 UTC
68 points
2 comments
8 min read
LW link

Why “training against scheming” is hard

Marius Hobbhahn
24 Jun 2025 19:08 UTC
66 points
2 comments
12 min read
LW link

Musings on AI Companies of 2025-2026 (Jun 2025)

Vladimir_Nesov
20 Jun 2025 17:14 UTC
66 points
4 comments
3 min read
LW link