The Practical Value of Flawed Models: A Response to titotal’s AI 2027 Critique
Michelle_Ma · 25 Jun 2025 22:15 UTC · 7 points · 1 comment · 6 min read · LW link

I Tested LLM Agents on Simple Safety Rules. They Failed in Surprising and Informative Ways.
Ram Potham · 25 Jun 2025 21:39 UTC · 9 points · 12 comments · 6 min read · LW link

Tech for Thinking
sarahconstantin · 25 Jun 2025 21:30 UTC · 60 points · 9 comments · 7 min read · LW link
(sarahconstantin.substack.com)

Memory Decoding Journal Club: Systems consolidation reorganizes hippocampal engram circuitry
Devin Ward · 25 Jun 2025 21:21 UTC · 3 points · 0 comments · 1 min read · LW link

Making Sense of Consciousness Part 1: Perceptual Awareness
sarahconstantin · 25 Jun 2025 21:10 UTC · 19 points · 0 comments · 9 min read · LW link
(sarahconstantin.substack.com)

Double Podcast Drop on AI Safety
jacobhaimes · 25 Jun 2025 20:11 UTC · 5 points · 0 comments · 1 min read · LW link

Is there a looming Cultural Omnicide?
Jared M. · 25 Jun 2025 18:18 UTC · 24 points · 7 comments · 5 min read · LW link

A Methodologist’s Apology
adamShimi · 25 Jun 2025 16:52 UTC · 13 points · 0 comments · 9 min read · LW link
(formethods.substack.com)

Melatonin Self-Experiment Results
silentbob · 25 Jun 2025 15:58 UTC · 60 points · 6 comments · 8 min read · LW link

Interstellar travel will probably doom the long-term future
Jordan Stone · 25 Jun 2025 15:32 UTC · 29 points · 6 comments · 16 min read · LW link

Summary of John Halstead’s Book-Length Report on Existential Risks From Climate Change
Bentham's Bulldog · 25 Jun 2025 15:14 UTC · 44 points · 14 comments · 21 min read · LW link

Lurking in the Noise
J Bostock · 25 Jun 2025 13:36 UTC · 37 points · 2 comments · 4 min read · LW link

New Paper: Ambiguous Online Learning
Vanessa Kosoy · 25 Jun 2025 9:14 UTC · 30 points · 2 comments · 1 min read · LW link
(arxiv.org)

Emergence of Simulators and Agents
25 Jun 2025 6:59 UTC · 21 points · 0 comments · 5 min read · LW link

Defining Corrigible and Useful Goals
Rubi J. Hudson · 25 Jun 2025 3:51 UTC · 38 points · 2 comments · 24 min read · LW link

Multispecies Metagenomic Calibration
jefftk · 25 Jun 2025 2:50 UTC · 12 points · 0 comments · 1 min read · LW link
(www.jefftk.com)

A regime-change power-vacuum conjecture about group belief
TsviBT · 24 Jun 2025 23:16 UTC · 41 points · 16 comments · 3 min read · LW link

Apply to be a mentor in SPAR!
agucova · 24 Jun 2025 23:00 UTC · 10 points · 0 comments · 1 min read · LW link

Machines of Faithful Obedience
Boaz Barak · 24 Jun 2025 22:06 UTC · 41 points · 19 comments · 10 min read · LW link

Gradient Descent on Token Input Embeddings
KAP · 24 Jun 2025 20:24 UTC · 8 points · 0 comments · 6 min read · LW link

A crisis simulation changed how I think about AI risk
sjadler · 24 Jun 2025 20:04 UTC · 5 points · 0 comments · 2 min read · LW link
(open.substack.com)

Towards a theory of local altruism
DMMF · 24 Jun 2025 19:39 UTC · 11 points · 1 comment · 5 min read · LW link
(notnottalmud.substack.com)

Why “training against scheming” is hard
Marius Hobbhahn · 24 Jun 2025 19:08 UTC · 66 points · 2 comments · 12 min read · LW link

Analyzing A Critique Of The AI 2027 Timeline Forecasts
Zvi · 24 Jun 2025 18:50 UTC · 76 points · 38 comments · 30 min read · LW link
(thezvi.wordpress.com)

What does 10x-ing effective compute get you?
ryan_greenblatt · 24 Jun 2025 18:33 UTC · 55 points · 10 comments · 12 min read · LW link

My pitch for the AI Village
Daniel Kokotajlo · 24 Jun 2025 15:00 UTC · 178 points · 35 comments · 5 min read · LW link

An Analogy for Interpretability
Roman Malov · 24 Jun 2025 14:56 UTC · 13 points · 2 comments · 2 min read · LW link

The V&V method—A step towards safer AGI
Yoav Hollander · 24 Jun 2025 13:42 UTC · 20 points · 1 comment · 1 min read · LW link
(blog.foretellix.com)

Try o3-pro in ChatGPT for $1 (is AI a bubble?)
Hauke Hillebrandt · 24 Jun 2025 11:15 UTC · 15 points · 2 comments · 4 min read · LW link

Belief in continuity of personhood can be money-pumped
Filip Sondej · 24 Jun 2025 9:39 UTC · 3 points · 6 comments · 1 min read · LW link

What can be learned from scary demos? A snitching case study
Fabien Roger · 24 Jun 2025 8:40 UTC · 22 points · 1 comment · 7 min read · LW link

How to Host a Find Me Party
Commander Zander · 24 Jun 2025 0:56 UTC · 13 points · 1 comment · 2 min read · LW link

Local Speech Recognition with Whisper
jefftk · 24 Jun 2025 0:30 UTC · 11 points · 0 comments · 2 min read · LW link
(www.jefftk.com)

Twig—Fiction Review
Commander Zander · 24 Jun 2025 0:04 UTC · 17 points · 0 comments · 1 min read · LW link

The Lazarus Project—TV Review
Commander Zander · 23 Jun 2025 23:53 UTC · 6 points · 0 comments · 1 min read · LW link

Frontier AI Labs: the Call Option to AGI
ykevinzhang · 23 Jun 2025 21:02 UTC · 22 points · 2 comments · 10 min read · LW link

The Rose Test: a fun way to feel with your guts (not just logically understand) why AI-safety matters right now (and get new adepts)
lovagrus · 23 Jun 2025 21:02 UTC · 10 points · 0 comments · 3 min read · LW link

Knowledge Extraction Plan (KEP): An alternative to reckless scaling
aswani · 23 Jun 2025 20:58 UTC · 1 point · 0 comments · 1 min read · LW link

History repeats itself: Frontier AI labs acting like Amazon in the early 2000s
i_am_nuts · 23 Jun 2025 20:56 UTC · 8 points · 0 comments · 3 min read · LW link
(iamnuts.substack.com)

Open Thread—Summer 2025
habryka · 23 Jun 2025 20:54 UTC · 23 points · 70 comments · 1 min read · LW link

Childhood and Education #10: Behaviors
Zvi · 23 Jun 2025 20:40 UTC · 26 points · 4 comments · 15 min read · LW link
(thezvi.wordpress.com)

Compressed Computation is (probably) not Computation in Superposition
23 Jun 2025 19:35 UTC · 57 points · 9 comments · 10 min read · LW link

Situational Awareness: A One-Year Retrospective
Nathan Delisle · 23 Jun 2025 19:15 UTC · 82 points · 4 comments · 12 min read · LW link

Recognizing Optimality
jimmy · 23 Jun 2025 17:55 UTC · 20 points · 0 comments · 7 min read · LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans
Buck · 23 Jun 2025 17:47 UTC · 150 points · 22 comments · 3 min read · LW link

Foom & Doom 2: Technical alignment is hard
Steven Byrnes · 23 Jun 2025 17:19 UTC · 165 points · 65 comments · 28 min read · LW link

Foom & Doom 1: “Brain in a box in a basement”
Steven Byrnes · 23 Jun 2025 17:18 UTC · 282 points · 120 comments · 29 min read · LW link

The Lies of Big Bug
Bentham's Bulldog · 23 Jun 2025 16:03 UTC · −4 points · 2 comments · 4 min read · LW link

AI companies aren’t planning to secure critical model weights
Zach Stein-Perlman · 23 Jun 2025 16:00 UTC · 15 points · 0 comments · 1 min read · LW link

“It isn’t magic”
Ben (Berlin) · 23 Jun 2025 14:00 UTC · 92 points · 17 comments · 2 min read · LW link