Yet Another IABIED Review

PeterMcCluskey 28 Sep 2025 21:36 UTC
15 points
0 comments · 7 min read · LW link
(bayesianinvestor.com)

A non-review of “If Anyone Builds It, Everyone Dies”

boazbarak 28 Sep 2025 17:34 UTC
125 points
50 comments · 4 min read · LW link

Transgender Sticker Fallacy

ymeskhout 28 Sep 2025 16:54 UTC
110 points
25 comments · 7 min read · LW link
(www.ymeskhout.com)

Solving the problem of needing to give a talk

Kaj_Sotala 28 Sep 2025 15:34 UTC
60 points
3 comments · 8 min read · LW link

Lessons from organizing a technical AI safety bootcamp

28 Sep 2025 13:48 UTC
16 points
3 comments · 16 min read · LW link

The Risk of Human Disconnection

Priyanka Bharadwaj 28 Sep 2025 2:14 UTC
5 points
0 comments · 3 min read · LW link

A Reply to MacAskill on “If Anyone Builds It, Everyone Dies”

Rob Bensinger 27 Sep 2025 23:03 UTC
55 points
21 comments · 17 min read · LW link

The Sensible Way Forward for AI Alignment

Davey Morse 27 Sep 2025 21:00 UTC
−9 points
0 comments · 3 min read · LW link

Book Review: The System

Julius 27 Sep 2025 20:49 UTC
14 points
2 comments · 16 min read · LW link
(thegreymatter.substack.com)

Learnings from AI safety course so far

boazbarak 27 Sep 2025 18:17 UTC
103 points
5 comments · 3 min read · LW link

My Weirdest Experience Wasn’t

Bridgett Kay 27 Sep 2025 18:01 UTC
24 points
3 comments · 3 min read · LW link
(dxmrevealed.wordpress.com)

Making sense of parameter-space decomposition

Malmesbury 27 Sep 2025 17:37 UTC
45 points
0 comments · 19 min read · LW link

AI Safety Field Growth Analysis 2025

Stephen McAleese 27 Sep 2025 17:03 UTC
29 points
13 comments · 3 min read · LW link

2025 Petrov day speech

nick lacombe 27 Sep 2025 15:07 UTC
9 points
0 comments · 1 min read · LW link
(nikthink.net)

LLMs Suck at Deep Thinking Part 3 - Trying to Prove It (fixed)

Taylor G. Lunt 27 Sep 2025 14:54 UTC
17 points
6 comments · 15 min read · LW link

Our Beloved Monsters

Tomás B. 27 Sep 2025 13:25 UTC
71 points
4 comments · 11 min read · LW link

Ranking the endgames of AI development

Sean Herrington 27 Sep 2025 11:47 UTC
17 points
4 comments · 5 min read · LW link

An N=1 observational study on interpretability of Natural General Intelligence (NGI)

dr_s 27 Sep 2025 9:28 UTC
12 points
3 comments · 6 min read · LW link

Day #14 Hunger Strike, on livestream, In protest of Superintelligent AI

samuelshadrach 27 Sep 2025 9:16 UTC
2 points
0 comments · 2 min read · LW link

[CS 2881r] [Week 3] Adversarial Robustness, Jailbreaks, Prompt Injection, Security

egeckr 27 Sep 2025 1:31 UTC
2 points
0 comments · 26 min read · LW link

Narrative Structure And The Principle Of Least Action

sonicrocketman 27 Sep 2025 1:31 UTC
1 point
1 comment · 3 min read · LW link
(brianschrader.com)

Exploring belief states in LLM chains of thought

emanuelr 27 Sep 2025 1:09 UTC
4 points
2 comments · 7 min read · LW link

Rehearsing the Future: Tabletop Exercises for Risks, and Readiness

27 Sep 2025 0:50 UTC
9 points
0 comments · 3 min read · LW link

AI Safety Isn’t So Unique

Baram Sosis 27 Sep 2025 0:36 UTC
11 points
1 comment · 9 min read · LW link

Anthropic Economic Index report

anaguma 26 Sep 2025 23:49 UTC
4 points
0 comments · 4 min read · LW link
(www.anthropic.com)

Someone Will Build It

entirelyalive 26 Sep 2025 23:39 UTC
−1 points
0 comments · 12 min read · LW link

Reasons to sell frontier lab equity to donate now rather than later

26 Sep 2025 23:07 UTC
217 points
32 comments · 12 min read · LW link

Comparative Analysis of Black Box Methods for Detecting Evaluation Awareness in LLMs

Igor Ivanov 26 Sep 2025 21:56 UTC
12 points
0 comments · 14 min read · LW link

Mechanism design of yet another median world

Greenless Mirror 26 Sep 2025 21:51 UTC
3 points
0 comments · 10 min read · LW link

Metaculus is Hiring a Head of Consulting Services

ChristianWilliams 26 Sep 2025 21:43 UTC
7 points
0 comments · 2 min read · LW link
(apply.workable.com)

The Illustrated Petrov Day Ceremony

Raemon 26 Sep 2025 21:01 UTC
93 points
11 comments · 2 min read · LW link

Experiments with Futarchy

Ben S. 26 Sep 2025 18:27 UTC
4 points
0 comments · 7 min read · LW link
(news.manifold.markets)

Human in the Loop: on Losing Control of Autonomous Systems

Nostradamus_2 26 Sep 2025 18:27 UTC
3 points
0 comments · 9 min read · LW link
(terminalvel0city.substack.com)

Synthesizing Standalone World-Models, Part 4: Metaphysical Justifications

Thane Ruthenis 26 Sep 2025 18:00 UTC
23 points
6 comments · 4 min read · LW link

On keeping chains of thought monitorable

Oscar 26 Sep 2025 16:30 UTC
9 points
0 comments · 3 min read · LW link

IABIED Misc. Discussion Thread

WilliamKiely 26 Sep 2025 16:22 UTC
5 points
5 comments · 1 min read · LW link

Economics Roundup #6

Zvi 26 Sep 2025 14:10 UTC
20 points
5 comments · 15 min read · LW link
(thezvi.wordpress.com)

The AI Village in Numbers

Shoshannah Tekofsky 26 Sep 2025 13:40 UTC
6 points
0 comments · 4 min read · LW link
(theaidigest.org)

What Happened After My Rat Group Backed Kamala Harris

Blake 26 Sep 2025 12:39 UTC
37 points
3 comments · 1 min read · LW link

[Question] Feedback request: Is the time right for an AI Safety stack exchange?

lennie 26 Sep 2025 9:14 UTC
22 points
0 comments · 4 min read · LW link

Constrained Belief Updates and Geometric Structures in Transformer Representations for the RRXOR Process

bgradowhite 26 Sep 2025 1:25 UTC
4 points
0 comments · 11 min read · LW link

[CS 2881r AI Safety] [Week 2] Modern LLM Training

jusyc 26 Sep 2025 1:25 UTC
1 point
0 comments · 4 min read · LW link

CFAR update, and New CFAR workshops

AnnaSalamon 25 Sep 2025 21:12 UTC
197 points
45 comments · 8 min read · LW link

Making Sense of Consciousness Part 5: Consciousness and the Self

sarahconstantin 25 Sep 2025 21:10 UTC
13 points
0 comments · 10 min read · LW link
(sarahconstantin.substack.com)

Widening AI Safety’s talent pipeline by meeting people where they are

25 Sep 2025 20:50 UTC
30 points
3 comments · 8 min read · LW link

Synthesizing Standalone World-Models, Part 3: Dataset-Assembly

Thane Ruthenis 25 Sep 2025 19:21 UTC
13 points
0 comments · 2 min read · LW link

Why you should eat meat—even if you hate fac­tory farming

KatWoods 25 Sep 2025 15:39 UTC
299 points
88 comments · 10 min read · LW link

What GPT-oss Leaks About OpenAI’s Training Data

Lennart Finke 25 Sep 2025 15:33 UTC
26 points
5 comments · 6 min read · LW link

The real AI deploys itself

David Scott Krueger (formerly: capybaralet) 25 Sep 2025 14:11 UTC
76 points
8 comments · 3 min read · LW link
(therealartificialintelligence.substack.com)

AI #135: OpenAI Shows Us The Money

Zvi 25 Sep 2025 13:40 UTC
23 points
2 comments · 44 min read · LW link
(thezvi.wordpress.com)