CFAR update, and New CFAR workshops

AnnaSalamon · 25 Sep 2025 21:12 UTC
197 points
45 comments · 8 min read

Making Sense of Consciousness Part 5: Consciousness and the Self

sarahconstantin · 25 Sep 2025 21:10 UTC
13 points
0 comments · 10 min read
(sarahconstantin.substack.com)

Widening AI Safety’s talent pipeline by meeting people where they are

25 Sep 2025 20:50 UTC
30 points
3 comments · 8 min read

Synthesizing Standalone World-Models, Part 3: Dataset-Assembly

Thane Ruthenis · 25 Sep 2025 19:21 UTC
13 points
0 comments · 2 min read

Why you should eat meat—even if you hate factory farming

KatWoods · 25 Sep 2025 15:39 UTC
299 points
88 comments · 10 min read

What GPT-oss Leaks About OpenAI’s Training Data

Lennart Finke · 25 Sep 2025 15:33 UTC
26 points
5 comments · 6 min read

The real AI deploys itself

David Scott Krueger (formerly: capybaralet) · 25 Sep 2025 14:11 UTC
76 points
8 comments · 3 min read
(therealartificialintelligence.substack.com)

AI #135: OpenAI Shows Us The Money

Zvi · 25 Sep 2025 13:40 UTC
23 points
2 comments · 44 min read
(thezvi.wordpress.com)

Celebrate Petrov Day as if the button had been pressed

Flying buttress · 25 Sep 2025 10:33 UTC
17 points
0 comments · 1 min read

Understanding the state of frontier AI in China

Mitchell_Porter · 25 Sep 2025 10:16 UTC
11 points
3 comments · 3 min read

Petrov Day at Lighthaven

jimrandomh · 25 Sep 2025 8:29 UTC
20 points
0 comments · 1 min read

Some Thoughts on Mech Interp

d4hines · 25 Sep 2025 6:10 UTC
13 points
1 comment · 8 min read

IABIED is on the NYT bestseller list

Alice Blair · 25 Sep 2025 2:32 UTC
124 points
5 comments · 1 min read

AI and the Hidden Price of Comfort

nickgpop · 25 Sep 2025 2:16 UTC
6 points
8 comments · 7 min read

Nate Soares — If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All—with Jon Wolfsthal — at The Wharf

habryka · 25 Sep 2025 0:53 UTC
11 points
0 comments · 2 min read

AGI Companies Won’t Profit From AGI

LTM · 24 Sep 2025 22:04 UTC
4 points
7 comments · 7 min read
(routecause.substack.com)

Scheming Toy Environment: “Incompetent Client”

Ariel_ · 24 Sep 2025 21:03 UTC
17 points
2 comments · 32 min read

Synthesizing Standalone World-Models, Part 2: Shifting Structures

Thane Ruthenis · 24 Sep 2025 19:02 UTC
16 points
5 comments · 10 min read

Alibaba won the AI wars, we just don’t see it yet

Misha Ramendik · 24 Sep 2025 18:45 UTC
−10 points
0 comments · 2 min read

The Autofac Era

Gordon Seidoh Worley · 24 Sep 2025 18:20 UTC
29 points
18 comments · 7 min read
(uncertainupdates.substack.com)

“Shut It Down” is simpler than “Controlled Takeoff”

Raemon · 24 Sep 2025 17:21 UTC
97 points
29 comments · 5 min read

AISN #63: California’s SB-53 Passes the Legislature

24 Sep 2025 17:02 UTC
6 points
0 comments · 4 min read
(newsletter.safe.ai)

OpenAI Shows Us The Money

Zvi · 24 Sep 2025 15:30 UTC
40 points
8 comments · 9 min read
(thezvi.wordpress.com)

Launching the $10,000 Existential Hope Meme Prize

elte · 24 Sep 2025 15:00 UTC
8 points
3 comments · 1 min read

The Chinese Room revisited: How LLMs have real (but different) understanding of words

James Diacoumis · 24 Sep 2025 14:06 UTC
6 points
0 comments · 9 min read
(jamesdiacoumis.substack.com)

An argument that discussing AI safety in person is underused

Kabir Kumar · 24 Sep 2025 11:36 UTC
17 points
1 comment · 2 min read

How a singleton contradicts longtermism

kapedalex · 24 Sep 2025 11:10 UTC
3 points
1 comment · 1 min read

Berkeley Petrov Day

Darmani · 24 Sep 2025 7:59 UTC
6 points
0 comments · 1 min read

EU and Monopoly on Violence

Martin Sustrik · 24 Sep 2025 7:51 UTC
118 points
3 comments · 5 min read
(www.250bpm.com)

Misalignment and Roleplaying: Are Misaligned LLMs Acting Out Sci-Fi Stories?

Mark Keavney · 24 Sep 2025 2:09 UTC
30 points
4 comments · 13 min read

A Possible Future: Decentralized AGI Proliferation

Dev.Errata · 23 Sep 2025 22:24 UTC
11 points
7 comments · 2 min read

Munich, Bavaria “If Anyone Builds It” reading group

hilll · 23 Sep 2025 22:03 UTC
11 points
0 comments · 1 min read

Prague “If Anyone Builds It” reading group

Marek Dědič · 23 Sep 2025 21:49 UTC
14 points
0 comments · 1 min read

Draconian measures can increase the risk of irrevocable catastrophe

dsj · 23 Sep 2025 21:40 UTC
22 points
2 comments · 2 min read
(thedavidsj.substack.com)

[Question] What the discontinuity is, if not FOOM?

TAG · 23 Sep 2025 19:30 UTC
18 points
14 comments · 3 min read

Samuel Shadrach Interviewed

samuelshadrach · 23 Sep 2025 18:58 UTC
9 points
0 comments · 1 min read

Statement of Support for “If Anyone Builds It, Everyone Dies”

Liron · 23 Sep 2025 17:51 UTC
67 points
34 comments · 1 min read

Notes on fatalities from AI takeover

ryan_greenblatt · 23 Sep 2025 17:18 UTC
55 points
60 comments · 8 min read

Zendo for large groups

philh · 23 Sep 2025 17:10 UTC
13 points
1 comment · 1 min read
(reasonableapproximation.net)

Synthesizing Standalone World-Models, Part 1: Abstraction Hierarchies

Thane Ruthenis · 23 Sep 2025 17:01 UTC
23 points
10 comments · 23 min read

A Compatibilist Definition of Santa Claus

Shiva's Right Foot · 23 Sep 2025 16:57 UTC
18 points
9 comments · 1 min read

Ethics-Based Refusals Without Ethics-Based Refusal Training

1a3orn · 23 Sep 2025 16:35 UTC
91 points
2 comments · 19 min read

Why Smarter Doesn’t Mean Kinder: Orthogonality and Instrumental Convergence

Alexander Müller · 23 Sep 2025 16:06 UTC
6 points
0 comments · 6 min read

More Reactions to If Anyone Builds It, Everyone Dies

Zvi · 23 Sep 2025 16:00 UTC
33 points
20 comments · 20 min read
(thezvi.wordpress.com)

Ontological Cluelessness

23 Sep 2025 14:31 UTC
14 points
12 comments · 4 min read

We are likely in an AI overhang, and this is bad.

Gabriel Alfour · 23 Sep 2025 14:15 UTC
55 points
16 comments · 1 min read
(cognition.cafe)

Prompt optimization can enable AI control research

23 Sep 2025 12:46 UTC
35 points
3 comments · 9 min read

Two Mathematical Perspectives on AI Hallucinations and Uncertainty

LorenzoPacchiardi · 23 Sep 2025 11:06 UTC
0 points
1 comment · 3 min read

Accelerando as a “Slow, Reasonably Nice Takeoff” Story

Raemon · 23 Sep 2025 2:15 UTC
71 points
19 comments · 30 min read

On failure, and keeping doors open; closing thoughts

jimmy · 23 Sep 2025 1:11 UTC
7 points
0 comments · 10 min read