Remembrancy

Algon 25 Oct 2025 22:47 UTC
11 points
0 comments 3 min read LW link

Pygmalion’s Wafer

Charlie Sanders 25 Oct 2025 20:17 UTC
8 points
2 comments 4 min read LW link
(www.dailymicrofiction.com)

Debating theism

Ivan 25 Oct 2025 18:35 UTC
−21 points
0 comments 25 min read LW link

[Question] Why is OpenAI releasing products like Sora and Atlas?

J Thomas Moros 25 Oct 2025 17:59 UTC
16 points
10 comments 1 min read LW link

Origins and dangers of future AI capability denial

Patrick Spencer 25 Oct 2025 16:13 UTC
68 points
18 comments 10 min read LW link

Do you completely trust that you are completely in the shit? - despair and information -

P. João 25 Oct 2025 14:42 UTC
−2 points
17 comments 3 min read LW link

Assessing Far UVC Positioning

jefftk 25 Oct 2025 14:00 UTC
20 points
3 comments 2 min read LW link
(www.jefftk.com)

Musings on Reported Cost of Compute (Oct 2025)

Vladimir_Nesov 24 Oct 2025 20:42 UTC
103 points
11 comments 2 min read LW link

Regardless of X, you can still just sign superintelligence-statement.org if you agree

Ishual 24 Oct 2025 20:30 UTC
58 points
0 comments 3 min read LW link

The Future of Interpretability is Geometric

sbaumohl 24 Oct 2025 18:32 UTC
23 points
0 comments 5 min read LW link

New Statement Calls For Not Building Superintelligence For Now

Zvi 24 Oct 2025 17:40 UTC
80 points
3 comments 7 min read LW link
(thezvi.wordpress.com)

Notes on “Explaining AI Explainability”

Eleni Angelou 24 Oct 2025 17:22 UTC
20 points
0 comments 6 min read LW link

Can Reasoning Models Obfuscate Reasoning? Stress-Testing Chain-of-Thought Monitorability

24 Oct 2025 17:21 UTC
17 points
1 comment 5 min read LW link

I will not sign up for cryonics

Syd Lonreiro_ 24 Oct 2025 16:56 UTC
−18 points
5 comments 1 min read LW link

Dollars in political giving are less fungible than you might think

lincolnquirk 24 Oct 2025 15:54 UTC
6 points
1 comment 5 min read LW link
(lincolnquirk.substack.com)

Can AI Agents with Divergent Interests Learn To Prevent Civilizational Failures?

joao_abrantes 24 Oct 2025 15:08 UTC
1 point
0 comments 1 min read LW link

LW Reacts pack for Discord/Slack/etc

plex 24 Oct 2025 13:20 UTC
65 points
13 comments 1 min read LW link
(drive.google.com)

AI Timelines and Points of no return

Gabriel Alfour 24 Oct 2025 11:15 UTC
36 points
8 comments 1 min read LW link
(cognition.cafe)

Introducing ControlArena: A library for running AI control experiments

Mojmir 24 Oct 2025 9:51 UTC
13 points
0 comments 3 min read LW link
(www.aisi.gov.uk)

Can we steer AI models toward safer actions by making these instrumentally useful?

Francesca Gomez 24 Oct 2025 9:18 UTC
5 points
0 comments 2 min read LW link
(www.wiserhuman.ai)

Plan 1 and Plan 2

Towards_Keeperhood 24 Oct 2025 8:18 UTC
50 points
22 comments 3 min read LW link

Guys I might be an e/acc

Taylor G. Lunt 24 Oct 2025 3:25 UTC
14 points
29 comments 4 min read LW link

How an AI company CEO could quietly take over the world

Alex Kastner 23 Oct 2025 23:33 UTC
52 points
13 comments 11 min read LW link

Worlds Where Iterative Design Succeeds?

Max Harms 23 Oct 2025 22:14 UTC
23 points
5 comments 8 min read LW link

Automated real time monitoring and orchestration of coding agents

23 Oct 2025 22:12 UTC
8 points
0 comments 2 min read LW link
(fulcrumresearch.ai)

Reminder: Morality is unsolved

Jesper L. 23 Oct 2025 21:42 UTC
27 points
45 comments 3 min read LW link

The main way I’ve seen people turn ideologically crazy [Linkpost]

Noosphere89 23 Oct 2025 20:09 UTC
123 points
22 comments 8 min read LW link
(andymasley.substack.com)

Empirical Partial Derivatives

sonicrocketman 23 Oct 2025 17:54 UTC
8 points
0 comments 3 min read LW link
(brianschrader.com)

Building a different kind of personal intelligence

Rebecca Dai 23 Oct 2025 17:45 UTC
7 points
0 comments 9 min read LW link
(rebeccadai.substack.com)

Beliefs about formal methods and AI safety

Quinn 23 Oct 2025 16:43 UTC
32 points
0 comments 5 min read LW link

AI #139: The Overreach Machines

Zvi 23 Oct 2025 15:30 UTC
35 points
5 comments 52 min read LW link
(thezvi.wordpress.com)

Should AI Developers Remove Discussion of AI Misalignment from AI Training Data?

Alek Westover 23 Oct 2025 15:12 UTC
43 points
3 comments 9 min read LW link

SecureBio is Hiring Software Engineers

jefftk 23 Oct 2025 14:10 UTC
21 points
0 comments 1 min read LW link
(www.jefftk.com)

Is terminal lucidity real?

Ariel Zeleznikow-Johnston 23 Oct 2025 11:40 UTC
20 points
0 comments 1 min read LW link
(open.substack.com)

A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring

Wuschel Schulz 23 Oct 2025 11:34 UTC
37 points
5 comments 4 min read LW link
(arxiv.org)

Differences in Alignment Behaviour between Single-Agent and Multi-Agent AI Systems

23 Oct 2025 11:17 UTC
7 points
3 comments 5 min read LW link

LW Psychosis

Annabelle 23 Oct 2025 8:12 UTC
18 points
10 comments 3 min read LW link

Announcing the Futurekind Winter Fellowship 2025/6

Aditya S 23 Oct 2025 5:40 UTC
1 point
0 comments 4 min read LW link

Learning to Interpret Weight Differences in Language Models

avichal 23 Oct 2025 3:55 UTC
89 points
2 comments 5 min read LW link
(arxiv.org)

AGI’s Last Bottlenecks

adamk 23 Oct 2025 3:28 UTC
17 points
2 comments 9 min read LW link

Statement on Superintelligence - FLI Open Letter

plex 22 Oct 2025 22:26 UTC
59 points
0 comments 1 min read LW link
(superintelligence-statement.org)

The Doomers Were Right

Algon 22 Oct 2025 22:18 UTC
204 points
26 comments 3 min read LW link

Technical Acceleration Methods for AI Safety: Summary from October 2025 Symposium

Martin Leitgab 22 Oct 2025 21:33 UTC
25 points
2 comments 6 min read LW link

Why AI alignment matters today

Mislav Jurić 22 Oct 2025 21:27 UTC
6 points
0 comments 4 min read LW link

Any corrigibility naysayers outside of MIRI?

Max Harms 22 Oct 2025 21:26 UTC
28 points
24 comments 1 min read LW link

Which side of the AI safety community are you in?

Max Tegmark 22 Oct 2025 21:17 UTC
141 points
88 comments 2 min read LW link

Homomorphically encrypted consciousness and its implications

jessicata 22 Oct 2025 20:27 UTC
35 points
48 comments 12 min read LW link
(unstableontology.com)

Dead-switches as AI safety tools

Jesper L. 22 Oct 2025 19:57 UTC
2 points
6 comments 5 min read LW link

Consider donating to AI safety champion Scott Wiener

Eric Neyman 22 Oct 2025 18:40 UTC
133 points
9 comments 18 min read LW link
(ericneyman.wordpress.com)

Postrationality: An Oral History

Gordon Seidoh Worley 22 Oct 2025 16:10 UTC
44 points
4 comments 30 min read LW link
(www.uncertainupdates.com)