Omelas Is Perfectly Misread

Tobias H
2 Oct 2025 23:11 UTC
197 points
49 comments
5 min read

Nice-ish, smooth takeoff (with imperfect safeguards) probably kills most “classic humans” in a few decades.

Raemon
2 Oct 2025 21:03 UTC
143 points
19 comments
12 min read

The Origami Men

Tomás B.
6 Oct 2025 15:25 UTC
136 points
8 comments
16 min read

Checking in on AI-2027

Baybar
2 Oct 2025 18:46 UTC
119 points
21 comments
4 min read

Do One New Thing A Day To Solve Your Problems

Algon
3 Oct 2025 17:08 UTC
102 points
5 comments
2 min read

Where does Sonnet 4.5’s desire to “not get too comfortable” come from?

Kaj_Sotala
4 Oct 2025 10:19 UTC
91 points
16 comments
64 min read

Gradual Disempowerment Monthly Roundup

Raymond Douglas
6 Oct 2025 15:36 UTC
90 points
7 comments
6 min read

Making Your Pain Worse can Get You What You Want

Logan Riggs
5 Oct 2025 0:19 UTC
76 points
3 comments
3 min read

Bending The Curve

Zvi
7 Oct 2025 20:00 UTC
72 points
6 comments
21 min read
(thezvi.wordpress.com)

Eliciting secret knowledge from language models

2 Oct 2025 20:57 UTC
67 points
3 comments
2 min read
(arxiv.org)

The Counterfactual Quiet AGI Timeline

Davidmanheim
5 Oct 2025 9:09 UTC
64 points
5 comments
9 min read

Maybe social media algorithms don’t suck

Algon
5 Oct 2025 18:47 UTC
64 points
18 comments
3 min read

“Intelligence” → “Relentless, Creative Resourcefulness”

Raemon
7 Oct 2025 0:28 UTC
63 points
28 comments
17 min read

“Pessimization” is Just Ordinary Failure

J Bostock
1 Oct 2025 13:48 UTC
56 points
2 comments
6 min read

Recent AI Experiences

abramdemski
3 Oct 2025 19:32 UTC
54 points
1 comment
6 min read

Petri: An open-source auditing tool to accelerate AI safety research

Sam Marks
7 Oct 2025 20:39 UTC
54 points
0 comments
1 min read
(alignment.anthropic.com)

</rant> </uncharitable> </psychologizing>

Raemon
1 Oct 2025 21:20 UTC
53 points
11 comments
2 min read

We’ve automated x-risk-pilling people

Mikhail Samin
3 Oct 2025 10:26 UTC
51 points
27 comments
1 min read
(whycare.aisgf.us)

How to Feel More Alive

Logan Riggs
2 Oct 2025 15:45 UTC
47 points
2 comments
4 min read

LLMs one-box when in a “hostile telepath” version of Newcomb’s Paradox, except for the one that beat the predictor

Kaj_Sotala
6 Oct 2025 8:44 UTC
47 points
6 comments
17 min read

IABIED and Memetic Engineering

Error
3 Oct 2025 1:01 UTC
47 points
5 comments
4 min read

No, That’s Not What the Flight Costs

Max Niederman
2 Oct 2025 17:55 UTC
45 points
15 comments
1 min read
(maxniederman.com)

Lectures on statistical learning theory for alignment researchers

Vanessa Kosoy
1 Oct 2025 8:36 UTC
41 points
1 comment
1 min read
(www.youtube.com)

Claude Sonnet 4.5 Is A Very Good Model

Zvi
1 Oct 2025 18:00 UTC
40 points
2 comments
24 min read
(thezvi.wordpress.com)

Sora and The Big Bright Screen Slop Machine

Zvi
3 Oct 2025 11:40 UTC
38 points
1 comment
35 min read
(thezvi.wordpress.com)

You Should Get a Reusable Mask

jefftk
8 Oct 2025 2:40 UTC
37 points
3 comments
1 min read
(www.jefftk.com)

Some Biology Related Things I Found Interesting

Morpheus
2 Oct 2025 12:18 UTC
37 points
9 comments
2 min read

Replacing RL w/ Parameter-based Evolutionary Strategies

Logan Riggs
8 Oct 2025 1:02 UTC
37 points
3 comments
3 min read

Do Things for as Many Reasons as Possible

Philipreal
6 Oct 2025 0:28 UTC
35 points
1 comment
2 min read

Antisocial media: AI’s killer app?

David Scott Krueger (formerly: capybaralet)
3 Oct 2025 0:00 UTC
35 points
8 comments
5 min read
(therealartificialintelligence.substack.com)

[Question] Generalization and the Multiple Stage Fallacy?

Zack_M_Davis
7 Oct 2025 6:20 UTC
34 points
6 comments
3 min read

AI #136: A Song and Dance

Zvi
2 Oct 2025 13:10 UTC
33 points
3 comments
47 min read
(thezvi.wordpress.com)

AI and Cheap Weapons

Felix C.
1 Oct 2025 17:31 UTC
31 points
3 comments
23 min read

Base64Bench: How good are LLMs at base64, and why care about it?

richbc
5 Oct 2025 18:07 UTC
31 points
6 comments
11 min read

We won’t get AIs smart enough to solve alignment but too dumb to rebel

Joe Rogero
6 Oct 2025 21:49 UTC
28 points
16 comments
5 min read

LLMs are badly misaligned

Joe Rogero
5 Oct 2025 14:00 UTC
27 points
25 comments
3 min read

Making Sense of Consciousness Part 6: Perceptions of Disembodiment

sarahconstantin
3 Oct 2025 20:40 UTC
27 points
0 comments
8 min read
(sarahconstantin.substack.com)

How the NanoGPT Speedrun WR dropped by 20% in 3 months

larry-dial
5 Oct 2025 1:05 UTC
26 points
9 comments
9 min read

Medical Roundup #5

Zvi
6 Oct 2025 15:10 UTC
26 points
2 comments
26 min read
(thezvi.wordpress.com)

Excerpts from my neuroscience to-do list

Steven Byrnes
6 Oct 2025 21:05 UTC
26 points
1 comment
4 min read

Telling the Difference Between Memories & Logical Guesses

Logan Riggs
7 Oct 2025 5:46 UTC
25 points
3 comments
4 min read

What I’ve Learnt About How to Sleep

Algon
4 Oct 2025 20:52 UTC
25 points
7 comments
2 min read

But what kind of stuff can you just do?

Bastiaan
1 Oct 2025 16:58 UTC
25 points
5 comments
1 min read

Anthropic’s JumpReLU training method is really good

3 Oct 2025 15:23 UTC
22 points
0 comments
2 min read

Goodness is harder to achieve than competence

Joe Rogero
3 Oct 2025 21:32 UTC
22 points
0 comments
3 min read

Good is a smaller target than smart

Joe Rogero
3 Oct 2025 21:04 UTC
21 points
0 comments
2 min read

Intent alignment seems incoherent

Joe Rogero
7 Oct 2025 23:01 UTC
20 points
1 comment
6 min read

The quotation mark

Maxwell Peterson
5 Oct 2025 23:23 UTC
19 points
8 comments
13 min read

My Brush with Superhuman Persuasion

Ben S.
1 Oct 2025 17:50 UTC
18 points
13 comments
9 min read
(thebsdetector.substack.com)

Research Robots: When AIs Experiment on Us

Shoshannah Tekofsky
7 Oct 2025 12:10 UTC
18 points
0 comments
7 min read
(theaidigest.org)