When does Claude sabotage code? An Agentic Misalignment follow-up

Nathan Delisle · 9 Nov 2025 23:11 UTC
18 points
0 comments · 5 min read · LW link

Manifest X DC Opening Benediction - Making Friends Along the Way

JohnofCharleston · 9 Nov 2025 23:10 UTC
42 points
0 comments · 4 min read · LW link

Seed Of Oasis

0xA · 9 Nov 2025 22:00 UTC
1 point
0 comments · 170 min read · LW link

Learning information which is full of spiders

Screwtape · 9 Nov 2025 21:05 UTC
59 points
8 comments · 9 min read · LW link

Introspection or confusion?

Victor Godet · 9 Nov 2025 20:53 UTC
43 points
3 comments · 4 min read · LW link

Relearning how to be human

mingyuan · 9 Nov 2025 20:20 UTC
18 points
0 comments · 3 min read · LW link
(mingyuan.substack.com)

Condensation

abramdemski · 9 Nov 2025 19:08 UTC
156 points
16 comments · 16 min read · LW link

Omniscience one bit at a time: Chapter 1

Dentosal · 9 Nov 2025 18:10 UTC
23 points
0 comments · 3 min read · LW link

We’re Not The Center of the Moral Universe

Bentham's Bulldog · 9 Nov 2025 16:46 UTC
−6 points
9 comments · 7 min read · LW link

Gradual Disempowerment Monthly Roundup #2

Raymond Douglas · 9 Nov 2025 14:43 UTC
46 points
1 comment · 6 min read · LW link

We’re Already Living in a Sci-Fi World

David Bravo · 9 Nov 2025 14:20 UTC
7 points
0 comments · 4 min read · LW link

AI hasn’t seen widespread adoption because the labs are focusing on automating AI R&D

beyarkay (Boyd Kane) · 9 Nov 2025 14:02 UTC
10 points
4 comments · 2 min read · LW link
(boydkane.com)

Structural Estimates of Human Computation

Nicolas Villarreal · 9 Nov 2025 13:46 UTC
−11 points
0 comments · 6 min read · LW link

Heroic responsibility is morally neutral

Dumbledore's Army · 9 Nov 2025 13:15 UTC
18 points
2 comments · 1 min read · LW link

Problems I’ve Tried to Legibilize

Wei Dai · 9 Nov 2025 10:27 UTC
143 points
24 comments · 2 min read · LW link

The General Social Survey and the ACX Survey

Screwtape · 9 Nov 2025 7:54 UTC
15 points
1 comment · 5 min read · LW link

There should be unicorns

Mikhail Samin · 9 Nov 2025 7:38 UTC
17 points
0 comments · 2 min read · LW link
(mikhailsamin.substack.com)

One Shot Singalonging is an attitude, not a skill or a song-difficulty-level*

Raemon · 9 Nov 2025 7:29 UTC
54 points
11 comments · 7 min read · LW link

Where Our Engineering Education Went Wrong

L.M.Sherlock · 9 Nov 2025 6:02 UTC
26 points
2 comments · 9 min read · LW link
(lmsherlock.substack.com)

A sonnet, a sestina, a villanelle

mingyuan · 9 Nov 2025 5:20 UTC
27 points
0 comments · 2 min read · LW link
(mingyuan.substack.com)

n-ary Huffman coding

Adam Scherlis · 9 Nov 2025 5:05 UTC
18 points
0 comments · 3 min read · LW link
(adam.scherl.is)

Liouville’s Theorem and the Second Law

Algon · 9 Nov 2025 0:00 UTC
26 points
4 comments · 2 min read · LW link

Insofar As I Think LLMs “Don’t Really Understand Things”, What Do I Mean By That?

johnswentworth · 8 Nov 2025 23:37 UTC
90 points
15 comments · 3 min read · LW link

Why AC is cheap, but AC repair is a luxury

Annapurna · 8 Nov 2025 23:01 UTC
3 points
0 comments · 1 min read · LW link
(a16z.substack.com)

Myopia Mythology

abramdemski · 8 Nov 2025 22:22 UTC
38 points
3 comments · 3 min read · LW link

Omniscaling to MNIST

cloud · 8 Nov 2025 19:42 UTC
100 points
3 comments · 10 min read · LW link

Can Models be Evaluation Aware Without Explicit Verbalization?

8 Nov 2025 18:26 UTC
26 points
10 comments · 8 min read · LW link

Cake vs Lack of Cake

Notelrac · 8 Nov 2025 18:10 UTC
1 point
0 comments · 2 min read · LW link

Unexpected Things that are People

Ben Goldhaber · 8 Nov 2025 17:12 UTC
209 points
11 comments · 4 min read · LW link

A humanist critique of technological determinism

8 Nov 2025 15:27 UTC
10 points
0 comments · 6 min read · LW link

Five very good reasons to not write down literally every single thought you have

ceselder · 8 Nov 2025 10:22 UTC
18 points
2 comments · 4 min read · LW link

Review: Parsifal at the SF Opera

Adam Scherlis · 8 Nov 2025 8:25 UTC
10 points
0 comments · 6 min read · LW link
(adam.scherl.is)

Escalation and perception

TsviBT · 8 Nov 2025 8:12 UTC
69 points
0 comments · 12 min read · LW link

The Snaw

Screwtape · 8 Nov 2025 6:42 UTC
23 points
5 comments · 2 min read · LW link

Augustine of Hippo’s Handbook on Faith, Hope, and Love in Latin (or: Claude as Pandoc++)

DanielFilan · 8 Nov 2025 6:31 UTC
8 points
2 comments · 1 min read · LW link
(danielfilan.com)

Comparing Payor & Löb

abramdemski · 8 Nov 2025 5:40 UTC
54 points
1 comment · 3 min read · LW link

Mourning a life without AI

Nikola Jurkovic · 8 Nov 2025 4:44 UTC
194 points
63 comments · 6 min read · LW link
(nikolajurkovic.substack.com)

Two Times I Was Surprised By My Own Values

johnswentworth · 8 Nov 2025 3:56 UTC
13 points
1 comment · 3 min read · LW link

The solution to akrasia apparently isn’t not having any goals

Dentosal · 8 Nov 2025 3:53 UTC
1 point
0 comments · 3 min read · LW link

Anthropic & Dario’s dream

Simon Lermen · 8 Nov 2025 1:19 UTC
55 points
1 comment · 5 min read · LW link

Against “You can just do things”

Zephaniah Roe · 8 Nov 2025 0:58 UTC
61 points
9 comments · 3 min read · LW link

Agent Foundations: Paradigmatizing in Math and Science

TristanTrim · 8 Nov 2025 0:37 UTC
3 points
0 comments · 9 min read · LW link

From Thermodynamics to Sora: A Comprehensive Introduction to Denoising Diffusion for Video Generation

phenomanon · 7 Nov 2025 23:36 UTC
5 points
0 comments · 15 min read · LW link

Pythia

plex · 7 Nov 2025 23:31 UTC
99 points
31 comments · 4 min read · LW link

Start an AI safety group with the Pathfinder Fellowship

Topaz · 7 Nov 2025 21:05 UTC
2 points
0 comments · 1 min read · LW link

AI is not inevitable.

David Scott Krueger · 7 Nov 2025 20:31 UTC
29 points
2 comments · 3 min read · LW link
(therealartificialintelligence.substack.com)

Announcing “Computational Functionalism Debate” (soliciting paid feedback): Test your intuitions about consciousness

ChrisPercy · 7 Nov 2025 20:12 UTC
4 points
0 comments · 3 min read · LW link

The Hawley-Blumenthal AI Risk Evaluation Act

David Abecassis · 7 Nov 2025 19:09 UTC
42 points
0 comments · 2 min read · LW link
(techgov.intelligence.org)

Secular Solstice Roundup 2025

datawitch · 7 Nov 2025 19:03 UTC
14 points
4 comments · 1 min read · LW link

The Decalogue For Aligned AI.

theophilus tabuke · 7 Nov 2025 18:47 UTC
1 point
0 comments · 1 min read · LW link