Heroic Responsibility

johnswentworth · 4 Nov 2025 23:26 UTC
80 points
31 comments · 2 min read · LW link

[Linkpost] Competing Motivations: When More Incentives Lead To Less Effort

Gunnar_Zarncke · 4 Nov 2025 23:02 UTC
11 points
0 comments · 1 min read · LW link
(x.com)

Not Over Or Under Indexed

Screwtape · 4 Nov 2025 22:54 UTC
41 points
0 comments · 6 min read · LW link

Being “Usefully Concrete”

Raemon · 4 Nov 2025 22:15 UTC
44 points
4 comments · 4 min read · LW link

Legible vs. Illegible AI Safety Problems

Wei Dai · 4 Nov 2025 21:39 UTC
393 points
96 comments · 2 min read · LW link

Parsing Validation

Dentosal · 4 Nov 2025 21:19 UTC
5 points
1 comment · 3 min read · LW link

A/B testing could lead LLMs to retain users instead of helping them

Daniel Paleka · 4 Nov 2025 19:30 UTC
28 points
0 comments · 4 min read · LW link
(newsletter.danielpaleka.com)

OpenAI: The Battle of the Board: Ilya’s Testimony

Zvi · 4 Nov 2025 19:30 UTC
44 points
1 comment · 5 min read · LW link
(thezvi.wordpress.com)

Berkeley Secular Solstice Weekend

Raemon · 4 Nov 2025 18:37 UTC
22 points
18 comments · 1 min read · LW link

Modeling the geopolitics of AI development

4 Nov 2025 17:31 UTC
46 points
0 comments · 2 min read · LW link
(ai-scenarios.com)

Thoughts by a non-economist on AI and economics

Boaz Barak · 4 Nov 2025 17:06 UTC
42 points
2 comments · 14 min read · LW link

GDM: Consistency Training Helps Limit Sycophancy and Jailbreaks in Gemini 2.5 Flash

4 Nov 2025 16:25 UTC
53 points
2 comments · 6 min read · LW link
(arxiv.org)

AI Safety Camp 11

4 Nov 2025 14:56 UTC
8 points
0 comments · 15 min read · LW link

Keeping Ants and Spotting Queens

Morpheus · 4 Nov 2025 13:49 UTC
12 points
0 comments · 2 min read · LW link

Letter to a close friend

Alexandre Variengien · 4 Nov 2025 13:17 UTC
9 points
0 comments · 2 min read · LW link
(alexandrevariengien.com)

Open-weight training practices and implications for CoT monitorability

4 Nov 2025 10:49 UTC
20 points
0 comments · 9 min read · LW link

Free Learning in Today’s Society: Some Personal Experiences and Reflections

L.M.Sherlock · 4 Nov 2025 10:30 UTC
30 points
1 comment · 41 min read · LW link
(lmsherlock.substack.com)

A prayer for engaging in conflict

TsviBT · 4 Nov 2025 8:19 UTC
68 points
0 comments · 2 min read · LW link

Rainbows, fractals, and crumpled paper: Hölder continuity

Adam Scherlis · 4 Nov 2025 8:01 UTC
10 points
0 comments · 3 min read · LW link
(adam.scherl.is)

Taste of food

Mikhail Samin · 4 Nov 2025 7:47 UTC
22 points
0 comments · 3 min read · LW link
(mikhailsamin.substack.com)

Retrospective on US govt whistleblower guide and DB

samuelshadrach · 4 Nov 2025 7:30 UTC
4 points
0 comments · 2 min read · LW link
(samuelshadrach.com)

US Govt Whistleblower Guide

samuelshadrach · 4 Nov 2025 7:22 UTC
1 point
6 comments · 7 min read · LW link
(samuelshadrach.com)

US Govt Whistleblower Database

samuelshadrach · 4 Nov 2025 7:20 UTC
6 points
6 comments · 33 min read · LW link
(samuelshadrach.com)

The Mortifying Ordeal of Knowing Thyself

Philipreal · 4 Nov 2025 5:16 UTC
6 points
0 comments · 3 min read · LW link

Build the life you actually want

mingyuan · 4 Nov 2025 4:50 UTC
58 points
3 comments · 3 min read · LW link
(mingyuan.substack.com)

Research Reflections

abramdemski · 4 Nov 2025 4:33 UTC
97 points
3 comments · 3 min read · LW link

I ate bear fat with honey and salt flakes, to prove a point

aggliu · 4 Nov 2025 2:00 UTC
326 points
53 comments · 5 min read · LW link
(signoregalilei.com)

Questions About Outperforming Common Wisdom

Notelrac · 4 Nov 2025 0:38 UTC
2 points
0 comments · 2 min read · LW link

Parleying with the Principled

Screwtape · 4 Nov 2025 0:23 UTC
14 points
0 comments · 8 min read · LW link

The Zen Of Maxent As A Generalization Of Bayes Updates

4 Nov 2025 0:02 UTC
63 points
8 comments · 7 min read · LW link

Sam Altman’s track record of manipulation: some quotes from Karen Hao’s “Empire of AI”

i_am_nuts · 3 Nov 2025 22:25 UTC
21 points
3 comments · 5 min read · LW link
(iamnuts.substack.com)

Comparative advantage & AI

Simon Lermen · 3 Nov 2025 21:50 UTC
120 points
28 comments · 4 min read · LW link

Just complaining about LLM sycophancy (filler episode)

Dentosal · 3 Nov 2025 20:33 UTC
7 points
0 comments · 3 min read · LW link

The Tale of the Top-Tier Intellect

Eliezer Yudkowsky · 3 Nov 2025 20:21 UTC
123 points
68 comments · 35 min read · LW link

Metaphors for Biology: Sizes

Niko McCarty · 3 Nov 2025 19:40 UTC
1 point
0 comments · 7 min read · LW link
(press.asimov.com)

AI Safety Unconference, Melbourne 2025

mjkerrison · 3 Nov 2025 19:36 UTC
2 points
0 comments · 1 min read · LW link

[Question] High-Resistance Systems to Change: Can a Political Strategy Apply to Personal Change?

FireBrito de S. Gabriel · 3 Nov 2025 19:09 UTC
4 points
0 comments · 1 min read · LW link

Leaving Open Philanthropy, going to Anthropic

Joe Carlsmith · 3 Nov 2025 17:38 UTC
113 points
30 comments · 18 min read · LW link

Red Heart

PeterMcCluskey · 3 Nov 2025 17:32 UTC
30 points
0 comments · 3 min read · LW link
(bayesianinvestor.com)

Falling AI Costs and the Proliferation of Offensive Capabilities

Felix Choussat · 3 Nov 2025 17:32 UTC
15 points
2 comments · 24 min read · LW link

The EU could hold AI capabilities development hostage if they wanted to

beyarkay (Boyd Kane) · 3 Nov 2025 16:54 UTC
3 points
0 comments · 1 min read · LW link
(boydkane.com)

What’s up with Anthropic predicting AGI by early 2027?

ryan_greenblatt · 3 Nov 2025 16:45 UTC
162 points
16 comments · 20 min read · LW link

To improve Rationality, create Situations

abstractapplic · 3 Nov 2025 16:10 UTC
18 points
3 comments · 3 min read · LW link

The Unreasonable Effectiveness of Fiction

Raelifin · 3 Nov 2025 15:35 UTC
220 points
29 comments · 8 min read · LW link
(raelifin.substack.com)

Crime and Punishment #1

Zvi · 3 Nov 2025 15:30 UTC
51 points
4 comments · 45 min read · LW link
(thezvi.wordpress.com)

Solving a problem with mindware

Alexandre Variengien · 3 Nov 2025 15:17 UTC
10 points
0 comments · 2 min read · LW link
(alexandrevariengien.com)

Publishing academic papers on transformative AI is a nightmare

Jakub Growiec · 3 Nov 2025 13:04 UTC
167 points
10 comments · 4 min read · LW link

Pepperoni and the end of morality

ceselder · 3 Nov 2025 10:15 UTC
1 point
2 comments · 2 min read · LW link

Trying to understand my own cognitive edge

Wei Dai · 3 Nov 2025 8:49 UTC
74 points
17 comments · 4 min read · LW link

There’s some chance oral herpes is pretty bad for you?

GradientDissenter · 3 Nov 2025 6:30 UTC
32 points
4 comments · 6 min read · LW link