Your Clone Wants to Kill You Because You Assumed Too Much

Algon · 15 Nov 2025 23:21 UTC
67 points
10 comments · 2 min read · LW link

Writing Hack: Write It Just Like That

eleweek · 15 Nov 2025 22:16 UTC
24 points
0 comments · 3 min read · LW link
(psychotechnology.substack.com)

AI loves octopuses

Sean Herrington · 15 Nov 2025 21:59 UTC
32 points
19 comments · 5 min read · LW link

Punctuation & Quotation Conventions

abramdemski · 15 Nov 2025 18:13 UTC
21 points
14 comments · 2 min read · LW link

Matrices map between biproducts

jessicata · 15 Nov 2025 18:05 UTC
41 points
6 comments · 5 min read · LW link
(unstableontology.com)

Don’t use the phrase “human values”

Nina Panickssery · 15 Nov 2025 16:49 UTC
60 points
10 comments · 1 min read · LW link

Halfway there; on desperation management

Dentosal · 15 Nov 2025 14:55 UTC
7 points
0 comments · 2 min read · LW link

“Middlemarch” is inane and also one of my favorite books

Ben Pace · 15 Nov 2025 7:58 UTC
45 points
1 comment · 11 min read · LW link

Just Another Five Minutes

Screwtape · 15 Nov 2025 7:47 UTC
43 points
4 comments · 5 min read · LW link

Same cognitive paints, exceedingly different mental pictures

Ruby · 15 Nov 2025 7:13 UTC
17 points
0 comments · 4 min read · LW link

A Love Song to Nicotine

eleweek · 15 Nov 2025 6:54 UTC
23 points
4 comments · 5 min read · LW link
(psychotechnology.substack.com)

Increasing returns to effort are common

habryka · 15 Nov 2025 6:53 UTC
114 points
6 comments · 7 min read · LW link

Private Latent Notation and AI-Human Alignment

Robert Shuler · 15 Nov 2025 5:47 UTC
6 points
1 comment · 6 min read · LW link

On Battle-Short: What, How, and Why Not To

Lorxus · 15 Nov 2025 5:27 UTC
4 points
0 comments · 3 min read · LW link
(tiled-with-pentagons.blogspot.com)

The Flaw in the Paperclip Maximizer Thought Experiment

Taylor G. Lunt · 15 Nov 2025 4:46 UTC
3 points
0 comments · 2 min read · LW link

“But You’d Like To Feel Companionate Love, Right? … Right?”

johnswentworth · 15 Nov 2025 4:28 UTC
70 points
25 comments · 3 min read · LW link

Generation Ship: A Protest Song For PauseAI

LoganStrohl · 15 Nov 2025 1:17 UTC
43 points
3 comments · 1 min read · LW link

Will AI systems drift into misalignment?

joshc · 15 Nov 2025 1:03 UTC
15 points
3 comments · 15 min read · LW link

Everyday Clean Air

jefftk · 15 Nov 2025 1:00 UTC
33 points
5 comments · 2 min read · LW link
(www.jefftk.com)

Some Sun Tsu quotes sound like they’re actually about debates/epistemics

depressurize · 15 Nov 2025 0:41 UTC
6 points
2 comments · 1 min read · LW link

What are your impossible problems?

Raemon · 15 Nov 2025 0:28 UTC
28 points
24 comments · 1 min read · LW link

Prediction markets for social deduction games

Mikhail Samin · 15 Nov 2025 0:18 UTC
10 points
0 comments · 2 min read · LW link
(mikhailsamin.substack.com)

List of great filk songs

Algon · 15 Nov 2025 0:17 UTC
26 points
5 comments · 2 min read · LW link

a sketch of how we might go about getting basins of corrigibility from RL

williawa · 14 Nov 2025 22:10 UTC
10 points
0 comments · 4 min read · LW link

Lambda Calculus Prior

abramdemski · 14 Nov 2025 21:29 UTC
25 points
3 comments · 4 min read · LW link

AI Craziness: Additional Suicide Lawsuits and The Fate of GPT-4o

Zvi · 14 Nov 2025 20:20 UTC
45 points
0 comments · 7 min read · LW link
(thezvi.wordpress.com)

Understanding and Controlling LLM Generalization

Daniel Tan · 14 Nov 2025 16:58 UTC
43 points
3 comments · 1 min read · LW link

Lorxus Does Halfhaven: 11/08~11/14

Lorxus · 14 Nov 2025 13:23 UTC
5 points
0 comments · 2 min read · LW link
(tiled-with-pentagons.blogspot.com)

Finding Balance & Opportunity in the Holiday Flux [free public workshop]

teebarnett · 14 Nov 2025 10:53 UTC
2 points
2 comments · 1 min read · LW link

From Anthony: Control Inversion

Gabriel Alfour · 14 Nov 2025 9:36 UTC
10 points
0 comments · 1 min read · LW link
(control-inversion.ai)

LLM would have said this better, and without all these typos too

Dentosal · 14 Nov 2025 9:33 UTC
8 points
0 comments · 2 min read · LW link

The Charge of the Hobby Horse

TsviBT · 14 Nov 2025 8:17 UTC
65 points
46 comments · 5 min read · LW link

The Eightfold Path To Enlightened Disagreement

dreeves · 14 Nov 2025 7:57 UTC
9 points
0 comments · 3 min read · LW link

10 Types of LessWrong Post

Ben Pace · 14 Nov 2025 7:56 UTC
52 points
2 comments · 4 min read · LW link

Don’t let people buy credit with borrowed funds

habryka · 14 Nov 2025 7:51 UTC
111 points
43 comments · 10 min read · LW link

Everyone has a plan until they get lied to the face

Screwtape · 14 Nov 2025 7:22 UTC
183 points
33 comments · 7 min read · LW link

Notes on the book “Talent”

Nina Panickssery · 14 Nov 2025 5:43 UTC
25 points
1 comment · 15 min read · LW link
(blog.ninapanickssery.com)

[Question] How do you read Less Wrong?

Mitchell_Porter · 14 Nov 2025 5:17 UTC
20 points
15 comments · 1 min read · LW link

Thoughts are surprisingly detailed and remarkably autonomous

Ruby · 14 Nov 2025 5:00 UTC
24 points
1 comment · 3 min read · LW link

Halfhaven Digest #4

Taylor G. Lunt · 14 Nov 2025 4:16 UTC
9 points
0 comments · 2 min read · LW link

AI Corrigibility Debate: Max Harms vs. Jeremy Gillen

14 Nov 2025 4:09 UTC
46 points
1 comment · 75 min read · LW link
(doomdebates.com)

Types of systems that could be useful for agent foundations

Alex_Altair · 14 Nov 2025 3:54 UTC
46 points
3 comments · 5 min read · LW link

The rare, deadly virus lurking in the Southwest US, and the bigger picture

eukaryote · 14 Nov 2025 3:27 UTC
56 points
1 comment · 17 min read · LW link
(eukaryotewritesblog.com)

Tell people as early as possible it’s not going to work out

habryka · 14 Nov 2025 2:21 UTC
152 points
17 comments · 2 min read · LW link

Questioning Computationalism

abramdemski · 14 Nov 2025 1:30 UTC
22 points
7 comments · 19 min read · LW link

Orient Speed in the 21st Century

Raemon · 14 Nov 2025 1:12 UTC
53 points
14 comments · 3 min read · LW link
(thehumanspirit.substack.com)

Evaluation Avoidance: How Humans and AIs Hack Reward by Disabling Evaluation Instead of Gaming Metrics

Johannes C. Mayer · 14 Nov 2025 0:39 UTC
19 points
0 comments · 3 min read · LW link

Self-interpretability: LLMs can describe complex internal processes that drive their decisions

14 Nov 2025 0:18 UTC
12 points
0 comments · 4 min read · LW link

(Fantasy) → (Planning): A Core Mental Move For Agentic Humans?

johnswentworth · 14 Nov 2025 0:13 UTC
70 points
6 comments · 2 min read · LW link

[Question] How does one tell apart results in ethics and decision theory?

StanislavKrym · 13 Nov 2025 23:42 UTC
6 points
0 comments · 2 min read · LW link