How Go Players Disempower Themselves to AI

Ashe Vazquez Nuñez
1 May 2026 23:24 UTC
692 points
77 comments
8 min read

The Owned Ones

Eliezer Yudkowsky
12 May 2026 17:56 UTC
367 points
51 comments
6 min read

Irretrievability; or, Murphy’s Curse of Oneshotness upon ASI

Eliezer Yudkowsky
4 May 2026 22:11 UTC
367 points
132 comments
22 min read

Women should be able to open things

KatjaGrace
21 May 2026 3:50 UTC
338 points
134 comments
2 min read
(worldspiritsockpuppet.com)

Mnemonic portraits for 19,023 human genes

Brinedew
28 May 2026 22:16 UTC
336 points
27 comments
15 min read

It’s nice of you to worry about me, but I really do have a life

Viliam
4 May 2026 21:14 UTC
331 points
61 comments
4 min read

Models finding software vulnerabilities is not the primary source of cybersecurity risk

lc
14 May 2026 3:39 UTC
308 points
23 comments
2 min read

Bad Problems Don’t Stop Being Bad Because Somebody’s Wrong About Fault Analysis

Linch
9 May 2026 1:30 UTC
264 points
74 comments
3 min read

x-risk-themed

kave
6 May 2026 15:16 UTC
218 points
20 comments
3 min read
(kaverennedy.substack.com)

Natural Language Autoencoders Produce Unsupervised Explanations of LLM Activations

7 May 2026 20:21 UTC
213 points
35 comments
8 min read

A relatively brief explanation of Boltzmann Brains

Eliezer Yudkowsky
16 May 2026 21:19 UTC
206 points
154 comments
4 min read

MATS 9 Retrospective & Advice

beyarkay (Boyd Kane)
15 May 2026 12:30 UTC
198 points
11 comments
18 min read
(boydkane.com)

Empowerment, corrigibility, etc. are simple abstractions (of a messed-up ontology)

Steven Byrnes
11 May 2026 17:48 UTC
188 points
70 comments
16 min read

Trees are mostly made of air and a generalizable lesson for AI safety

Zephaniah Roe
29 May 2026 4:08 UTC
166 points
28 comments
4 min read

A Year Late, Claude Finally Beats Pokémon

Julian Bradshaw
16 May 2026 7:05 UTC
162 points
12 comments
9 min read

[Linkpost] Interpreting Language Model Parameters

5 May 2026 17:37 UTC
162 points
2 comments
2 min read
(www.goodfire.ai)

Dairy cows make their misery expensive (but their calves can’t)

Elizabeth
3 May 2026 19:20 UTC
159 points
1 comment
6 min read
(acesounderglass.com)

Cognitive Security as an AI Safety Cause Area

jsteinhardt
25 May 2026 18:30 UTC
155 points
16 comments
2 min read

The Iliad Intensive Course Materials

11 May 2026 18:55 UTC
152 points
4 comments
13 min read
(docs.google.com)

Automated Alignment is Harder Than You Think

14 May 2026 22:01 UTC
143 points
5 comments
3 min read
(arxiv.org)

The Darwinian Honeymoon—Why I am not as impressed by human progress as I used to be

Elias Schmied
10 May 2026 15:55 UTC
138 points
23 comments
4 min read

theory uplift differentially benefits safety & is underleveraged

yudhister
20 May 2026 21:43 UTC
132 points
14 comments
1 min read

You Are Not Immune To Mode Collapse

J Bostock
2 May 2026 19:57 UTC
127 points
18 comments
4 min read
(jbostock.substack.com)

Taking woo seriously but not literally

Kaj_Sotala
4 May 2026 13:36 UTC
123 points
27 comments
23 min read
(kajsotala.substack.com)

Donating 80% While It Still Counts

jefftk
26 May 2026 1:30 UTC
123 points
8 comments
6 min read
(www.jefftk.com)

Contra Wentworth on Physical Attractiveness for Men

Gretta Duleba
26 May 2026 23:20 UTC
122 points
25 comments
8 min read

Convergent Abstraction Hypothesis

Jan_Kulveit
15 May 2026 0:04 UTC
122 points
20 comments
6 min read

Negation Neglect: When models fail to learn negations in training

18 May 2026 18:37 UTC
119 points
37 comments
8 min read

Claude, Author of the Humanitas

Linch
26 May 2026 16:05 UTC
118 points
41 comments
16 min read

Optimisation: Selective versus Predictive

Raymond Douglas
12 May 2026 14:03 UTC
117 points
15 comments
3 min read

Voters are surprisingly open to talking about AI risk

less_raichu
13 May 2026 14:08 UTC
116 points
11 comments
3 min read

Incriminating misaligned AI models via distillation

15 May 2026 21:43 UTC
115 points
12 comments
5 min read

Many individual CEVs are probably quite bad

Viliam
6 May 2026 20:18 UTC
109 points
32 comments
3 min read

Synthetic Persona Pretraining: Alignment from Token Zero

20 May 2026 14:16 UTC
109 points
26 comments
17 min read

Implications Of Predicting The Next Token

jdp
19 May 2026 22:17 UTC
108 points
6 comments
31 min read
(minihf.com)

Risk from fitness-seeking AIs: mechanisms and mitigations

Alex Mallen
1 May 2026 17:42 UTC
107 points
0 comments
32 min read

The AI Industrial Explosion — Part 1: Maximum growth rates with current production methods

djbinder
4 May 2026 15:32 UTC
106 points
11 comments
12 min read
(defensesindepth.bio)

International Law Cannot Prevent Extinction Either

Sausage Vector Machine
9 May 2026 22:34 UTC
102 points
16 comments
5 min read

Try, even if they have you cold

WalterL
7 May 2026 17:19 UTC
102 points
14 comments
2 min read

Don’t be too Clever to Take Obvious Advice

Hide
15 May 2026 3:01 UTC
95 points
26 comments
2 min read
(hidefromit.substack.com)

Who Got Breasts First and How We Got Them

rba
11 May 2026 13:11 UTC
94 points
28 comments
10 min read

Your rights when flying to Europe

Yair Halberstadt
5 May 2026 19:17 UTC
92 points
14 comments
5 min read

Taxing Small Cars To Improve MPG

jefftk
24 May 2026 21:50 UTC
91 points
11 comments
2 min read
(www.jefftk.com)

Will we really put data centers in space?

22 May 2026 23:51 UTC
91 points
23 comments
5 min read
(www.forethought.org)

Claude is Now Alignment-Pretrained

RogerDearnaley
13 May 2026 23:19 UTC
87 points
9 comments
1 min read
(www.anthropic.com)

Bringing More Expertise to Bear on Alignment

8 May 2026 10:29 UTC
87 points
1 comment
8 min read

Mechanistic estimation for wide random MLPs

Jacob_Hilton
7 May 2026 16:20 UTC
85 points
5 comments
5 min read
(www.alignment.org)

There is no evidence you should reapply sunscreen every 2 hours.

Hide
6 May 2026 9:19 UTC
85 points
14 comments
9 min read
(hidefromit.substack.com)

What am I, if not an AI?

makiba
21 May 2026 13:14 UTC
84 points
14 comments
7 min read

The ballad of TIGIT

Abhishaike Mahajan
27 May 2026 17:04 UTC
84 points
1 comment
9 min read