The behavioral selection model for predicting AI motivations

Alex Mallen and Buck
4 Dec 2025 18:46 UTC
137 points
13 comments
16 min read
LW link

Why people like your quick bullshit takes better than your high-effort posts

eukaryote
28 Nov 2025 20:12 UTC
221 points
27 comments
5 min read
LW link
(eukaryotewritesblog.com)

Alignment remains a hard, unsolved problem

evhub
27 Nov 2025 8:45 UTC
337 points
94 comments
14 min read
LW link

Natural emergent misalignment from reward hacking in production RL

evhub, Monte M, Benjamin Wright and Jonathan Uesato
21 Nov 2025 20:00 UTC
258 points
32 comments
9 min read
LW link

How Colds Spread

RobertM
18 Nov 2025 5:25 UTC
233 points
28 comments
10 min read
LW link

Paranoia: A Beginner’s Guide

habryka
13 Nov 2025 7:56 UTC
333 points
70 comments
13 min read
LW link

Legible vs. Illegible AI Safety Problems

Wei Dai
4 Nov 2025 21:39 UTC
361 points
93 comments
2 min read
LW link

The Unreasonable Effectiveness of Fiction

Raelifin
3 Nov 2025 15:35 UTC
208 points
26 comments
8 min read
LW link
(raelifin.substack.com)

Human Values ≠ Goodness

johnswentworth
2 Nov 2025 19:24 UTC
38 points
74 comments
6 min read
LW link

LLM-generated text is not testimony

TsviBT
1 Nov 2025 14:47 UTC
104 points
86 comments
11 min read
LW link

The Memetics of AI Successionism

Jan_Kulveit
28 Oct 2025 15:04 UTC
212 points
54 comments
9 min read
LW link

Cancer has a surprising amount of detail

Abhishaike Mahajan
26 Oct 2025 20:33 UTC
127 points
18 comments
11 min read
LW link
(www.owlposting.com)

EU explained in 10 minutes

Martin Sustrik
21 Oct 2025 4:40 UTC
244 points
49 comments
8 min read
LW link
(www.250bpm.com)

The “Length” of “Horizons”

Adam Scholl
14 Oct 2025 14:48 UTC
183 points
27 comments
7 min read
LW link

The Most Common Bad Argument In These Parts

J Bostock
11 Oct 2025 16:29 UTC
242 points
60 comments
4 min read
LW link

Towards a Typology of Strange LLM Chains-of-Thought

1a3orn
9 Oct 2025 22:02 UTC
301 points
29 comments
9 min read
LW link

Do One New Thing A Day To Solve Your Problems

Algon
3 Oct 2025 17:08 UTC
208 points
28 comments
2 min read
LW link

Ethical Design Patterns

AnnaSalamon
30 Sep 2025 11:52 UTC
227 points
49 comments
20 min read
LW link

Notes on fatalities from AI takeover

ryan_greenblatt
23 Sep 2025 17:18 UTC
56 points
61 comments
8 min read
LW link

The Company Man

Tomás B.
17 Sep 2025 17:47 UTC
771 points
74 comments
18 min read
LW link