Burnout, depression, and AI safety: some concrete mental health strategies

KatWoods · 26 Dec 2025 19:52 UTC
44 points
2 comments · 4 min read · LW link

How hard should I prioritize having kids?

Recurrented · 26 Dec 2025 19:29 UTC
11 points
7 comments · 1 min read · LW link

The moral critic of the AI industry—a Q&A with Holly Elmore

Mordechai Rorvig · 26 Dec 2025 17:49 UTC
7 points
0 comments · 2 min read · LW link
(www.foommagazine.org)

Apply for Alignment Mentorship from TurnTrout and Alex Cloud

26 Dec 2025 17:20 UTC
40 points
0 comments · 2 min read · LW link
(turntrout.com)

Measuring no-CoT math time horizon (single forward pass)

ryan_greenblatt · 26 Dec 2025 16:37 UTC
212 points
18 comments · 3 min read · LW link

Whole Brain Emulation as an Anchor for AI Welfare

sturb · 26 Dec 2025 14:45 UTC
52 points
13 comments · 6 min read · LW link

Childhood and Education #16: Letting Kids Be Kids

Zvi · 26 Dec 2025 13:50 UTC
55 points
3 comments · 18 min read · LW link
(thezvi.wordpress.com)

Regression by Composition

Anders_H · 26 Dec 2025 12:18 UTC
13 points
0 comments · 1 min read · LW link
(rss.org.uk)

Unknown Knowns: Five Ideas You Can’t Unsee

Linch · 25 Dec 2025 23:28 UTC
76 points
37 comments · 6 min read · LW link
(linch.substack.com)

There’s Room in the Manger

Celer · 25 Dec 2025 18:00 UTC
20 points
0 comments · 2 min read · LW link
(keller.substack.com)

Call for Science of Eval Awareness (+ Research Directions)

Igor Ivanov · 25 Dec 2025 17:26 UTC
29 points
23 comments · 5 min read · LW link

AI #148: Christmas Break

Zvi · 25 Dec 2025 14:00 UTC
30 points
4 comments · 39 min read · LW link
(thezvi.wordpress.com)

Clipboard Normalization

jefftk · 25 Dec 2025 13:50 UTC
105 points
9 comments · 1 min read · LW link
(www.jefftk.com)

The Intelligence Axis: A Functional Typology

Anurag · 25 Dec 2025 12:18 UTC
3 points
0 comments · 5 min read · LW link

Honorable AI

Kaarel · 24 Dec 2025 21:20 UTC
37 points
23 comments · 41 min read · LW link

Catch-Up Algorithmic Progress Might Actually be 60× per Year

Aaron_Scher · 24 Dec 2025 21:03 UTC
92 points
16 comments · 10 min read · LW link

The Ones who Feed their Children

xhnk7jwvqj-max · 24 Dec 2025 19:15 UTC
10 points
2 comments · 3 min read · LW link

[Book Review] “Reality+” by David Chalmers

lsdev · 24 Dec 2025 19:14 UTC
4 points
0 comments · 2 min read · LW link

Kids and Space

jefftk · 24 Dec 2025 15:30 UTC
73 points
5 comments · 3 min read · LW link
(www.jefftk.com)

Zvi’s 2025 In Movies

Zvi · 24 Dec 2025 13:30 UTC
27 points
1 comment · 11 min read · LW link
(thezvi.wordpress.com)

[Question] Acausal communication between isolated universes through simulation

Horosphere · 24 Dec 2025 11:46 UTC
13 points
14 comments · 1 min read · LW link

Methodological considerations in making malign initializations for control research

Alek Westover · 24 Dec 2025 1:18 UTC
10 points
0 comments · 13 min read · LW link

Immunodeficiency to Parasitic AI

Andrii Shportko · 24 Dec 2025 0:17 UTC
4 points
1 comment · 2 min read · LW link

An introduction to modular induction and some attempts to solve it

Thomas Kehrenberg · 23 Dec 2025 22:35 UTC
12 points
1 comment · 18 min read · LW link

Rules clarification for the Write like lsusr competition

lsusr · 23 Dec 2025 21:12 UTC
8 points
2 comments · 2 min read · LW link

Human Values

Maitreya · 23 Dec 2025 21:08 UTC
32 points
1 comment · 3 min read · LW link

Alignment Fellowship

rich_anon · 23 Dec 2025 20:29 UTC
58 points
14 comments · 1 min read · LW link

Iterative Matrix Steering: Forcing LLMs to “Rationalize” Hallucinations via Subspace Alignment

Artem Herasymenko · 23 Dec 2025 20:13 UTC
9 points
2 comments · 4 min read · LW link

Unpacking Geometric Rationality

MorgneticField · 23 Dec 2025 20:10 UTC
2 points
0 comments · 33 min read · LW link

Dreaming Vectors: Gradient-descended steering vectors from Activation Oracles and using them to Red-Team AOs

ceselder · 23 Dec 2025 19:28 UTC
22 points
4 comments · 12 min read · LW link

The Center for Reducing Suffering wants input from the suffering reduction community

Zoé · 23 Dec 2025 18:27 UTC
1 point
0 comments · 1 min read · LW link
(centerforreducingsuffering.org)

It’s Good To Create Happy People: A Comprehensive Case

Bentham's Bulldog · 23 Dec 2025 16:43 UTC
1 point
5 comments · 33 min read · LW link

I Died on DMT

Rebecca Dai · 23 Dec 2025 16:15 UTC
12 points
2 comments · 7 min read · LW link
(rebeccadai.substack.com)

Open Source is a Normal Term

jefftk · 23 Dec 2025 15:40 UTC
24 points
4 comments · 1 min read · LW link
(www.jefftk.com)

Don’t Trust Your Brain

silentbob · 23 Dec 2025 15:06 UTC
37 points
5 comments · 4 min read · LW link

The ML drug discovery startup trying really, really hard to not cheat

Abhishaike Mahajan · 23 Dec 2025 14:48 UTC
86 points
2 comments · 19 min read · LW link
(www.owlposting.com)

Keeping Up Against the Joneses: Balsa’s 2025 Fundraiser

Zvi · 23 Dec 2025 14:40 UTC
49 points
1 comment · 6 min read · LW link
(thezvi.wordpress.com)

Does 10^25 modulo 57 equal 59?

Jan Betley · 23 Dec 2025 13:00 UTC
33 points
3 comments · 2 min read · LW link

What Can Wittgenstein Teach Us About LLM Safety Research?

Manqing Liu · 23 Dec 2025 4:14 UTC
8 points
0 comments · 4 min read · LW link

Job Listing (CLOSED): CBAI Research Managers

23 Dec 2025 4:03 UTC
1 point
0 comments · 1 min read · LW link

Grounding Value Learning in Evolutionary Psychology: an Alternative Proposal to CEV

RogerDearnaley · 23 Dec 2025 3:40 UTC
40 points
25 comments · 20 min read · LW link

The Benefits of Meditation Come From Telling People That You Meditate

ThirdEyeJoe (cousin of CottonEyedJoe) · 23 Dec 2025 1:48 UTC
35 points
5 comments · 2 min read · LW link

The future of alignment if LLMs are a bubble

Stuart_Armstrong · 23 Dec 2025 0:08 UTC
47 points
13 comments · 5 min read · LW link

Unsupervised Agent Discovery

Gunnar_Zarncke · 22 Dec 2025 22:01 UTC
24 points
0 comments · 6 min read · LW link

Announcing Gemma Scope 2

22 Dec 2025 21:56 UTC
94 points
1 comment · 2 min read · LW link

[Advanced Intro to AI Alignment] 0. Overview and Foundations

Towards_Keeperhood · 22 Dec 2025 21:20 UTC
15 points
0 comments · 5 min read · LW link

$500 Write like lsusr competition

lsusr · 22 Dec 2025 20:09 UTC
29 points
43 comments · 3 min read · LW link

Appendices: Supervised finetuning on low-harm reward hacking generalises to high-harm reward hacking

22 Dec 2025 19:33 UTC
17 points
0 comments · 1 min read · LW link

Supervised finetuning on low-harm reward hacking generalises to high-harm reward hacking

22 Dec 2025 19:32 UTC
14 points
0 comments · 30 min read · LW link

Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance

ryan_greenblatt · 22 Dec 2025 17:21 UTC
152 points
18 comments · 7 min read · LW link