Is Escalation Inevitable?

Lennart Wijers

31 May 2025 22:10 UTC

5 points

0 comments

3 min read

LW link

Policy Entropy, Learning, and Alignment (Or Maybe Your LLM Needs Therapy)

sdeture

31 May 2025 22:09 UTC

15 points

6 comments

8 min read

LW link

The Unseen Hand: AI’s Problem Preemption and the True Future of Labor

Ben Kassan

31 May 2025 22:04 UTC

8 points

0 comments

20 min read

LW link

The 80/20 playbook for mitigating AI scheming in 2025

Charbel-Raphaël

31 May 2025 21:17 UTC

40 points

2 comments

4 min read

LW link

Collective Action for AI Safety (June 4, NYC)

Jordan Braunstein

31 May 2025 20:27 UTC

1 point

0 comments

1 min read

LW link

The best approaches for mitigating “the intelligence curse” (or gradual disempowerment); my quick guesses at the best object-level interventions

ryan_greenblatt

31 May 2025 18:20 UTC

78 points

19 comments

5 min read

LW link

Would It Be Better to Dispense with Good and Evil?

arusarda

31 May 2025 16:40 UTC

−2 points

10 comments

6 min read

LW link

How Epistemic Collapse Looks from Inside

Martin Sustrik

31 May 2025 16:30 UTC

9 points

11 comments

1 min read

LW link

(250bpm.substack.com)

When will AI automate all mental work, and how fast?

aggliu and Writer

31 May 2025 16:18 UTC

10 points

0 comments

7 min read

LW link

(youtu.be)

Progress links and short notes, 2025-05-31: RPI fellowship deadline tomorrow, Edge Esmeralda next week, and more

jasoncrawford

31 May 2025 15:20 UTC

11 points

0 comments

7 min read

LW link

(newsletter.rootsofprogress.org)

House Party Dances

jefftk

31 May 2025 15:20 UTC

13 points

1 comment

1 min read

LW link

(www.jefftk.com)

Free Will, Like Probability, is About Local Knowledge

Rob Lucas

31 May 2025 14:19 UTC

4 points

6 comments

16 min read

LW link

(open.substack.com)

The (Unofficial) Rationality: A-Z Anki Deck

japancolorado

31 May 2025 7:01 UTC

30 points

8 comments

1 min read

LW link

Zochi Publishes A* Paper

mannatvjain

31 May 2025 0:00 UTC

12 points

0 comments

4 min read

LW link

(www.intology.ai)

Memory Decoding Journal Club: Structure and function of the hippocampal CA3 module

Devin Ward

30 May 2025 23:59 UTC

1 point

0 comments

1 min read

LW link

Diabetes is Caused by Oxidative Stress

Lorec

30 May 2025 21:03 UTC

11 points

11 comments

8 min read

LW link

Too Many Metaphors: A Case for Plain Talk in AI Safety

David Harket

30 May 2025 19:29 UTC

1 point

8 comments

2 min read

LW link

[Question] Could we go another route with computers?

Roman Malov

30 May 2025 19:04 UTC

13 points

5 comments

1 min read

LW link

Aristotelian Optimization: The Economics of Cameralism

Edward Könings

30 May 2025 19:02 UTC

−2 points

1 comment

13 min read

LW link

I replicated the Anthropic alignment faking experiment on other models, and they didn’t fake alignment

Alex Kedryk and Igor Ivanov

30 May 2025 18:57 UTC

35 points

0 comments

2 min read

LW link

‘GiveWell for AI Safety’: Lessons learned in a week

Lydia Nottingham

30 May 2025 18:38 UTC

41 points

0 comments

6 min read

LW link

Idea Generation and Sifting

belos

30 May 2025 16:59 UTC

1 point

0 comments

20 min read

LW link

(bestofagreatlot.substack.com)

50 Ideas for Life I Repeatedly Share

DMMF

30 May 2025 16:57 UTC

26 points

9 comments

15 min read

LW link

(notnottalmud.substack.com)

Virtues related to honesty

Orioth

30 May 2025 14:11 UTC

11 points

23 comments

2 min read

LW link

AI 2027 - Rogue Replication Timeline

Alvin Ånestrand

30 May 2025 13:46 UTC

41 points

3 comments

7 min read

LW link

(forecastingaifutures.substack.com)

Letting Kids Be Kids

Zvi

30 May 2025 10:50 UTC

86 points

15 comments

20 min read

LW link

(thezvi.wordpress.com)

The Geometry of LLM Logits (an analytical outer bound)

Rohan Ganapavarapu

30 May 2025 1:21 UTC

5 points

0 comments

2 min read

LW link

(rohan.ga)

Memory Decoding Journal Club: Structure and function of the hippocampal CA3 module

Devin Ward

30 May 2025 1:08 UTC

1 point

0 comments

1 min read

LW link

Experimental CFAR Mini-Workshop @ Arbor Summer Camp

Davis_Kingsley

30 May 2025 0:23 UTC

12 points

0 comments

2 min read

LW link

CFAR is running an experimental mini-workshop (June 2-6, Berkeley CA)!

Davis_Kingsley

29 May 2025 22:02 UTC

65 points

2 comments

2 min read

LW link

Orphaned Policies (Post 5 of 7 on AI Governance)

Mass_Driver

29 May 2025 21:42 UTC

72 points

5 comments

16 min read

LW link

Gradual Disempowerment: Concrete Research Projects

Raymond Douglas

29 May 2025 18:55 UTC

104 points

10 comments

10 min read

LW link

Do you even have a system prompt? (PSA / repo)

Croissanthology

29 May 2025 18:49 UTC

111 points

78 comments

2 min read

LW link

Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims

shash42

29 May 2025 18:40 UTC

66 points

7 comments

1 min read

LW link

(safe-lip-9a8.notion.site)

Dimensionalization

Jordan Rubin

29 May 2025 18:18 UTC

7 points

6 comments

4 min read

LW link

(jordanmrubin.substack.com)

Distilled Human Judgment: Reifying AI Alignment

Devansh Mehta

29 May 2025 18:06 UTC

2 points

0 comments

4 min read

LW link

Summer AI Safety Intro Fellowships in Boston and Online (Policy & Technical) – Apply by June 6!

jandrade112

29 May 2025 18:02 UTC

1 point

0 comments

1 min read

LW link

Digital sentience funding opportunities: Support for applied work and research

aog and zdgroff

29 May 2025 15:22 UTC

21 points

0 comments

4 min read

LW link

When to Be Nice vs Kind

Declan Molony

29 May 2025 15:06 UTC

25 points

2 comments

1 min read

LW link

AI #118: Claude Ascendant

Zvi

29 May 2025 14:10 UTC

45 points

8 comments

57 min read

LW link

(thezvi.wordpress.com)

Social Capital—Does it Matter?

Momcilo

29 May 2025 12:26 UTC

−9 points

1 comment

6 min read

LW link

Alignment Crisis: Genocide Denial

_mp_

29 May 2025 12:04 UTC

−11 points

5 comments

4 min read

LW link

Cross-posting to Substack

jefftk

29 May 2025 11:10 UTC

12 points

0 comments

1 min read

LW link

(www.jefftk.com)

Reflections on AI Wisdom, plus announcing Wise AI Wednesdays

Chris_Leong

29 May 2025 7:13 UTC

18 points

0 comments

3 min read

LW link

[Question] What was so great about Move 37?

Caleb Biddulph

29 May 2025 7:00 UTC

24 points

4 comments

3 min read

LW link

Procedural vs. Causal Understanding

Caleb Biddulph

29 May 2025 7:00 UTC

7 points

2 comments

2 min read

LW link

Security Mindset: Hacking Pinball High Scores

gwern

29 May 2025 3:39 UTC

29 points

4 comments

1 min read

LW link

(gwern.net)

Quick Minimal Playhouse

jefftk

29 May 2025 2:10 UTC

17 points

1 comment

1 min read

LW link

(www.jefftk.com)

Cognitive Exhaustion and Engineered Trust: Lessons from My Gym

Priyanka Bharadwaj

29 May 2025 1:21 UTC

14 points

3 comments

3 min read

LW link

Truth or Dare

Duncan Sabien (Inactive)

29 May 2025 0:07 UTC

264 points

61 comments

69 min read

LW link