Rationalists hate & sabotage Strategy without having any awareness of it.

Oxidize · 26 May 2025 22:09 UTC

−26 points

8 comments · 7 min read · LW link

Personal Ruminations on AI’s Missing Variable Problem

Thehumanproject.ai · 26 May 2025 21:11 UTC

1 point

0 comments · 3 min read · LW link

Poetic Methods II: Rhyme as a Focusing Device

adamShimi · 26 May 2025 18:29 UTC

24 points

1 comment · 17 min read · LW link

(formethods.substack.com)

Is Building Good Note-Taking Software an AGI-Complete Problem?

Thane Ruthenis · 26 May 2025 18:26 UTC

28 points

14 comments · 7 min read · LW link

Principal-Agent Problems and the Structure of Governance

belos · 26 May 2025 18:23 UTC

1 point

0 comments · 8 min read · LW link

(bestofagreatlot.substack.com)

[Question] Does the Universal Geometry of Embeddings paper have big implications for interpretability?

Evan R. Murphy · 26 May 2025 18:20 UTC

43 points

6 comments · 1 min read · LW link

Socratic Persuasion: Giving Opinionated Yet Truth-Seeking Advice

Neel Nanda · 26 May 2025 17:38 UTC

61 points

14 comments · 21 min read · LW link

(www.neelnanda.io)

[Beneath Psychology] Case study on chronic pain: First insights, and the remaining challenge

jimmy · 26 May 2025 17:29 UTC

17 points

1 comment · 11 min read · LW link

An observation on self-play

jonrxu · 26 May 2025 17:22 UTC

15 points

1 comment · 3 min read · LW link

New website analyzing AI companies’ model evals

Zach Stein-Perlman · 26 May 2025 16:00 UTC

58 points

0 comments · 4 min read · LW link

New scorecard evaluating AI companies on safety

Zach Stein-Perlman · 26 May 2025 16:00 UTC

72 points

8 comments · 1 min read · LW link

[Question] Asking for AI Safety Career Advice

infinibot27 · 26 May 2025 15:26 UTC

3 points

1 comment · 1 min read · LW link

Nerve Blisters: A Stoic Response

Jonathan Moregård · 26 May 2025 15:07 UTC

8 points

2 comments · 1 min read · LW link

(honestliving.substack.com)

On ‘On Caring’

atharva · 26 May 2025 13:39 UTC

9 points

4 comments · 3 min read · LW link

Claude 4 You: The Quest for Mundane Utility

Zvi · 26 May 2025 13:01 UTC

36 points

0 comments · 17 min read · LW link

(thezvi.wordpress.com)

Formalizing Embeddedness Failures in Universal Artificial Intelligence

Cole Wyeth · 26 May 2025 12:36 UTC

39 points

0 comments · 1 min read · LW link

(arxiv.org)

Techies Wanted: How STEM Backgrounds Can Advance Safe AI Policy

Daniel_Eth · 26 May 2025 11:29 UTC

16 points

0 comments · 29 min read · LW link

D&D.Sci: The Choosing Ones [Answerkey and Ruleset]

abstractapplic · 26 May 2025 9:43 UTC

19 points

2 comments · 3 min read · LW link

The Sundog Alignment Theorem: A Proposal for Embodied Alignment via Indirect Inference

Malice · 26 May 2025 7:26 UTC

−9 points

0 comments · 3 min read · LW link

Superposition Without Compression: Why Entangled Representations Are the Default

James Butterworth · 26 May 2025 5:26 UTC

4 points

2 comments · 1 min read · LW link

(drive.google.com)

Long-form data bottlenecks might stall AI progress for years

Michelle_Ma · 26 May 2025 4:36 UTC

21 points

0 comments · 13 min read · LW link

Example of Splitting a PR

jefftk · 26 May 2025 2:20 UTC

28 points

0 comments · 2 min read · LW link

(www.jefftk.com)

How I’m telling my friends about AI Safety

k64 · 25 May 2025 22:43 UTC

1 point

7 comments · 7 min read · LW link

Good Writing

Biff Wiff · 25 May 2025 21:52 UTC

11 points

0 comments · 2 min read · LW link

(paulgraham.com)

Consider buying voting shares

Hruss · 25 May 2025 18:01 UTC

2 points

3 comments · 1 min read · LW link

[Question] Can you donate to AI advocacy?

k64 · 25 May 2025 17:54 UTC

27 points

5 comments · 1 min read · LW link

Rant: the extreme wastefulness of high rent prices

Knight Lee · 25 May 2025 17:04 UTC

−2 points

0 comments · 2 min read · LW link

Beyond Democracy: A System Where Citizens Vote with Their Taxes

Brendan Golledge · 25 May 2025 17:00 UTC

−1 points

3 comments · 7 min read · LW link

Claude 4 You: Safety and Alignment

Zvi · 25 May 2025 14:00 UTC

86 points

8 comments · 63 min read · LW link

(thezvi.wordpress.com)

Alignment Proposal: Adversarially Robust Augmentation and Distillation

Cole Wyeth and abramdemski · 25 May 2025 12:58 UTC

56 points

53 comments · 13 min read · LW link

An open job application to AI labs

Hruss · 25 May 2025 12:57 UTC

17 points

0 comments · 1 min read · LW link

Meditations on Doge

Martin Sustrik · 25 May 2025 12:00 UTC

132 points

44 comments · 9 min read · LW link

(250bpm.substack.com)

Case Studies in Simulators and Agents

WillPetillo, Sean Herrington, Spencer Ames, Adebayo Mubarak and Can Narin · 25 May 2025 5:40 UTC

15 points

8 comments · 6 min read · LW link

On safety of being a moral patient of ASI

Yaroslav Granowski · 24 May 2025 21:24 UTC

3 points

8 comments · 1 min read · LW link

We Need a Baseline for LLM-Aided Experiments

J Bostock · 24 May 2025 20:52 UTC

11 points

1 comment · 1 min read · LW link

Lie Detectors. Technical solutions to the cooperation problem.

Window Frame · 24 May 2025 20:05 UTC

7 points

0 comments · 10 min read · LW link

It’s hard to make scheming evals look realistic for LLMs

Igor Ivanov and Danil Kadochnikov · 24 May 2025 19:17 UTC

153 points

29 comments · 5 min read · LW link

Launch of the New Horizons Podcast

Nezir Alic · 24 May 2025 17:50 UTC

5 points

0 comments · 1 min read · LW link

Priming effects are fake, but framing effects are real

Matrice Jacobine · 24 May 2025 10:54 UTC

33 points

0 comments · 1 min read · LW link

(xphi.net)

The Cosmic Lottery

James Stephen Brown · 24 May 2025 4:05 UTC

5 points

0 comments · 5 min read · LW link

(nonzerosum.games)

Some Considerations on Prediction Markets

belos · 24 May 2025 3:24 UTC

2 points

1 comment · 9 min read · LW link

The Paradox of Low Fertility

Zero Contradictions · 24 May 2025 0:59 UTC

−9 points

6 comments · 1 min read · LW link

(expandingrationality.substack.com)

That’s Not How Epigenetic Modifications Work

johnswentworth · 24 May 2025 0:15 UTC

68 points

12 comments · 2 min read · LW link

[Question] To what extent is AI safety work trying to get AI to reliably and safely do what the user asks vs. do what is best in some ultimate sense?

Jordan Arel · 23 May 2025 21:05 UTC

14 points

3 comments · 1 min read · LW link

Notes on Claude 4 System Card

Dentosal · 23 May 2025 15:23 UTC

19 points

2 comments · 6 min read · LW link

What is emptiness?

Vadim Golub · 23 May 2025 12:06 UTC

−4 points

11 comments · 9 min read · LW link

Idiohobbies

dkl9 · 23 May 2025 6:38 UTC

11 points

2 comments · 1 min read · LW link

(dkl9.net)

Qualitative Fit Testing

jefftk · 23 May 2025 2:50 UTC

10 points

0 comments · 2 min read · LW link

(www.jefftk.com)

Anthropic is Quietly Backpedalling on its Safety Commitments

garrison · 23 May 2025 2:26 UTC

86 points

7 comments · 5 min read · LW link

(www.obsolete.pub)

Learning (more) from horse employment history

Tim H · 23 May 2025 2:11 UTC

68 points

13 comments · 5 min read · LW link