[Question] To what extent is AI safety work trying to get AI to reliably and safely do what the user asks vs. do what is best in some ultimate sense?

Jordan Arel 23 May 2025 21:05 UTC

14 points

3 comments 1 min read LW link

Notes on Claude 4 System Card

Dentosal 23 May 2025 15:23 UTC

19 points

2 comments 6 min read LW link

What is emptiness?

Vadim Golub 23 May 2025 12:06 UTC

−4 points

11 comments 9 min read LW link

Idiohobbies

dkl9 23 May 2025 6:38 UTC

11 points

2 comments 1 min read LW link

(dkl9.net)

Qualitative Fit Testing

jefftk 23 May 2025 2:50 UTC

10 points

0 comments 2 min read LW link

(www.jefftk.com)

Anthropic is Quietly Backpedalling on its Safety Commitments

garrison 23 May 2025 2:26 UTC

86 points

7 comments 5 min read LW link

(www.obsolete.pub)

Learning (more) from horse employment history

Tim H 23 May 2025 2:11 UTC

68 points

13 comments 5 min read LW link

Schizobench: Documenting Magical-Thinking Behavior in Claude 4 Opus

viemccoy 23 May 2025 1:31 UTC

23 points

0 comments 1 min read LW link

(metanomicon.ink)

Post-Manifest coworking at Mox

Rachel Shu and Austin Chen 23 May 2025 0:20 UTC

4 points

1 comment 1 min read LW link

Claude 4, Opportunistic Blackmail, and “Pleas”

Stephen Martin 22 May 2025 19:59 UTC

30 points

2 comments 2 min read LW link

Problems in AI Alignment: A Scale Model

Mickey Muldoon 22 May 2025 19:22 UTC

−1 points

3 comments 2 min read LW link

(muldoon.cloud)

Art Is Art: AI Is the Next Erotica

Charlie Edwards 22 May 2025 18:04 UTC

0 points

1 comment 14 min read LW link

Reward button alignment

Steven Byrnes 22 May 2025 17:36 UTC

53 points

15 comments 12 min read LW link

We’re Not Advertising Enough (Post 3 of 7 on AI Governance)

Mass_Driver 22 May 2025 17:05 UTC

117 points

10 comments 28 min read LW link

Claude 4

Zach Stein-Perlman 22 May 2025 17:00 UTC

71 points

24 comments 1 min read LW link

(www.anthropic.com)

Video and transcript of talk on AI welfare

Joe Carlsmith 22 May 2025 16:15 UTC

24 points

1 comment 28 min read LW link

(joecarlsmith.substack.com)

What we can learn from afterlife myths

jchan 22 May 2025 15:49 UTC

5 points

0 comments 15 min read LW link

Policy recommendations regarding reproductive technology

TsviBT 22 May 2025 14:49 UTC

76 points

2 comments 3 min read LW link

AI #117: OpenAI Buys Device Maker IO

Zvi 22 May 2025 13:40 UTC

37 points

9 comments 62 min read LW link

(thezvi.wordpress.com)

Does BPC-157 work for healing and tissue repair?

ChristianKl 22 May 2025 13:18 UTC

18 points

0 comments 5 min read LW link

(somaticsignals.jollyjoyjourney.com)

[Question] How load-bearing is KL divergence from a known-good base model in modern RL?

faul_sname 22 May 2025 12:08 UTC

22 points

3 comments 4 min read LW link

Christianity vs. Tantra vs. Sex – one spiritual path?

pchvykov 22 May 2025 11:15 UTC

−2 points

0 comments 24 min read LW link

Mirror Organisms Are Not Immune to Predation

Matthias Dellago 22 May 2025 11:10 UTC

27 points

5 comments 1 min read LW link

How 2025 AI Forecasts Fared So Far

Adam B, romeo and elifland 22 May 2025 9:42 UTC

13 points

3 comments 8 min read LW link

(theaidigest.org)

Contain and verify: The endgame of US-China AI competition

sjadler 22 May 2025 8:13 UTC

6 points

7 comments 2 min read LW link

(open.substack.com)

Laugencroissant

Martin Sustrik 22 May 2025 6:30 UTC

13 points

0 comments 3 min read LW link

(250bpm.substack.com)

Google I/O Day

Zvi 21 May 2025 22:00 UTC

49 points

0 comments 20 min read LW link

(thezvi.wordpress.com)

Podcast: How not to waste a billion dollars (on your clinical trial), with Meri Beckwith on Development & Research

rossry 21 May 2025 21:27 UTC

25 points

0 comments 3 min read LW link

(developmentandresearch.bio)

Podcast: From molecule to medicine, with Ross Rheingans-Yoo on Complex Systems

rossry 21 May 2025 21:08 UTC

15 points

0 comments 5 min read LW link

(www.complexsystemspodcast.com)

The stakes of AI moral status

Joe Carlsmith 21 May 2025 18:20 UTC

79 points

65 comments 14 min read LW link

(joecarlsmith.substack.com)

[Question] Which AI Safety techniques will be ineffective against diffusion models?

Allen Thomas 21 May 2025 18:13 UTC

6 points

1 comment 1 min read LW link

Rooting for Moments, Not Jerseys. Another Approach to Enjoying Sports

Ahmed Elsayyad 21 May 2025 18:11 UTC

1 point

0 comments 3 min read LW link

Unexploitable search: blocking malicious use of free parameters

Jacob Pfau and Geoffrey Irving 21 May 2025 17:23 UTC

40 points

16 comments 6 min read LW link

The Real AI Safety Risk Is a Conceptual Exploit: Anthropomorphism

Anthony Fox 21 May 2025 16:29 UTC

−2 points

0 comments 2 min read LW link

You Can’t Skip Exploration: Why understanding experimentation and taste is key to understanding AI

Oliver Sourbut 21 May 2025 16:08 UTC

20 points

0 comments 11 min read LW link

(www.oliversourbut.net)

The Problem and Opportunity of Scale

belos 21 May 2025 15:52 UTC

1 point

0 comments 5 min read LW link

(bestofagreatlot.substack.com)

Sleep need reduction therapies

harsimony 21 May 2025 15:22 UTC

87 points

19 comments 10 min read LW link

(splittinginfinity.substack.com)

Parental Guidance: Framing Superintelligence

edgecase64 21 May 2025 15:01 UTC

10 points

0 comments 3 min read LW link

Why Aren’t Rationalists Winning (Again)

k64 21 May 2025 14:46 UTC

6 points

25 comments 5 min read LW link

Can We Naturalize Moral Epistemology?

tylermjohn 21 May 2025 14:25 UTC

52 points

22 comments 6 min read LW link

Units Have More Depth Than I Thought

Morpheus 21 May 2025 13:51 UTC

35 points

6 comments 1 min read LW link

Humans are Insecure Password Generators

Isaac King 21 May 2025 5:58 UTC

16 points

0 comments 5 min read LW link

[Crosspost] Anthropic Shadow Geopolitics

akarlin 21 May 2025 4:50 UTC

9 points

5 comments 18 min read LW link

The Need for Political Advertising (Post 2 of 7 on AI Governance)

Mass_Driver 21 May 2025 0:44 UTC

59 points

2 comments 13 min read LW link

Notes from Dopamine Detoxing

Alice Blair 20 May 2025 23:43 UTC

15 points

4 comments 9 min read LW link

Revisiting the ideas for non-neuralese architectures

StanislavKrym 20 May 2025 23:35 UTC

2 points

0 comments 1 min read LW link

Gemini Diffusion: watch this space

Yair Halberstadt 20 May 2025 19:29 UTC

197 points

39 comments 1 min read LW link

(deepmind.google)

A Sketch of Belocracy: a new system of governance

belos 20 May 2025 18:30 UTC

5 points

0 comments 8 min read LW link

(bestofagreatlot.substack.com)

The Codex of Ultimate Vibing

Zvi 20 May 2025 18:30 UTC

45 points

2 comments 11 min read LW link

(thezvi.wordpress.com)

Outcomes of the Geopolitical Singularity

Nikola Jurkovic 20 May 2025 18:09 UTC

62 points

5 comments 5 min read LW link