Liv Boeree—non-zero hero

James Stephen Brown · 13 Jul 2025 23:49 UTC

1 point

0 comments · 2 min read · LW link

(nonzerosum.games)

Moloch’s Demise—solving the original problem

James Stephen Brown · 13 Jul 2025 23:29 UTC

9 points

8 comments · 1 min read · LW link

(nonzerosum.games)

4 Ways Moloch is Ruining Your Life!—a listicle that shows Moloch is all around us, even in listicles

James Stephen Brown · 13 Jul 2025 23:27 UTC

5 points

0 comments · 2 min read · LW link

(nonzerosum.games)

Three Missing Cakes, or One Turbulent Critic?

Benquo · 13 Jul 2025 23:08 UTC

31 points

25 comments · 3 min read · LW link

O(1) reasoning in latent space: 1ms inference, 77% accuracy, no attention or tokens

Founder Order One · 13 Jul 2025 22:54 UTC

−11 points

9 comments · 2 min read · LW link

On actually taking expressions literally: tension as the key to meditation?

Chris_Leong · 13 Jul 2025 22:49 UTC

16 points

12 comments · 5 min read · LW link

[Question] Why is LW not about winning?

azergante · 13 Jul 2025 22:36 UTC

21 points

21 comments · 1 min read · LW link

LLMs are stuck in Plato’s cave

Sean Herrington · 13 Jul 2025 20:37 UTC

9 points

3 comments · 6 min read · LW link

Do LLMs know what they’re capable of? Why this matters for AI safety, and initial findings

Casey Barkan, Sid Black and Oliver Sourbut · 13 Jul 2025 19:54 UTC

53 points

5 comments · 18 min read · LW link

10x more training compute = 5x greater task length (kind of)

Expertium · 13 Jul 2025 18:40 UTC

49 points

8 comments · 2 min read · LW link

How Fast is Algorithmic Progress in AI Inference?

Hans Gundlach, jaysonl and mmertens · 13 Jul 2025 18:26 UTC

6 points

4 comments · 7 min read · LW link

xAI’s Grok 4 has no meaningful safety guardrails

eleventhsavi0r · 13 Jul 2025 18:22 UTC

84 points

15 comments · 6 min read · LW link

You can get LLMs to say almost anything you want

Kaj_Sotala · 13 Jul 2025 16:30 UTC

84 points

10 comments · 14 min read · LW link

The Fear

NicholasKees · 13 Jul 2025 16:20 UTC

29 points

1 comment · 5 min read · LW link

Efficiently Detecting Hidden Reasoning with a Small Predictor Model

RohanS, Vishnu Vardhan Sai Lanka, yaumeng and daria · 13 Jul 2025 16:04 UTC

34 points

3 comments · 16 min read · LW link

Mapping the off-target effects of every FDA-approved drug in existence

Abhishaike Mahajan · 13 Jul 2025 15:21 UTC

24 points

1 comment · 20 min read · LW link

(www.owlposting.com)

Memory Decoding Journal Club: Binary and analog variation of synapses between cortical pyramidal neurons

Devin Ward · 13 Jul 2025 4:00 UTC

2 points

0 comments · 1 min read · LW link

against that one rationalist mashal about japanese fifth-columnists

Fraser · 13 Jul 2025 1:42 UTC

81 points

6 comments · 3 min read · LW link

(frvser.com)

Win-Win-Win Ethics—Reconciling Consequentialism, Virtue Ethics and Deontology

James Stephen Brown · 13 Jul 2025 1:42 UTC

9 points

2 comments · 5 min read · LW link

(nonzerosum.games)

Why do LLMs hallucinate?

Nina Panickssery · 13 Jul 2025 0:09 UTC

24 points

1 comment · 5 min read · LW link

(ninapanickssery.substack.com)

Surprises and learnings from almost two months of Leo Panickssery

Nina Panickssery · 12 Jul 2025 23:33 UTC

216 points

12 comments · 6 min read · LW link

(ninapanickssery.substack.com)

Stop and check! The parable of the prince and the dog

Dumbledore's Army · 12 Jul 2025 17:45 UTC

36 points

0 comments · 2 min read · LW link

Take Precautionary Measures Against Superhuman AI Persuasion

Yitz · 12 Jul 2025 5:34 UTC

14 points

9 comments · 2 min read · LW link

Vitalik’s Response to AI 2027

Daniel Kokotajlo · 11 Jul 2025 21:43 UTC

123 points

53 comments · 12 min read · LW link

(vitalik.eth.limo)

the jackpot age

thiccythot · 11 Jul 2025 21:05 UTC

289 points

19 comments · 4 min read · LW link

OpenAI Model Differentiation 101

Zvi · 11 Jul 2025 20:30 UTC

31 points

5 comments · 11 min read · LW link

(thezvi.wordpress.com)

ACX Cape Town

teegs · 11 Jul 2025 18:46 UTC

1 point

0 comments · 1 min read · LW link

Adding noise to a sandbagging model can reveal its true capabilities

TheManxLoiner · 11 Jul 2025 16:56 UTC

18 points

1 comment · 6 min read · LW link

Reflections on AI Companionship and Rational Vulnerability (Or, how I almost fell in love with an anime Catgirl LLM).

Noah Weinberger · 11 Jul 2025 16:12 UTC

11 points

2 comments · 8 min read · LW link

The Perils of Optimizing Learned Reward Functions

Lukas Fluri · 11 Jul 2025 16:06 UTC

19 points

1 comment · 21 min read · LW link

Every Universe Thinks It’s the Realest One

Commander Zander · 11 Jul 2025 15:45 UTC

15 points

1 comment · 4 min read · LW link

On open-science research labs on discord, and getting more people.

Seon Gunness · 11 Jul 2025 12:05 UTC

10 points

2 comments · 34 min read · LW link

Memory Decoding Journal Club: Binary and analog variation of synapses between cortical pyramidal neurons

Devin Ward · 11 Jul 2025 4:47 UTC

1 point

0 comments · 1 min read · LW link

Deconfusing ‘AI’ and ‘evolution’

Remmelt · 11 Jul 2025 1:44 UTC

12 points

11 comments · 28 min read · LW link

So You Think You’ve Awoken ChatGPT

JustisMills · 11 Jul 2025 1:01 UTC

323 points

88 comments · 9 min read · LW link

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

habryka · 11 Jul 2025 0:23 UTC

97 points

43 comments · 6 min read · LW link

(metr.org)

On thinking about AI risks concretely

zeshen · 11 Jul 2025 0:04 UTC

9 points

4 comments · 4 min read · LW link

Metacognition and Self-Modeling in LLMs

Christopher Ackerman · 10 Jul 2025 21:25 UTC

19 points

2 comments · 16 min read · LW link

My take on AI Alignment: Corporate misalignment and DAOs

act65 · 10 Jul 2025 20:33 UTC

7 points

3 comments · 1 min read · LW link

what makes Claude 3 Opus misaligned

janus · 10 Jul 2025 20:06 UTC

116 points

12 comments · 5 min read · LW link

The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything

Linch · 10 Jul 2025 19:12 UTC

10 points

10 comments · 1 min read · LW link

(linch.substack.com)

Lessons from the Iraq War for AI policy

Buck · 10 Jul 2025 18:52 UTC

200 points

25 comments · 4 min read · LW link

Linkpost: Redwood Research reading list

Julian Stastny · 10 Jul 2025 18:39 UTC

50 points

0 comments · 1 min read · LW link

(redwoodresearch.substack.com)

Generalized Hangriness: A Standard Rationalist Stance Toward Emotions

johnswentworth · 10 Jul 2025 18:22 UTC

366 points

70 comments · 7 min read · LW link

The bitter lesson of misuse detection

Hadrien and Charbel-Raphaël · 10 Jul 2025 14:50 UTC

37 points

6 comments · 7 min read · LW link

Evaluating and monitoring for AI scheming

Vika, Scott Emmons, Erik Jenner, Mary Phuong, Lewis Ho and Rohin Shah · 10 Jul 2025 14:24 UTC

60 points

10 comments · 5 min read · LW link

(deepmindsafetyresearch.medium.com)

White Box Control at UK AISI—Update on Sandbagging Investigations

Joseph Bloom, Jordan Taylor, Connor Kissane, Sid Black, merizian, alexdzm, jacoba, Ben Millwood and Alan Cooney · 10 Jul 2025 13:37 UTC

80 points

10 comments · 18 min read · LW link

AI #124: Grokless Interlude

Zvi · 10 Jul 2025 12:40 UTC

28 points

5 comments · 43 min read · LW link

(thezvi.wordpress.com)

How many OOMs of compute span the human range?

tickybob · 10 Jul 2025 11:51 UTC

14 points

6 comments · 1 min read · LW link

The anti-Kardashev scale is a better measure of civilizational power

RussellThor · 10 Jul 2025 10:02 UTC

5 points

2 comments · 3 min read · LW link