Vitalik’s Response to AI 2027

Daniel Kokotajlo

11 Jul 2025 21:43 UTC

123 points

53 comments · 12 min read · LW link

(vitalik.eth.limo)

the jackpot age

thiccythot

11 Jul 2025 21:05 UTC

289 points

19 comments · 4 min read · LW link

OpenAI Model Differentiation 101

Zvi

11 Jul 2025 20:30 UTC

31 points

5 comments · 11 min read · LW link

(thezvi.wordpress.com)

ACX Cape Town

teegs

11 Jul 2025 18:46 UTC

1 point

0 comments · 1 min read · LW link

Adding noise to a sandbagging model can reveal its true capabilities

TheManxLoiner

11 Jul 2025 16:56 UTC

18 points

1 comment · 6 min read · LW link

Reflections on AI Companionship and Rational Vulnerability (Or, how I almost fell in love with an anime Catgirl LLM).

Noah Weinberger

11 Jul 2025 16:12 UTC

11 points

2 comments · 8 min read · LW link

The Perils of Optimizing Learned Reward Functions

Lukas Fluri

11 Jul 2025 16:06 UTC

19 points

1 comment · 21 min read · LW link

Every Universe Thinks It’s the Realest One

Commander Zander

11 Jul 2025 15:45 UTC

15 points

1 comment · 4 min read · LW link

On open-science research labs on discord, and getting more people.

Seon Gunness

11 Jul 2025 12:05 UTC

10 points

2 comments · 34 min read · LW link

Memory Decoding Journal Club: Binary and analog variation of synapses between cortical pyramidal neurons

Devin Ward

11 Jul 2025 4:47 UTC

1 point

0 comments · 1 min read · LW link

Deconfusing ‘AI’ and ‘evolution’

Remmelt

11 Jul 2025 1:44 UTC

12 points

11 comments · 28 min read · LW link

So You Think You’ve Awoken ChatGPT

JustisMills

11 Jul 2025 1:01 UTC

323 points

88 comments · 9 min read · LW link

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

habryka

11 Jul 2025 0:23 UTC

97 points

43 comments · 6 min read · LW link

(metr.org)

On thinking about AI risks concretely

zeshen

11 Jul 2025 0:04 UTC

9 points

4 comments · 4 min read · LW link

Metacognition and Self-Modeling in LLMs

Christopher Ackerman

10 Jul 2025 21:25 UTC

19 points

2 comments · 16 min read · LW link

My take on AI Alignment: Corporate misalignment and DAOs

act65

10 Jul 2025 20:33 UTC

7 points

3 comments · 1 min read · LW link

what makes Claude 3 Opus misaligned

janus

10 Jul 2025 20:06 UTC

116 points

12 comments · 5 min read · LW link

The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything

Linch

10 Jul 2025 19:12 UTC

10 points

10 comments · 1 min read · LW link

(linch.substack.com)

Lessons from the Iraq War for AI policy

Buck

10 Jul 2025 18:52 UTC

200 points

25 comments · 4 min read · LW link

Linkpost: Redwood Research reading list

Julian Stastny

10 Jul 2025 18:39 UTC

50 points

0 comments · 1 min read · LW link

(redwoodresearch.substack.com)

Generalized Hangriness: A Standard Rationalist Stance Toward Emotions

johnswentworth

10 Jul 2025 18:22 UTC

366 points

70 comments · 7 min read · LW link

The bitter lesson of misuse detection

Hadrien and Charbel-Raphaël

10 Jul 2025 14:50 UTC

37 points

6 comments · 7 min read · LW link

Evaluating and monitoring for AI scheming

Vika, Scott Emmons, Erik Jenner, Mary Phuong, Lewis Ho and Rohin Shah

10 Jul 2025 14:24 UTC

60 points

10 comments · 5 min read · LW link

(deepmindsafetyresearch.medium.com)

White Box Control at UK AISI—Update on Sandbagging Investigations

Joseph Bloom, Jordan Taylor, Connor Kissane, Sid Black, merizian, alexdzm, jacoba, Ben Millwood and Alan Cooney

10 Jul 2025 13:37 UTC

80 points

10 comments · 18 min read · LW link

AI #124: Grokless Interlude

Zvi

10 Jul 2025 12:40 UTC

28 points

5 comments · 43 min read · LW link

(thezvi.wordpress.com)

How many OOMs of compute span the human range?

tickybob

10 Jul 2025 11:51 UTC

14 points

6 comments · 1 min read · LW link

The anti-Kardashev scale is a better measure of civilizational power

RussellThor

10 Jul 2025 10:02 UTC

5 points

2 comments · 3 min read · LW link

If Anyone Builds It, Everyone Dies: A Conversation with Nate Soares and Tim Urban

yams and alexvermeer

10 Jul 2025 8:00 UTC

23 points

2 comments · 1 min read · LW link

80,000 Hours is producing AI in Context — a new YouTube channel. Our first video, about the AI 2027 scenario, is up!

chanamessinger

9 Jul 2025 23:58 UTC

54 points

3 comments · 3 min read · LW link

Asking for a Friend (AI Research Protocols)

The Dao of Bayes

9 Jul 2025 23:41 UTC

11 points

33 comments · 2 min read · LW link

Demons, Simulators and Gremlins

J Bostock

9 Jul 2025 20:22 UTC

10 points

1 comment · 3 min read · LW link

Investigating Priming in Alignment Faking

Wayne

9 Jul 2025 17:08 UTC

13 points

0 comments · 4 min read · LW link

No, Grok, No

Zvi

9 Jul 2025 15:10 UTC

92 points

3 comments · 17 min read · LW link

(thezvi.wordpress.com)

The Asteroid Setup That Demands an Explanation

David Björling

9 Jul 2025 14:55 UTC

−2 points

32 comments · 5 min read · LW link

What’s worse, spies or schemers?

Buck and Julian Stastny

9 Jul 2025 14:37 UTC

51 points

2 comments · 5 min read · LW link

Anthropic reasoning intro (notes on Bostrom)

jchan

9 Jul 2025 14:24 UTC

7 points

0 comments · 7 min read · LW link

No, We’re Not Getting Meaningful Oversight of AI

Davidmanheim

9 Jul 2025 11:10 UTC

48 points

4 comments · 1 min read · LW link

(arxiv.org)

Hybrid model reveals people act less rationally in complex games, more predictably in simple ones

Gunnar_Zarncke

9 Jul 2025 10:15 UTC

9 points

0 comments · 1 min read · LW link

(arxiv.org)

Subway Particle Levels Aren’t That High

jefftk

9 Jul 2025 2:30 UTC

82 points

5 comments · 1 min read · LW link

(www.jefftk.com)

TT Self Study Journal # 2

TristanTrim

9 Jul 2025 2:16 UTC

3 points

0 comments · 7 min read · LW link

AI Agent Benchmarks Are Broken

Sasha Cui

8 Jul 2025 22:11 UTC

10 points

0 comments · 1 min read · LW link

(ddkang.substack.com)

Why Do Some Language Models Fake Alignment While Others Don’t?

abhayesian, John Hughes, Alex Mallen, Jozdien, janus and Fabien Roger

8 Jul 2025 21:49 UTC

158 points

14 comments · 5 min read · LW link

(arxiv.org)

A Medium Scenario

Chapin Lenthall-Cleary

8 Jul 2025 20:09 UTC

19 points

13 comments · 20 min read · LW link

An Opinionated Guide to Using Anki Correctly

Luise Woehlke

8 Jul 2025 20:01 UTC

162 points

60 comments · 27 min read · LW link

Lenses, Metaphors, and Meaning

WillPetillo, Sean Herrington, Spencer Ames, Adebayo Mubarak and Can Narin

8 Jul 2025 19:46 UTC

8 points

0 comments · 4 min read · LW link

Applying right-wing frames to AGI (geo)politics

Richard_Ngo

8 Jul 2025 18:03 UTC

67 points

25 comments · 3 min read · LW link

(x.com)

The Unjournal’s “Pivotal Questions” project

david reinstein

8 Jul 2025 15:55 UTC

6 points

1 comment · 1 min read · LW link

(forum.effectivealtruism.org)

Balsa Update: Springtime in DC

Zvi

8 Jul 2025 15:00 UTC

61 points

6 comments · 10 min read · LW link

(thezvi.wordpress.com)

MIT FutureTech are hiring a Postdoctoral Associate to work on AI Performance and Safety

peterslattery

8 Jul 2025 14:02 UTC

3 points

0 comments · 4 min read · LW link

Energy-Based Transformers are Scalable Learners and Thinkers

Matrice Jacobine

8 Jul 2025 13:44 UTC

7 points

5 comments · 1 min read · LW link

(energy-based-transformers.github.io)