Emma Baker on ADHD

koratkar

14 May 2026 23:29 UTC

8 points

2 comments · 3 min read · LW link

(emma00baker.substack.com)

Designing AI factual claims for “easy verification”

Raemon

14 May 2026 23:23 UTC

33 points

17 comments · 2 min read · LW link

Automated Alignment is Harder Than You Think

Aleksandr Bowkis, Marie_DB, Jacob Pfau and Geoffrey Irving

14 May 2026 22:01 UTC

143 points

7 comments · 3 min read · LW link

(arxiv.org)

2B scoring model flags out-of-domain misalignment, suggesting specialist judges have potential for audits

burnssa

14 May 2026 20:00 UTC

8 points

0 comments · 6 min read · LW link

The safe-to-dangerous shift is a fundamental problem for eval realism; but also for measuring awareness

Charlie Griffin and Patrick Leask

14 May 2026 17:05 UTC

59 points

3 comments · 3 min read · LW link

AI #168: Not Leading the Future

Zvi

14 May 2026 14:10 UTC

38 points

2 comments · 45 min read · LW link

(thezvi.wordpress.com)

Why Ensuring Flourishing Is Not About Alignment

ofpetro

14 May 2026 6:24 UTC

5 points

6 comments · 35 min read · LW link

Intervening on Sparse, Anchored Concepts

Sandy Fraser

14 May 2026 4:35 UTC

24 points

3 comments · 10 min read · LW link

Algorithmic Perfection

zw5

14 May 2026 3:44 UTC

5 points

1 comment · 2 min read · LW link

Models finding software vulnerabilities is not the primary source of cybersecurity risk

lc

14 May 2026 3:39 UTC

310 points

24 comments · 2 min read · LW link

Claude is Now Alignment-Pretrained

RogerDearnaley

13 May 2026 23:19 UTC

87 points

9 comments · 1 min read · LW link

(www.anthropic.com)

MATS Autumn 2026 Fellowship Applications Now Open—Apply by June 7

Elise Racine, Raj Thimmiah and Ryan Kidd

13 May 2026 21:40 UTC

21 points

0 comments · 2 min read · LW link

Building Connections

Xenomirant and Jamilya Erkenova

13 May 2026 20:27 UTC

8 points

0 comments · 5 min read · LW link

A lack of introspective ability is not a lack of corrigibility

lc

13 May 2026 20:23 UTC

26 points

3 comments · 1 min read · LW link

Cyber Lack of Security and AI Governance

Zvi

13 May 2026 20:20 UTC

41 points

1 comment · 16 min read · LW link

(thezvi.wordpress.com)

Stickiness in AI Behavioral Design

James_T

13 May 2026 19:55 UTC

10 points

0 comments · 14 min read · LW link

(www.forethought.org)

Predicting Rare LLM Failures with 30× Fewer Rollouts

Santiago Aranguri and Francisco Pernice

13 May 2026 17:53 UTC

55 points

3 comments · 5 min read · LW link

Most “inner work” looks like entertainment.

Chris Lakin

13 May 2026 17:51 UTC

48 points

10 comments · 2 min read · LW link

A Research Agenda for Secret Loyalties

Joe Kwon, Alfie Lamerton, draganover, Dave Banerjee, Bronson Schoen, Daniel Kokotajlo, ryan_greenblatt, Owain_Evans, Fabien Roger and Tom Davidson

13 May 2026 17:34 UTC

35 points

3 comments · 3 min read · LW link

Apollo Update May 2026

Marius Hobbhahn

13 May 2026 16:43 UTC

48 points

0 comments · 1 min read · LW link

(www.apolloresearch.ai)

The case for fine-grained tracking of compute for AI

Farhan and Katherine Biewer

13 May 2026 16:00 UTC

36 points

17 comments · 9 min read · LW link

(forum.effectivealtruism.org)

Vibe Excel and the Future of White-Collar Work

ykevinzhang

13 May 2026 15:39 UTC

13 points

5 comments · 6 min read · LW link

“Community organizer” is a double oxymoron

jchan

13 May 2026 15:10 UTC

5 points

13 comments · 5 min read · LW link

Voters are surprisingly open to talking about AI risk

less_raichu

13 May 2026 14:08 UTC

117 points

11 comments · 3 min read · LW link

Civilization as a tower of holes

Joe Rogero

13 May 2026 13:48 UTC

24 points

3 comments · 4 min read · LW link

(subatomicarticles.com)

Applications Open for Impact Accelerator Program

High Impact Professionals

13 May 2026 8:35 UTC

6 points

0 comments · 1 min read · LW link

Epistemic Immunodepression in the Age of AI

Tuyen Tran

13 May 2026 5:49 UTC

15 points

5 comments · 2 min read · LW link

Lorxus Does Budget Inkhaven Again: 4/29, 4/30, Highlights, Postmortem

Lorxus

13 May 2026 1:37 UTC

15 points

0 comments · 3 min read · LW link

(tiled-with-pentagons.blogspot.com)

Guesstimate For Prediction Market Returns

DirectedEvolution

12 May 2026 23:13 UTC

10 points

0 comments · 1 min read · LW link

Probabilistic, Reformative Justice

Leo Schmidt-Traub

12 May 2026 22:41 UTC

2 points

0 comments · 3 min read · LW link

This is a Dating Ad

Xger31

12 May 2026 22:37 UTC

−17 points

6 comments · 3 min read · LW link

Reinforcement Learning, Agency and Taste

epicurus

12 May 2026 18:22 UTC

7 points

0 comments · 9 min read · LW link

Childhood and Education #18: Do The Math

Zvi

12 May 2026 18:20 UTC

56 points

11 comments · 13 min read · LW link

(thezvi.wordpress.com)

The Owned Ones

Eliezer Yudkowsky

12 May 2026 17:56 UTC

369 points

51 comments · 6 min read · LW link

Signaling and Perverse Adoption of Expensive AI

Adam Chlipala

12 May 2026 14:34 UTC

−21 points

2 comments · 8 min read · LW link

On Having Good Hot Takes

Celer

12 May 2026 14:20 UTC

9 points

2 comments · 8 min read · LW link

(keller.substack.com)

Optimisation: Selective versus Predictive

Raymond Douglas

12 May 2026 14:03 UTC

117 points

15 comments · 3 min read · LW link

The Lies and Fallacies of the Buyer and Seller

Hide

12 May 2026 11:26 UTC

28 points

18 comments · 16 min read · LW link

(hidefromit.substack.com)

When should an AI incident trigger an international response? Criteria for international escalation and implications for the design of AI incident frameworks

Josephine Schwab, Francesca Gomez, Mike Harre, Matthew Ball and Lydia Preston

12 May 2026 8:52 UTC

13 points

0 comments · 4 min read · LW link

Verbalised evaluation awareness in language models has little effect on their behaviour

Amelie Knecht, lucas-florin and Thilo Hagendorff

12 May 2026 5:36 UTC

19 points

1 comment · 6 min read · LW link

The terrible weight of seeing the board

philosophybear

12 May 2026 5:13 UTC

1 point

8 comments · 9 min read · LW link

Fibonacci Structure in Harmonic Series Partitions

Avyukth Nilajagi

12 May 2026 4:26 UTC

5 points

1 comment · 2 min read · LW link

Hedging global oil supply shocks?

Nicholas Kross

12 May 2026 1:37 UTC

14 points

2 comments · 1 min read · LW link

Our experience of the first research in a project incubator: much more than you wanted to know

Ivan Belashkin and Iosif Polozov

11 May 2026 20:28 UTC

7 points

0 comments · 10 min read · LW link

I don’t have questions: how a good Jewish boy turns atheist

Semi-Pseudonymous

11 May 2026 20:11 UTC

22 points

4 comments · 6 min read · LW link

Foresight Institute Workshop (Berlin): Bootstrapping Research Agents — Hands-On for Scientists

morisil

11 May 2026 20:11 UTC

1 point

0 comments · 1 min read · LW link

Experience Report: ML4Good AI Governance Bootcamp, Lyon, May 2026

Rohit Mehdiratta

11 May 2026 20:05 UTC

0 points

0 comments · 3 min read · LW link

[Academic questionnaire] Human reasoning in social deduction games vs. LLM reasoning.

atuin

11 May 2026 20:01 UTC

1 point

0 comments · 1 min read · LW link

Where are all the Decision Markets?

alexjaniak

11 May 2026 19:48 UTC

13 points

3 comments · 3 min read · LW link

RFDiffusion3: A Brief Exploration

michaelwaves

11 May 2026 19:26 UTC

3 points

0 comments · 5 min read · LW link