Non-Scheming Saints (Whether Human Or Digital) Might Be Shirking Their Governance Duties, And, If True, It Is Probably An Objective Tragedy

JenniferRM

16 Dec 2025 23:56 UTC

42 points

3 comments, 9 min read

A Primer on Operant Conditioning

foodforthought

16 Dec 2025 21:26 UTC

5 points

0 comments, 4 min read

Towards training-time mitigations for alignment faking in RL

Vlad Mikulik, gasteigerjo, Hoagy, Joe Benton, Benjamin Wright, Jonathan Uesato, Monte M, Fabien Roger and evhub

16 Dec 2025 21:01 UTC

39 points

1 comment, 5 min read

(alignment.anthropic.com)

Measuring Drug Target Success

sarahconstantin

16 Dec 2025 21:00 UTC

19 points

3 comments, 2 min read

(sarahconstantin.substack.com)

A Study in Attention

hamilton

16 Dec 2025 20:39 UTC

14 points

0 comments, 2 min read

Emergent Sycophancy

ohdearohdear

16 Dec 2025 20:21 UTC

8 points

0 comments, 5 min read

Systems of Control

phoenix

16 Dec 2025 19:00 UTC

15 points

3 comments, 22 min read

Discursive Games, Discursive Warfare

Suspended Reason

16 Dec 2025 18:24 UTC

36 points

0 comments, 30 min read

Scientific breakthroughs of the year

technicalities

16 Dec 2025 18:00 UTC

185 points

13 comments, 3 min read

(x.com)

In defense of slop

jasoncrawford

16 Dec 2025 17:36 UTC

20 points

3 comments, 4 min read

(newsletter.rootsofprogress.org)

TSMC most definitely has a golden record of all AI chips it made

Naci Cankaya

16 Dec 2025 17:20 UTC

3 points

0 comments, 1 min read

(nacicankaya.substack.com)

The $140,000 Question

Zvi

16 Dec 2025 16:50 UTC

19 points

0 comments, 15 min read

(thezvi.wordpress.com)

Radiology Automation Does Not Generalize to Other Jobs

Xodarap

16 Dec 2025 14:32 UTC

47 points

5 comments, 1 min read

Fermi paradox solutions map

avturchin

16 Dec 2025 14:21 UTC

27 points

9 comments, 1 min read

According to doctors, how feasible is preserving the dying for future revival?

Ariel Zeleznikow-Johnston

16 Dec 2025 13:18 UTC

18 points

2 comments, 2 min read

(open.substack.com)

A friction in my dealings with friends who have not yet bought into the reality of AI risk

Olle Häggström

16 Dec 2025 8:12 UTC

19 points

13 comments, 4 min read

A Rationalist Christmas

Ryan Meservey

16 Dec 2025 7:23 UTC

5 points

1 comment, 4 min read

[Question] Why do LLMs so often say “It’s not an X, it’s a Y”?

ChristianKl

16 Dec 2025 1:02 UTC

27 points

13 comments, 1 min read

Response to titotal’s critique of our AI 2027 timelines model

elifland and Daniel Kokotajlo

16 Dec 2025 0:51 UTC

46 points

6 comments, 43 min read

(aifuturesnotes.substack.com)

Introducing Lunette: auditing agents for evals and environments

zef, leni and kaivu

15 Dec 2025 23:17 UTC

23 points

0 comments, 1 min read

(fulcrumresearch.ai)

Private AI clouds are the future of inference

perfectfwd

15 Dec 2025 23:04 UTC

3 points

0 comments, 9 min read

(perfectforward.substack.com)

Naming

CTA

15 Dec 2025 23:00 UTC

3 points

0 comments, 4 min read

Viewing animals as economic agents

foodforthought

15 Dec 2025 18:13 UTC

10 points

2 comments, 5 min read

Digital Freedom Fund open for grant applications (Deadline: 17th February)

gergogaspar

15 Dec 2025 16:25 UTC

8 points

0 comments, 1 min read

Luna Lovegood and the Chamber of Secrets, Part 9

Kongo Landwalker and lsusr

15 Dec 2025 16:01 UTC

2 points

0 comments, 1 min read

Defending Against Model Weight Exfiltration Through Inference Verification

Roy Rinberg, Adam Karvonen, dreuter and Keri Warr

15 Dec 2025 15:26 UTC

120 points

15 comments, 8 min read

Rotations in Superposition

Linda Linsefors and Lucius Bushnaq

15 Dec 2025 14:58 UTC

54 points

6 comments, 11 min read

What is an evaluation, and why this definition matters

Igor Ivanov

15 Dec 2025 14:53 UTC

33 points

1 comment, 7 min read

Conscious stars

Alexandre Variengien

15 Dec 2025 14:49 UTC

7 points

0 comments, 4 min read

(alexandrevariengien.com)

A Case for Model Persona Research

nielsrolf, Maxime Riché and Daniel Tan

15 Dec 2025 13:35 UTC

121 points

11 comments, 4 min read

GPT-5.2 Is Frontier Only For The Frontier

Zvi

15 Dec 2025 13:20 UTC

33 points

1 comment, 19 min read

(thezvi.wordpress.com)

[Question] How to account for misinformation when looking for effective altruist causes?

SpectrumDT

15 Dec 2025 13:13 UTC

8 points

2 comments, 1 min read

Do you love Berkeley, or do you just love Lighthaven conferences?

Screwtape

15 Dec 2025 7:48 UTC

86 points

4 comments, 5 min read

When bits of optimization imply bits of modeling: the Touchette-Lloyd theorem

Alfred Harwood and Alex_Altair

15 Dec 2025 4:21 UTC

32 points

0 comments, 11 min read

Notes on Software-Based Compute-Usage Verification

Alek Westover

15 Dec 2025 3:40 UTC

9 points

0 comments, 12 min read

Designing a Job Displacement Model

claywren

14 Dec 2025 22:23 UTC

22 points

0 comments, 19 min read

A high integrity/epistemics political coalition?

Raemon

14 Dec 2025 22:21 UTC

149 points

34 comments, 13 min read

Fanning Radiators

jefftk

14 Dec 2025 21:10 UTC

14 points

0 comments, 1 min read

(www.jefftk.com)

Abstraction as a generalization of algorithmic Markov condition

Daniel C

14 Dec 2025 18:55 UTC

8 points

0 comments, 7 min read

No, Americans Don’t Think Foreign Aid Is 26% of the Budget

Julius

14 Dec 2025 18:47 UTC

67 points

18 comments, 5 min read

(thegreymatter.substack.com)

A Life That Cannot Be A Failure

Bentham's Bulldog

14 Dec 2025 16:40 UTC

−7 points

0 comments, 5 min read

Should LLMs accept invites to Epstein’s island?

Lukas Petersson

14 Dec 2025 15:21 UTC

5 points

0 comments, 1 min read

(lukaspetersson.com)

The Axiom of Choice is Not Controversial

GenericModel

14 Dec 2025 4:08 UTC

44 points

29 comments, 7 min read

(enrichedjamsham.substack.com)

Open Source Replication of the Auditing Game Model Organism

abhayesian

14 Dec 2025 2:10 UTC

24 points

0 comments, 1 min read

(alignment.anthropic.com)

Why did I believe Oliver Sacks?

Eye You

13 Dec 2025 23:39 UTC

70 points

17 comments, 1 min read

In Favor of Inkhaven-But-Less

Alice Blair

13 Dec 2025 23:16 UTC

26 points

6 comments, 2 min read

Micro-visions for AI-powered online content

Alexandre Variengien

13 Dec 2025 23:05 UTC

11 points

0 comments, 8 min read

(alexandrevariengien.com)

When is it Worth Working?

foodforthought

13 Dec 2025 21:40 UTC

23 points

1 comment, 6 min read

[Question] What does “lattice of abstraction” mean?

Adam Zerner

13 Dec 2025 21:19 UTC

11 points

8 comments, 1 min read

Filler tokens don’t allow sequential reasoning

Brendan Long

13 Dec 2025 20:22 UTC

77 points

5 comments, 1 min read

(www.brendanlong.com)