VLAs as Model Organisms for AI Safety

TheSigillite

18 Jan 2026 22:40 UTC

16 points

0 comments · 6 min read · LW link

“The first two weeks are the hardest”: my first digital declutter

mingyuan

18 Jan 2026 22:04 UTC

219 points

11 comments · 2 min read · LW link

(mingyuan.substack.com)

When the LLM isn’t the one who’s wrong

Julian Bradshaw

18 Jan 2026 21:37 UTC

81 points

9 comments · 2 min read · LW link

Lifelink™: Freedom for your Child

TsviBT

18 Jan 2026 20:35 UTC

9 points

1 comment · 3 min read · LW link

How to Love Them Equally

Shoshannah Tekofsky

18 Jan 2026 17:09 UTC

38 points

5 comments · 2 min read · LW link

(shoshanigans.substack.com)

Massive Activations in DroPE: Evidence for Attention Reorganization

David Africa

18 Jan 2026 15:05 UTC

19 points

0 comments · 8 min read · LW link

Irrationality as a Defense Mechanism for Reward-hacking

Ashe Vazquez Nuñez

18 Jan 2026 3:57 UTC

49 points

8 comments · 4 min read · LW link

Blogging, Writing, Musing, And Thinking

sonicrocketman

18 Jan 2026 3:28 UTC

11 points

4 comments · 3 min read · LW link

(brianschrader.com)

Is METR Underestimating LLM Time Horizons?

andreasrobinson

18 Jan 2026 1:19 UTC

40 points

6 comments · 17 min read · LW link

Understanding Trust: Project Update

abramdemski

17 Jan 2026 21:19 UTC

66 points

0 comments · 2 min read · LW link

Focusing on Flourishing Even When Survival is Unlikely (Part I)

Cleo Nardo

17 Jan 2026 18:47 UTC

24 points

3 comments · 4 min read · LW link

The truth behind the 2026 J.P. Morgan Healthcare Conference

Abhishaike Mahajan

17 Jan 2026 17:28 UTC

83 points

35 comments · 9 min read · LW link

(www.owlposting.com)

Japan is a bank

bhauth

17 Jan 2026 16:33 UTC

22 points

2 comments · 1 min read · LW link

(www.bhauth.com)

Turning Down the Overthinking: How Cathodal Brain Stimulation Could Transform Stuttering Therapy

Rudaiba

17 Jan 2026 14:54 UTC

9 points

0 comments · 8 min read · LW link

What Washington Says About AGI

Zephaniah Roe

17 Jan 2026 5:43 UTC

134 points

7 comments · 6 min read · LW link

Lightcone is hiring a generalist, a designer, and a campus operations co-lead

habryka

17 Jan 2026 1:47 UTC

118 points

0 comments · 5 min read · LW link

Applying to MATS: What the Program Is Like, and Who It’s For

Raj Thimmiah, Elise Racine and Ryan Kidd

17 Jan 2026 0:25 UTC

24 points

1 comment · 5 min read · LW link

Forfeiting Ill-Gotten Gains

jefftk

17 Jan 2026 0:20 UTC

47 points

6 comments · 1 min read · LW link

(www.jefftk.com)

Is It Reasoning or Just a Fixed Bias?

Sriram Kiron

16 Jan 2026 21:43 UTC

14 points

0 comments · 1 min read · LW link

(ramaway.com)

Future-as-Label: Scalable Supervision from Real-World Outcomes

Ben Turtel

16 Jan 2026 21:21 UTC

−1 points

2 comments · 1 min read · LW link

Comparing yourself to other people

dominicq

16 Jan 2026 20:31 UTC

10 points

3 comments · 2 min read · LW link

(sundaystopwatch.eu)

Eliciting Frontier Model Character Training

avikrishna

16 Jan 2026 20:15 UTC

1 point

0 comments · 7 min read · LW link

Precedents for the Unprecedented: Historical Analogies for Thirteen Artificial Superintelligence Risks

James_Miller

16 Jan 2026 18:43 UTC

165 points

15 comments · 63 min read · LW link

Why falling labor share ≠ falling employment

Lydia Nottingham

16 Jan 2026 17:27 UTC

−5 points

3 comments · 2 min read · LW link

(lydianottingham.substack.com)

Digital Minds: A Quickstart Guide

Avi Parrack and Štěpán Los

16 Jan 2026 17:15 UTC

10 points

1 comment · 22 min read · LW link

(aviparrack.substack.com)

The culture and design of human-AI interactions

zef

16 Jan 2026 17:11 UTC

2 points

0 comments · 4 min read · LW link

(bloodsteel.substack.com)

Confession: I pranked Inkhaven to make sure no one fails

Mikhail Samin

16 Jan 2026 16:03 UTC

52 points

1 comment · 10 min read · LW link

(open.substack.com)

Monthly Roundup #38: January 2026

Zvi

16 Jan 2026 15:10 UTC

22 points

3 comments · 26 min read · LW link

(thezvi.wordpress.com)

Scaling Laws for Economic Impacts: Experimental Evidence from 500 Professionals and 13 LLMs

Ali Merali

16 Jan 2026 13:40 UTC

21 points

6 comments · 4 min read · LW link

[Pre-print] Building safe AGI as an ergonomics problem

ricardotkcl and paris.lalousis@kcl.ac.uk

16 Jan 2026 13:18 UTC

1 point

0 comments · 1 min read · LW link

(doi.org)

Powerful misaligned AIs may be extremely persuasive, especially absent mitigations

Cody Rushing

16 Jan 2026 8:08 UTC

68 points

5 comments · 14 min read · LW link

How to Use Foam Earplugs Correctly

Morpheus

16 Jan 2026 7:47 UTC

8 points

2 comments · 1 min read · LW link

(www.tassiloneubauer.com)

Should control down-weight negative net-sabotage-value threats?

Fabien Roger

16 Jan 2026 4:18 UTC

35 points

0 comments · 10 min read · LW link

The Default Contra Dance Weekend Deal

jefftk

16 Jan 2026 0:50 UTC

12 points

0 comments · 5 min read · LW link

(www.jefftk.com)

Total utilitarianism is fine

Abhimanyu Pallavi Sudhir

16 Jan 2026 0:32 UTC

4 points

3 comments · 3 min read · LW link

Test your interpretability techniques by de-censoring Chinese models

Khoi Tran, aryaj, Senthooran Rajamanoharan and Neel Nanda

15 Jan 2026 16:33 UTC

91 points

14 comments · 20 min read · LW link

Reflections on TA-ing Harvard’s first AI safety course

Roy Rinberg

15 Jan 2026 16:28 UTC

79 points

4 comments · 9 min read · LW link

I Made a Judgment Calibration Game for Beginners (Calibrate)

Luise Woehlke

15 Jan 2026 15:04 UTC

15 points

2 comments · 1 min read · LW link

AI #151: While Claude Coworks

Zvi

15 Jan 2026 14:30 UTC

38 points

5 comments · 31 min read · LW link

(thezvi.wordpress.com)

Corrigibility Scales To Value Alignment

PeterMcCluskey

15 Jan 2026 0:05 UTC

13 points

12 comments · 5 min read · LW link

(bayesianinvestor.com)

Deeper Reviews for the top 15 (of the 2024 Review)

Raemon

14 Jan 2026 23:59 UTC

45 points

4 comments · 5 min read · LW link

If we get primary cruxes right, secondary cruxes will be solved automatically

Jordan Arel

14 Jan 2026 22:44 UTC

1 point

1 comment · 4 min read · LW link

Boltzmann Tulpas

Mariven

14 Jan 2026 21:45 UTC

21 points

6 comments · 13 min read · LW link

(mariven.substack.com)

Status In A Tribe Of One

J Bostock

14 Jan 2026 20:44 UTC

27 points

2 comments · 2 min read · LW link

Quantifying Love and Hatred

RobinHa

14 Jan 2026 20:40 UTC

10 points

8 comments · 1 min read · LW link

Why we are excited about confession!

Boaz Barak, Gabriel Wu and Manas Joglekar

14 Jan 2026 20:37 UTC

138 points

32 comments · 9 min read · LW link

(alignment.openai.com)

Why Motivated Reasoning?

johnswentworth

14 Jan 2026 19:55 UTC

78 points

20 comments · 5 min read · LW link

When Will They Take Our Jobs?

Zvi

14 Jan 2026 19:40 UTC

39 points

1 comment · 8 min read · LW link

(thezvi.wordpress.com)

The Many Ways of Knowing

Gordon Seidoh Worley

14 Jan 2026 17:00 UTC

18 points

1 comment · 5 min read · LW link

(www.uncertainupdates.com)

GD Roundup #4 - inference, monopolies, and AI Jesus

Raymond Douglas

14 Jan 2026 15:43 UTC

38 points

0 comments · 6 min read · LW link