Skill: cognitive black box flight recorder

TsviBT 24 Jan 2026 22:54 UTC

27 points

2 comments 5 min read LW link

In Defense of Memorization

David Goodman 24 Jan 2026 22:49 UTC

24 points

7 comments 13 min read LW link

Thinking from the Other Side: Should I Wash My Hair with Shampoo?

R0sberg 24 Jan 2026 22:47 UTC

6 points

1 comment 2 min read LW link

Small language models hallucinate knowing something’s off.

Toheed 24 Jan 2026 22:46 UTC

12 points

0 comments 5 min read LW link

IABIED Book Review: Core Arguments and Counterarguments

Stephen McAleese 24 Jan 2026 14:25 UTC

90 points

39 comments 25 min read LW link

The Global AI Dataset (GAID) Project: From Closing Research Gaps to Building Responsible and Trustworthy AI

Jason Hung 24 Jan 2026 3:23 UTC

7 points

0 comments 15 min read LW link

A Black Box Made Less Opaque (part 1)

Matthew McDonnell 24 Jan 2026 3:20 UTC

6 points

0 comments 12 min read LW link

A Simple Method for Accelerating Grokking

josh :) 24 Jan 2026 3:19 UTC

14 points

1 comment 3 min read LW link

Who is choosing your preferences- You or your Mind?

shanzson 24 Jan 2026 3:17 UTC

0 points

4 comments 1 min read LW link

How I Used Methodable to Have a Nice Tuesday

dnsosebee 24 Jan 2026 2:57 UTC

4 points

0 comments 10 min read LW link

AI X-Risk Bottleneck = Advocacy?

fortytwo 24 Jan 2026 2:52 UTC

10 points

0 comments 1 min read LW link

Every Benchmark is Broken

Jonathan Gabor 24 Jan 2026 2:42 UTC

95 points

0 comments 4 min read LW link

(jonathanpgabor.substack.com)

Thousand Year Old Advice on Relinquishing Control to AI

Dom Polsinelli 24 Jan 2026 2:20 UTC

−3 points

2 comments 3 min read LW link

(dompols.substack.com)

AI Must Learn to Police Itself

savant 23 Jan 2026 22:39 UTC

1 point

0 comments 2 min read LW link

Condensation & Relevance

abramdemski 23 Jan 2026 22:21 UTC

38 points

0 comments 5 min read LW link

Paying attention to Attention Sinks

Mitali M 23 Jan 2026 21:40 UTC

11 points

5 comments 1 min read LW link

Dating Roundup #11: Going Too Meta

Zvi 23 Jan 2026 20:50 UTC

40 points

4 comments 14 min read LW link

(thezvi.wordpress.com)

The Artificial Man

Jack Bradshaw 23 Jan 2026 19:55 UTC

1 point

0 comments 2 min read LW link

The Long View Of History

sonicrocketman 23 Jan 2026 19:30 UTC

10 points

2 comments 2 min read LW link

(brianschrader.com)

Emergency Response Measures for Catastrophic AI Risk

MKodama 23 Jan 2026 18:18 UTC

27 points

2 comments 3 min read LW link

Eliciting base models with simple unsupervised techniques

Callum Canavan, Aditya Shrivastava, Allison Qi, Tianyi (Alex) Qiu, Jonathan Michala and Fabien Roger

23 Jan 2026 18:06 UTC

34 points

2 comments 8 min read LW link

New version of “Intro to Brain-Like-AGI Safety”

Steven Byrnes 23 Jan 2026 16:21 UTC

63 points

1 comment 19 min read LW link

Automated Alignment Research, Abductively

future_detective 23 Jan 2026 16:14 UTC

2 points

0 comments 2 min read LW link

Digital Consciousness Model Results and Key Takeaways

arvomm, derek shiller, David Moss, Adrià Moret and ChrisPercy

23 Jan 2026 15:58 UTC

15 points

0 comments 6 min read LW link

From Neurons to Newtons: What can the brain teach us about physics?

Carly Turini 23 Jan 2026 15:20 UTC

1 point

0 comments 1 min read LW link

A Framework for Eval Awareness

LAThomson 23 Jan 2026 10:16 UTC

38 points

5 comments 8 min read LW link

All Of The Good Things, None Of The Bad Things

omegastick 23 Jan 2026 9:50 UTC

8 points

1 comment 1 min read LW link

(dumbideas.xyz)

Are Short AI Timelines Really Higher-Leverage?

Mia Taylor and wdmacaskill

23 Jan 2026 7:28 UTC

25 points

1 comment 15 min read LW link

(www.forethought.org)

Principles for Meta-Science and AI Safety Replications

Zephaniah Roe 23 Jan 2026 6:59 UTC

47 points

7 comments 4 min read LW link

Value Learning Needs a Low-Dimensional Bottleneck

Gunnar_Zarncke 23 Jan 2026 2:12 UTC

24 points

7 comments 1 min read LW link

A quick, elegant derivation of Bayes’ Theorem

RohanS 23 Jan 2026 1:40 UTC

37 points

7 comments 1 min read LW link

The World Hasn’t Gone Mad

goldfine 23 Jan 2026 0:01 UTC

19 points

3 comments 2 min read LW link

(itsnotgambling.substack.com)

Like night and day: Light glasses and dark therapy can treat non-24 (and SAD)

JennaS 22 Jan 2026 23:23 UTC

30 points

1 comment 9 min read LW link

Does Pentagon Pizza Theory Work?

rba 22 Jan 2026 19:24 UTC

140 points

11 comments 5 min read LW link

(goflaw.substack.com)

The phases of an AI takeover

sjadler 22 Jan 2026 19:09 UTC

12 points

1 comment 9 min read LW link

(stevenadler.substack.com)

Will we get automated alignment research before an AI Takeoff?

Jan Wehner 22 Jan 2026 17:46 UTC

33 points

2 comments 11 min read LW link

[Question] How Could I Have Learned That Faster?

Dom Polsinelli 22 Jan 2026 17:35 UTC

9 points

4 comments 2 min read LW link

AI can suddenly become dangerous despite gradual progress

Simon Lermen 22 Jan 2026 16:47 UTC

15 points

0 comments 4 min read LW link

(simonlermen.substack.com)

Releasing TakeOverBench.com: a benchmark, for AI takeover

otto.barten 22 Jan 2026 16:34 UTC

16 points

5 comments 1 min read LW link

AI #152: Brought To You By The Torment Nexus

Zvi 22 Jan 2026 14:40 UTC

35 points

5 comments 56 min read LW link

(thezvi.wordpress.com)

Resisting Reality

robertzk 22 Jan 2026 13:50 UTC

26 points

3 comments 6 min read LW link

Experiments on Reward Hacking Monitorability in Language Models

Monketo 22 Jan 2026 2:42 UTC

9 points

0 comments 8 min read LW link

Neural chameleons can(’t) hide from activation oracles

ceselder 22 Jan 2026 1:47 UTC

55 points

5 comments 3 min read LW link

Dedicated continuous supervision of AI companies

Michael Bennett 22 Jan 2026 1:47 UTC

8 points

0 comments 15 min read LW link

Uncovering Unfaithful CoT in Deceptive Models

Agastya Agrawal 22 Jan 2026 1:46 UTC

12 points

2 comments 3 min read LW link

Claude’s Constitution is an excellent guide for humans, too

Eye You 22 Jan 2026 1:26 UTC

27 points

0 comments 5 min read LW link

The first type of transformative AI?

Lizka 21 Jan 2026 23:47 UTC

19 points

0 comments 1 min read LW link

(www.forethought.org)

How (and why) to read Drexler on AI

owencb 21 Jan 2026 23:25 UTC

55 points

12 comments 6 min read LW link

(strangecities.substack.com)

Finding Yourself in Others

1a3orn 21 Jan 2026 23:22 UTC

51 points

1 comment 4 min read LW link

AI Risks Slip Out of Mind

MarkelKori 21 Jan 2026 22:30 UTC

5 points

1 comment 1 min read LW link