Is Gemini now better than Claude at Pokémon?

Julian Bradshaw19 Apr 2025 23:34 UTC

92 points

12 comments5 min readLW link

Impact, agency, and taste

benkuhn19 Apr 2025 21:10 UTC

210 points

10 comments8 min readLW link

(www.benkuhn.net)

Moral patienthood of simulated minds allows uncountable infinity of value on finite hardware

Luck19 Apr 2025 20:41 UTC

2 points

13 comments2 min readLW link

When the Model Starts Talking Like Me: A User-Induced Structural Adaptation Case Study

Junxi19 Apr 2025 19:40 UTC

3 points

1 comment4 min readLW link

A Block-Based Regularization Proposal for Neural Networks

Otto.Dev19 Apr 2025 18:56 UTC

−8 points

0 comments1 min readLW link

How Close We Are to a Complete List of Imprinted Genes

Morpheus19 Apr 2025 18:37 UTC

30 points

3 comments14 min readLW link

(www.tassiloneubauer.com)

Novel Idea Generation in LLMs: Judgment as Bottleneck

Davey Morse19 Apr 2025 15:37 UTC

−2 points

1 comment1 min readLW link

Why Should I Assume CCP AGI is Worse Than USG AGI?

Tomás B.19 Apr 2025 14:47 UTC

268 points

89 comments1 min readLW link

An Introduction to SAEs and their Variants for Mech Interp

Adam Newgas19 Apr 2025 14:09 UTC

17 points

0 comments10 min readLW link

AI Advances and Detection Strategy

jefftk19 Apr 2025 11:40 UTC

11 points

0 comments1 min readLW link

(www.jefftk.com)

The System Didn’t, and Doesn’t Need to be This Way ~ Thomas Paine on Economic Justice

James Stephen Brown19 Apr 2025 5:16 UTC

2 points

3 comments4 min readLW link

(nonzerosum.games)

SecureDrop review

samuelshadrach19 Apr 2025 4:29 UTC

2 points

0 comments5 min readLW link

(samuelshadrach.com)

AI, Alignment & the Art of Relationship Design

Priyanka Bharadwaj19 Apr 2025 0:47 UTC

6 points

4 comments2 min readLW link

Measuring Beliefs of Language Models During Chain-of-Thought Reasoning

Baram Sosis and Tomáš Gavenčiak 18 Apr 2025 22:56 UTC

12 points

0 comments13 min readLW link

LLM-based Fact Checking for Popular Posts?

azergante18 Apr 2025 21:26 UTC

1 point

2 comments62 min readLW link

o3 Will Use Its Tools For You

Zvi18 Apr 2025 21:20 UTC

46 points

3 comments45 min readLW link

(thezvi.wordpress.com)

AI Control Methods Literature Review

Ram Potham18 Apr 2025 21:15 UTC

11 points

1 comment9 min readLW link

Consequentialists should have a comprehensive set of deontological beliefs they adhere to

Jay95 18 Apr 2025 20:50 UTC

3 points

2 comments1 min readLW link

What Makes an AI Startup “Net Positive” for Safety?

jacquesthibs18 Apr 2025 20:33 UTC

82 points

23 comments2 min readLW link

Alignment Does Not Need to Be Opaque! An Introduction to Feature Steering with Reinforcement Learning

Jeremias Ferrao18 Apr 2025 19:34 UTC

10 points

0 comments10 min readLW link

Evaluating Collaborative AI Performance Subject to Sabotage

Matthew Khoriaty18 Apr 2025 19:33 UTC

3 points

0 comments19 min readLW link

Inside OpenAI’s Controversial Plan to Abandon its Nonprofit Roots

garrison18 Apr 2025 18:46 UTC

21 points

0 comments11 min readLW link

(garrisonlovely.substack.com)

Could LLMs Learn to Detect Bias Autonomously, Like Tesla’s Self-Driving Cars?

Omnipheasant18 Apr 2025 18:45 UTC

0 points

0 comments3 min readLW link

Scaffolding Skills

Screwtape18 Apr 2025 17:39 UTC

37 points

9 comments4 min readLW link

[Rockville] Rationalist Shabbat

maia18 Apr 2025 15:38 UTC

8 points

0 comments1 min readLW link

Handling schemers if shutdown is not an option

Buck18 Apr 2025 14:39 UTC

43 points

2 comments14 min readLW link

British and American Connotations

jefftk18 Apr 2025 13:00 UTC

14 points

4 comments1 min readLW link

(www.jefftk.com)

Towards Understanding the Representation of Belief State Geometry in Transformers

Karthik Viswanathan18 Apr 2025 12:39 UTC

5 points

0 comments12 min readLW link

Training AGI in Secret would be Unsafe and Unethical

Daniel Kokotajlo18 Apr 2025 12:27 UTC

144 points

16 comments6 min readLW link

Karma Tests in Logical Counterfactual Simulations motivates strong agents to protect weak agents

Knight Lee18 Apr 2025 11:11 UTC

9 points

8 comments3 min readLW link

What If Galaxies Are Alive and Atoms Have Minds? A Thought Experiment on Life Across Scales

Saif Khan18 Apr 2025 10:01 UTC

−2 points

5 comments3 min readLW link

Three Months In, Evaluating Three Rationalist Cases for Trump

Arjun Panickssery18 Apr 2025 8:27 UTC

118 points

33 comments4 min readLW link

[Question] Comprehensive up-to-date resources on the Chinese Communist Party’s AI strategy, etc?

Mateusz Bagiński18 Apr 2025 4:58 UTC

14 points

6 comments1 min readLW link

Conditional Forecasting as Model Parameterization

Molly18 Apr 2025 2:35 UTC

15 points

0 comments7 min readLW link

(cuttyshark.substack.com)

One Night in Delphi

Eggs18 Apr 2025 2:17 UTC

4 points

2 comments3 min readLW link

The Russell Conjugation Illuminator

TimmyM17 Apr 2025 19:33 UTC

51 points

14 comments1 min readLW link

(russellconjugations.com)

Announcing Progress Conference 2025

jasoncrawford17 Apr 2025 17:12 UTC

12 points

0 comments1 min readLW link

(newsletter.rootsofprogress.org)

The Mirror Paradox

Jeremy Kraybill17 Apr 2025 16:23 UTC

−6 points

0 comments1 min readLW link

Memory Decoding Journal Club

Devin Ward17 Apr 2025 16:19 UTC

1 point

0 comments1 min readLW link

Host Keys and SSHing to EC2

jefftk17 Apr 2025 15:10 UTC

10 points

6 comments1 min readLW link

(www.jefftk.com)

AI #112: Release the Everything

Zvi17 Apr 2025 15:10 UTC

41 points

6 comments40 min readLW link

(thezvi.wordpress.com)

On AI personhood

p.b.17 Apr 2025 12:31 UTC

4 points

7 comments1 min readLW link

Automating Mechanistic Interpretability via Program Synthesis

Edy Nastase17 Apr 2025 10:58 UTC

1 point

1 comment1 min readLW link

Understanding and overcoming AGI apathy

Dhruv Sumathi17 Apr 2025 1:04 UTC

25 points

1 comment13 min readLW link

(dhruvsumathi.substack.com)

ALLFED emergency appeal: Help us raise $800,000 to avoid cutting half of programs

denkenberger16 Apr 2025 21:47 UTC

49 points

9 comments3 min readLW link

Prodromes and Biomarkers in Chronic Disease

sarahconstantin16 Apr 2025 21:30 UTC

23 points

2 comments3 min readLW link

(sarahconstantin.substack.com)

The Practical Imperative for AI Control Research

Archana Vaidheeswaran16 Apr 2025 20:27 UTC

1 point

0 comments4 min readLW link

METR’s preliminary evaluation of o3 and o4-mini

Christopher King16 Apr 2025 20:23 UTC

14 points

7 comments1 min readLW link

(metr.github.io)

Mass Exposure Paradox

max-sixty16 Apr 2025 20:18 UTC

6 points

2 comments2 min readLW link

GPT-4.5 is Cognitive Empathy, Sonnet 3.5 is Affective Empathy

Jack16 Apr 2025 19:12 UTC

15 points

2 comments4 min readLW link