What’s going on with AI progress and trends? (As of 5/2025)

ryan_greenblatt · 2 May 2025 19:00 UTC
75 points · 8 comments · 8 min read · LW link

When AI Optimizes for the Wrong Thing

Anthony Fox · 2 May 2025 18:00 UTC
5 points · 0 comments · 1 min read · LW link

Alignment Structure Direction - Recursive Adversarial Oversight (RAO)

Jayden Shepard · 2 May 2025 17:51 UTC
2 points · 0 comments · 2 min read · LW link

AI Welfare Risks

Adrià Moret · 2 May 2025 17:49 UTC
6 points · 0 comments · 1 min read · LW link
(philpapers.org)

Philosoplasticity: On the Inevitable Drift of Meaning in Recursive Self-Interpreting Systems

Maikol Coin · 2 May 2025 17:46 UTC
−1 points · 0 comments · 4 min read · LW link

Supermen of the (Not so Far) Future

TerriLeaf · 2 May 2025 15:55 UTC
9 points · 0 comments · 4 min read · LW link

Steering Language Models in Multiple Directions Simultaneously

2 May 2025 15:27 UTC
18 points · 0 comments · 7 min read · LW link

AI Incident Monitoring: A Brief Analysis

Spencer Ames · 2 May 2025 15:06 UTC
3 points · 0 comments · 5 min read · LW link

RA x ControlAI video: What if AI just keeps getting smarter?

Writer · 2 May 2025 14:19 UTC
100 points · 17 comments · 9 min read · LW link

OpenAI Preparedness Framework 2.0

Zvi · 2 May 2025 13:10 UTC
61 points · 1 comment · 23 min read · LW link
(thezvi.wordpress.com)

Ex-OpenAI employee amici leave to file denied in Musk v OpenAI case?

TFD · 2 May 2025 12:27 UTC
4 points · 6 comments · 2 min read · LW link
(www.thefloatingdroid.com)

Roads are at maximum efficiency always

Hruss · 2 May 2025 10:29 UTC
1 point · 3 comments · 1 min read · LW link

The Continuum Fallacy and its Relatives

Zero Contradictions · 2 May 2025 2:58 UTC
4 points · 2 comments · 4 min read · LW link
(thewaywardaxolotl.blogspot.com)

Memory Decoding Journal Club: Motor learning selectively strengthens cortical and striatal synapses of motor engram neurons

Devin Ward · 1 May 2025 23:52 UTC
1 point · 0 comments · 1 min read · LW link

My Research Process: Understanding and Cultivating Research Taste

Neel Nanda · 1 May 2025 23:08 UTC
26 points · 1 comment · 9 min read · LW link

AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions

1 May 2025 22:46 UTC
105 points · 7 comments · 8 min read · LW link
(techgov.intelligence.org)

How to specify an alignment target

Richard Juggins · 1 May 2025 21:11 UTC
14 points · 2 comments · 12 min read · LW link

Obstacles in ARC’s agenda: Mechanistic Anomaly Detection

David Matolcsi · 1 May 2025 20:51 UTC
42 points · 1 comment · 11 min read · LW link

AI-Generated GitHub repo backdated with junk then filled with my systems work. Has anyone seen this before?

rgunther · 1 May 2025 20:14 UTC
7 points · 1 comment · 1 min read · LW link

What is Inadequate about Bayesianism for AI Alignment: Motivating Infra-Bayesianism

Brittany Gelb · 1 May 2025 19:06 UTC
17 points · 0 comments · 7 min read · LW link

Can LLMs Simulate Internal Evaluation? A Case Study in Self-Generated Recommendations

The Neutral Mind · 1 May 2025 19:04 UTC
4 points · 0 comments · 2 min read · LW link

Superhuman Coders in AI 2027 - Not So Fast

1 May 2025 18:56 UTC
66 points · 0 comments · 5 min read · LW link

AI #114: Liars, Sycophants and Cheaters

Zvi · 1 May 2025 14:00 UTC
40 points · 6 comments · 63 min read · LW link
(thezvi.wordpress.com)

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall

Vladimir_Nesov · 1 May 2025 13:54 UTC
181 points · 22 comments · 5 min read · LW link

Anthropomorphizing AI might be good, actually

Seth Herd · 1 May 2025 13:50 UTC
35 points · 6 comments · 3 min read · LW link

Don’t focus on updating P(doom)

Algon · 1 May 2025 11:10 UTC
7 points · 3 comments · 2 min read · LW link

Prioritizing Work

jefftk · 1 May 2025 2:00 UTC
106 points · 11 comments · 1 min read · LW link
(www.jefftk.com)

Don’t rely on a “race to the top”

sjadler · 1 May 2025 0:33 UTC
5 points · 0 comments · 1 min read · LW link

Meta-Technicalities: Safeguarding Values in Formal Systems

LTM · 30 Apr 2025 23:43 UTC
2 points · 0 comments · 3 min read · LW link
(routecause.substack.com)

Obstacles in ARC’s agenda: Finding explanations

David Matolcsi · 30 Apr 2025 23:03 UTC
123 points · 10 comments · 17 min read · LW link

GPT-4o Responds to Negative Feedback

Zvi · 30 Apr 2025 20:20 UTC
45 points · 2 comments · 18 min read · LW link
(thezvi.wordpress.com)

State of play of AI progress (and related brakes on an intelligence explosion) [Linkpost]

Noosphere89 · 30 Apr 2025 19:58 UTC
7 points · 0 comments · 5 min read · LW link
(www.interconnects.ai)

Don’t accuse your interlocutor of being insufficiently truth-seeking

TFD · 30 Apr 2025 19:38 UTC
30 points · 15 comments · 2 min read · LW link
(www.thefloatingdroid.com)

How can we solve diffuse threats like research sabotage with AI control?

Vivek Hebbar · 30 Apr 2025 19:23 UTC
52 points · 1 comment · 8 min read · LW link

[Question] Can Narrowing One’s Reference Class Undermine the Doomsday Argument?

Iannoose n. · 30 Apr 2025 18:24 UTC
2 points · 1 comment · 1 min read · LW link

[Question] Does there exist an interactive reasoning map tool that lets users visually lay out claims, assign probabilities and confidence levels, and dynamically adjust their beliefs based on weighted influences between connected assertions?

Zack Friedman · 30 Apr 2025 18:22 UTC
5 points · 4 comments · 1 min read · LW link

Distilling the Internal Model Principle part II

JoseFaustino · 30 Apr 2025 17:56 UTC
15 points · 0 comments · 19 min read · LW link

Research Priorities for Hardware-Enabled Mechanisms (HEMs)

aog · 30 Apr 2025 17:43 UTC
17 points · 2 comments · 15 min read · LW link
(www.longview.org)

Video and transcript of talk on automating alignment research

Joe Carlsmith · 30 Apr 2025 17:43 UTC
21 points · 0 comments · 24 min read · LW link
(joecarlsmith.com)

Can we safely automate alignment research?

Joe Carlsmith · 30 Apr 2025 17:37 UTC
54 points · 29 comments · 48 min read · LW link
(joecarlsmith.com)

Investigating task-specific prompts and sparse autoencoders for activation monitoring

Henk Tillman · 30 Apr 2025 17:09 UTC
23 points · 0 comments · 1 min read · LW link
(arxiv.org)

European Links (30.04.25)

Martin Sustrik · 30 Apr 2025 15:40 UTC
15 points · 1 comment · 8 min read · LW link
(250bpm.substack.com)

Scaling Laws for Scalable Oversight

30 Apr 2025 12:13 UTC
34 points · 0 comments · 9 min read · LW link

Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis

30 Apr 2025 11:06 UTC
211 points · 11 comments · 11 min read · LW link

[Paper] Automated Feature Labeling with Token-Space Gradient Descent

Wuschel Schulz · 30 Apr 2025 10:22 UTC
4 points · 0 comments · 4 min read · LW link

A single principle related to many Alignment subproblems?

Q Home · 30 Apr 2025 9:49 UTC
37 points · 34 comments · 17 min read · LW link

What if Brain Computer Interfaces went exponential?

Stephen Martin · 30 Apr 2025 5:07 UTC
−1 points · 0 comments · 12 min read · LW link

Interpreting the METR Time Horizons Post

snewman · 30 Apr 2025 3:03 UTC
66 points · 12 comments · 10 min read · LW link
(amistrongeryet.substack.com)

Should we expect the future to be good?

Neil Crawford · 30 Apr 2025 0:36 UTC
15 points · 0 comments · 14 min read · LW link

Judging types of consequentialism by influence and normativity

Cole Wyeth · 29 Apr 2025 23:25 UTC
19 points · 0 comments · 2 min read · LW link