
AI risk was not invented by AI CEOs to hype their companies

KatjaGrace 30 Apr 2026 23:10 UTC

60 points

0 comments 3 min read LW link

(worldspiritsockpuppet.com)

How much should the ideal person cry wolf?

KatjaGrace 30 Apr 2026 23:10 UTC

37 points

7 comments 2 min read LW link

(worldspiritsockpuppet.com)

Cambridge: the kettle

KatjaGrace 30 Apr 2026 23:10 UTC

19 points

1 comment 4 min read LW link

(worldspiritsockpuppet.com)

AI unemployment and AI extinction are often the same

KatjaGrace 30 Apr 2026 23:10 UTC

61 points

6 comments 2 min read LW link

(worldspiritsockpuppet.com)

San Francisco: self driving

KatjaGrace 30 Apr 2026 23:10 UTC

8 points

0 comments 1 min read LW link

(worldspiritsockpuppet.com)

SFF’s HSEE grant round; human intelligence amplification projects I’d like to see

TsviBT 30 Apr 2026 21:41 UTC

33 points

0 comments 11 min read LW link

To what extent is Qwen3-32B predicting its persona?

Arjun Khandelwal, ryan_greenblatt and Alex Mallen

30 Apr 2026 21:09 UTC

85 points

3 comments 10 min read LW link

Projects that might help accelerate strong reprogenetics

TsviBT 30 Apr 2026 20:55 UTC

22 points

1 comment 12 min read LW link

Exploring the capabilities spike with METR’s time horizon data: no clear signal

Ben_Snodin 30 Apr 2026 20:54 UTC

19 points

0 comments 5 min read LW link

(www.bensnodin.com)

Alignment Faking in DeepSeek V4

Amina Keldibek 30 Apr 2026 20:23 UTC

23 points

1 comment 5 min read LW link

Upcoming Workshop on Post-AGI Civilizational Equilibria

David Duvenaud, Jan_Kulveit and Raymond Douglas

30 Apr 2026 19:51 UTC

28 points

1 comment 1 min read LW link

Cyborg evals

Eye You and frmsaul

30 Apr 2026 17:31 UTC

33 points

2 comments 5 min read LW link

AI #166: Google Sells Out

Zvi 30 Apr 2026 15:40 UTC

33 points

2 comments 55 min read LW link

(thezvi.wordpress.com)

Copycat Inkhaven 2 retrospective

Dentosal 30 Apr 2026 13:33 UTC

4 points

0 comments 1 min read LW link

Open internship position + call for collaborations on threat model-dependent alignment, governance, and offense/defense balance

otto.barten 30 Apr 2026 12:40 UTC

7 points

0 comments 1 min read LW link

Maybe I was too harsh on deep learning theory (three days ago)

LawrenceC 30 Apr 2026 6:57 UTC

109 points

13 comments 4 min read LW link

On today’s panel with Bernie Sanders

David Scott Krueger 30 Apr 2026 5:00 UTC

199 points

3 comments 2 min read LW link

(therealartificialintelligence.substack.com)

Red vs blue: The parable of the feud within a feud

Joe Rogero 30 Apr 2026 4:01 UTC

25 points

22 comments 5 min read LW link

(subatomicarticles.com)

Scaffolding vs Reinforcement Finetuning for AI Forecasting

Ram Potham 30 Apr 2026 2:51 UTC

15 points

0 comments 4 min read LW link

What Do You Mean by a Two-Year AGI Timeline?

Koby Lewis 30 Apr 2026 1:58 UTC

6 points

1 comment 1 min read LW link

No Strong Orthogonality From Selection Pressure

lumpenspace 30 Apr 2026 1:56 UTC

55 points

192 comments 10 min read LW link

Computation in Superposition: Two Handcrafted Models

RGRGRG and Kyle Ray

30 Apr 2026 0:58 UTC

17 points

0 comments 7 min read LW link

Research Sabotage in ML Codebases

egan, Vivek Hebbar and Julian Stastny

30 Apr 2026 0:26 UTC

62 points

3 comments 6 min read LW link

The fall of the theorem economy (David Bessis)

Caleb Biddulph 29 Apr 2026 19:35 UTC

32 points

8 comments 4 min read LW link

(davidbessis.substack.com)

Probe-Based Data Attribution: Surfacing and Mitigating Undesirable Behaviors in LLM Post-Training

Santiago Aranguri and frankyaoxiao

29 Apr 2026 19:30 UTC

16 points

0 comments 13 min read LW link

Book review: The Infinity Machine

PeterMcCluskey 29 Apr 2026 18:59 UTC

24 points

1 comment 6 min read LW link

Lorxus Does Budget Inkhaven Again: 04/22~04/28

Lorxus 29 Apr 2026 17:07 UTC

6 points

2 comments 3 min read LW link

(tiled-with-pentagons.blogspot.com)

Poisoning Fine-tuning Datasets of Constitutional Classifiers

Chase Bowers and Fabien Roger

29 Apr 2026 17:04 UTC

28 points

2 comments 11 min read LW link

(alignment.anthropic.com)

AGI is Probably Inevitable: A Model of Societal Ruptures

Mira Kennard 29 Apr 2026 16:00 UTC

4 points

0 comments 5 min read LW link

Final research agenda #2: first sketch of a plan

Mitchell_Porter 29 Apr 2026 15:19 UTC

22 points

0 comments 4 min read LW link

Bridging the Gap on AI Safety Policy

James Newport 29 Apr 2026 14:53 UTC

7 points

0 comments 4 min read LW link

(forum.effectivealtruism.org)

The Enneagram is a Useful Fake Framework

Gordon Seidoh Worley 29 Apr 2026 14:30 UTC

8 points

0 comments 3 min read LW link

(www.uncertainupdates.com)

The Most Important Charts In The World

Zvi 29 Apr 2026 14:10 UTC

69 points

1 comment 2 min read LW link

(thezvi.wordpress.com)

Let Kids Keep More Productivity Gains

jefftk 29 Apr 2026 14:00 UTC

66 points

5 comments 1 min read LW link

(www.jefftk.com)

Pears

TylerH 29 Apr 2026 13:17 UTC

−1 points

0 comments 4 min read LW link

Goblin Mode, 24 Hours Later

Dylan Bowman 29 Apr 2026 12:19 UTC

52 points

10 comments 4 min read LW link

Learning zero, and what SLT gets wrong about it

Dmitry Vaintrob 29 Apr 2026 6:41 UTC

37 points

6 comments 13 min read LW link

Are LLMs not getting better?

kqr 29 Apr 2026 6:27 UTC

24 points

4 comments 2 min read LW link

llm assistant personas seem increasingly incoherent (some subjective observations)

nostalgebraist 29 Apr 2026 3:53 UTC

343 points

84 comments 9 min read LW link

The AI x-risk lawsuit waiting to happen

David Scott Krueger 29 Apr 2026 3:50 UTC

12 points

0 comments 2 min read LW link

(therealartificialintelligence.substack.com)

Not a Paper: “Frontier Lab CEOs are Capable of In-Context Scheming”

LawrenceC 29 Apr 2026 3:00 UTC

226 points

8 comments 7 min read LW link

Notes on Transformer Consciousness

slavachalnev 29 Apr 2026 0:00 UTC

36 points

2 comments 2 min read LW link

SecureMaxx: A Lightweight Sequence Screening Tool for Agents

Austin Morrissey 28 Apr 2026 23:47 UTC

11 points

0 comments 8 min read LW link

Will whole brain emulation matter for the AI transition?

djbinder 28 Apr 2026 23:04 UTC

38 points

2 comments 41 min read LW link

(defensesindepth.bio)

Causal inference diary: skiing causes snow

Gretta Duleba 28 Apr 2026 22:21 UTC

28 points

2 comments 8 min read LW link

Is AI welfare work puntable?

Oscar 28 Apr 2026 21:17 UTC

15 points

2 comments 7 min read LW link

The Problem in the “Nerd Sniping” xkcd Comic

peralice 28 Apr 2026 20:40 UTC

72 points

6 comments 12 min read LW link

Comment on “Forecasting is Way Overrated, and We Should Stop Funding It”

Josh Rosenberg 28 Apr 2026 20:16 UTC

22 points

0 comments 9 min read LW link

Strategy matters when someone implements it. Astra is cultivating people to do both.

Aris and steveld

28 Apr 2026 19:58 UTC

18 points

0 comments 4 min read LW link

ML Safety Newsletter #20: AI Wellbeing, Classifier Jailbreaking and Honest Pushback Benchmarking

Alice Blair and Dan H

28 Apr 2026 19:16 UTC

16 points

0 comments 5 min read LW link