The Lazarus Project—TV Review

Commander Zander · 23 Jun 2025 23:53 UTC
6 points
0 comments · 1 min read · LW link

Frontier AI Labs: the Call Option to AGI

ykevinzhang · 23 Jun 2025 21:02 UTC
22 points
2 comments · 10 min read · LW link

The Rose Test: a fun way to feel with your guts (not just logically understand) why AI-safety matters right now (and get new adepts)

lovagrus · 23 Jun 2025 21:02 UTC
10 points
0 comments · 3 min read · LW link

Knowledge Extraction Plan (KEP): An alternative to reckless scaling

aswani · 23 Jun 2025 20:58 UTC
1 point
0 comments · 1 min read · LW link

History repeats itself: Frontier AI labs acting like Amazon in the early 2000s

i_am_nuts · 23 Jun 2025 20:56 UTC
8 points
0 comments · 3 min read · LW link
(iamnuts.substack.com)

Open Thread—Summer 2025

habryka · 23 Jun 2025 20:54 UTC
23 points
70 comments · 1 min read · LW link

Childhood and Education #10: Behaviors

Zvi · 23 Jun 2025 20:40 UTC
26 points
4 comments · 15 min read · LW link
(thezvi.wordpress.com)

Compressed Computation is (probably) not Computation in Superposition

23 Jun 2025 19:35 UTC
57 points
9 comments · 10 min read · LW link

Situational Awareness: A One-Year Retrospective

Nathan Delisle · 23 Jun 2025 19:15 UTC
82 points
4 comments · 12 min read · LW link

Recognizing Optimality

jimmy · 23 Jun 2025 17:55 UTC
20 points
0 comments · 7 min read · LW link

Comparing risk from internally-deployed AI to insider and outsider threats from humans

Buck · 23 Jun 2025 17:47 UTC
150 points
22 comments · 3 min read · LW link

Foom & Doom 2: Technical alignment is hard

Steven Byrnes · 23 Jun 2025 17:19 UTC
165 points
65 comments · 28 min read · LW link

Foom & Doom 1: “Brain in a box in a basement”

Steven Byrnes · 23 Jun 2025 17:18 UTC
282 points
120 comments · 29 min read · LW link

The Lies of Big Bug

Bentham's Bulldog · 23 Jun 2025 16:03 UTC
−4 points
2 comments · 4 min read · LW link

AI companies aren’t planning to secure critical model weights

Zach Stein-Perlman · 23 Jun 2025 16:00 UTC
15 points
0 comments · 1 min read · LW link

“It isn’t magic”

Ben (Berlin) · 23 Jun 2025 14:00 UTC
92 points
17 comments · 2 min read · LW link

Forecasting AI Forecasting

Alvin Ånestrand · 23 Jun 2025 13:39 UTC
15 points
4 comments · 6 min read · LW link

Recent progress on the science of evaluations

PabloAMC · 23 Jun 2025 9:41 UTC
14 points
1 comment · 8 min read · LW link

Racial Dating Preferences and Sexual Racism

koreindian · 23 Jun 2025 3:57 UTC
56 points
70 comments · 32 min read · LW link
(vishalblog.substack.com)

Mainstream Grantmaking Expertise (Post 7 of 7 on AI Governance)

Mass_Driver · 23 Jun 2025 1:39 UTC
56 points
7 comments · 37 min read · LW link

[Question] How does the LessWrong team generate the website illustrations?

Nina Panickssery · 23 Jun 2025 0:05 UTC
16 points
1 comment · 1 min read · LW link

The AI’s Toolbox: From Soggy Toast to Optimal Solutions

Thehumanproject.ai · 22 Jun 2025 20:54 UTC
1 point
0 comments · 8 min read · LW link

Black-box interpretability methodology blueprint: Probing runaway optimisation in LLMs

Roland Pihlakas · 22 Jun 2025 18:16 UTC
17 points
0 comments · 7 min read · LW link

The Croissant Principle: A Theory of AI Generalization

Jeffrey Liang · 22 Jun 2025 17:58 UTC
20 points
6 comments · 2 min read · LW link

Relational Design Can’t Be Left to Chance

Priyanka Bharadwaj · 22 Jun 2025 15:32 UTC
5 points
0 comments · 3 min read · LW link

Grounding to Avoid Airplane Delays

jefftk · 22 Jun 2025 1:50 UTC
30 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Open questions on compatibilist free will and subjunctive dependence

jackmastermind · 22 Jun 2025 1:15 UTC
3 points
0 comments · 1 min read · LW link
(jacktlab.substack.com)

The Sixteen Kinds of Intimacy

Ruby · 21 Jun 2025 19:59 UTC
57 points
2 comments · 5 min read · LW link

Book review: Against Method

Valdes · 21 Jun 2025 18:59 UTC
9 points
0 comments · 6 min read · LW link

Contrived evaluations are useful evaluations

pradyuprasad · 21 Jun 2025 18:18 UTC
3 points
0 comments · 3 min read · LW link
(speculativedecoding.substack.com)

Consider chilling out in 2028

Valentine · 21 Jun 2025 17:07 UTC
189 points
143 comments · 13 min read · LW link

Upcoming workshop on Post-AGI Civilizational Equilibria

21 Jun 2025 15:57 UTC
25 points
0 comments · 1 min read · LW link

Genomic emancipation

TsviBT · 21 Jun 2025 8:15 UTC
83 points
14 comments · 26 min read · LW link

Evaluating the Risk of Job Displacement by Transformative AI Automation in Developing Countries: A Case Study on Brazil

Abubakar · 21 Jun 2025 0:48 UTC
4 points
0 comments · 15 min read · LW link

Backdoor awareness and misaligned personas in reasoning models

20 Jun 2025 23:38 UTC
35 points
8 comments · 6 min read · LW link

Agentic Misalignment: How LLMs Could be Insider Threats

20 Jun 2025 22:34 UTC
83 points
13 comments · 6 min read · LW link

Clarifying “wisdom”: Foundational topics for aligned AIs to prioritize before irreversible decisions

Anthony DiGiovanni · 20 Jun 2025 21:55 UTC
40 points
2 comments · 12 min read · LW link

Are Intelligent Agents More Ethical?

PeterMcCluskey · 20 Jun 2025 21:26 UTC
13 points
7 comments · 2 min read · LW link

An AI Arms Race Scenario

shanzson · 20 Jun 2025 19:25 UTC
2 points
2 comments · 1 min read · LW link

Making deals with early schemers

20 Jun 2025 18:21 UTC
127 points
41 comments · 15 min read · LW link

Ivan Gayton: A Right and a Duty

Elizabeth · 20 Jun 2025 18:20 UTC
21 points
0 comments · 1 min read · LW link
(acesounderglass.com)

What is the functional role of SAE errors?

20 Jun 2025 18:11 UTC
12 points
6 comments · 38 min read · LW link

Musings on AI Companies of 2025-2026 (Jun 2025)

Vladimir_Nesov · 20 Jun 2025 17:14 UTC
66 points
4 comments · 3 min read · LW link

Escaping the Jungles of Norwood: A Rationalist’s Guide to Male Pattern Baldness

AlphaAndOmega · 20 Jun 2025 16:40 UTC
12 points
10 comments · 1 min read · LW link
(open.substack.com)

Prefix cache untrusted monitors: a method to apply after you catch your AI

ryan_greenblatt · 20 Jun 2025 15:56 UTC
33 points
2 comments · 7 min read · LW link

Did the Army Poison a Bunch of Women in Minnesota?

rba · 20 Jun 2025 15:33 UTC
54 points
2 comments · 4 min read · LW link

AI #121 Part 2: The OpenAI Files

Zvi · 20 Jun 2025 14:50 UTC
37 points
9 comments · 41 min read · LW link
(thezvi.wordpress.com)

Smarter Models Lie Less

Expertium · 20 Jun 2025 13:31 UTC
6 points
0 comments · 2 min read · LW link

AI Safety Communicators Meet-up

Vishakha · 20 Jun 2025 12:34 UTC
3 points
0 comments · 1 min read · LW link

X explains Z% of the variance in Y

Leon Lang · 20 Jun 2025 12:17 UTC
160 points
36 comments · 9 min read · LW link