Claude Sonnet 4.5: System Card and Alignment

Zvi · 30 Sep 2025 20:50 UTC
72 points
4 comments · 27 min read · LW link
(thezvi.wordpress.com)

Halfhaven virtual blogger camp

Viliam · 30 Sep 2025 20:22 UTC
87 points
6 comments · 2 min read · LW link

Masks: On the benefits and drawbacks of a society where everyone covering their face is the norm

3Nora · 30 Sep 2025 18:43 UTC
−3 points
1 comment · 3 min read · LW link

How reimagining the nature of consciousness entirely changes the AI game

Jáchym Fibír · 30 Sep 2025 18:30 UTC
−9 points
0 comments · 14 min read · LW link
(www.phiand.ai)

The Basic Case For Doom

Bentham's Bulldog · 30 Sep 2025 16:04 UTC
26 points
4 comments · 5 min read · LW link

AI Safety Research Futarchy: Using Prediction Markets to Choose Research Projects for MARS

JasonBrown · 30 Sep 2025 15:37 UTC
32 points
8 comments · 4 min read · LW link

ARENA 7.0 - Call for Applicants

30 Sep 2025 14:54 UTC
22 points
0 comments · 6 min read · LW link

The famous survivorship bias image is a “loose reconstruction” of methods used on a hypothetical dataset

Lao Mein · 30 Sep 2025 13:13 UTC
47 points
0 comments · 1 min read · LW link

[GDPval] Models Could Automate the U.S. Economy by 2027

bira · 30 Sep 2025 11:53 UTC
14 points
0 comments · 1 min read · LW link

Ethical Design Patterns

AnnaSalamon · 30 Sep 2025 11:52 UTC
210 points
39 comments · 20 min read · LW link

What is the Base Model Simulation of Human AI-Assistant Conversation?

bodry · 30 Sep 2025 7:08 UTC
5 points
0 comments · 21 min read · LW link

Firstpost: First impressions

Shell · 30 Sep 2025 2:23 UTC
14 points
1 comment · 1 min read · LW link

Exploration of Counterfactual Importance and Attention Heads

Realmbird · 30 Sep 2025 1:17 UTC
12 points
0 comments · 6 min read · LW link

Why Corrigibility is Hard and Important (i.e. “Whence the high MIRI confidence in alignment difficulty?”)

30 Sep 2025 0:12 UTC
80 points
52 comments · 17 min read · LW link

What SB 53, California’s new AI law, does

tlevin · 29 Sep 2025 23:29 UTC
104 points
12 comments · 4 min read · LW link

Why Most Efforts Towards “Democratic AI” Fall Short

jacobhaimes · 29 Sep 2025 20:52 UTC
2 points
0 comments · 6 min read · LW link
(www.odysseaninstitute.org)

You’re probably overestimating how well you understand Dunning-Kruger

abstractapplic · 29 Sep 2025 19:27 UTC
216 points
24 comments · 4 min read · LW link

On Dwarkesh Patel’s Podcast With Richard Sutton

Zvi · 29 Sep 2025 19:20 UTC
54 points
10 comments · 23 min read · LW link
(thezvi.wordpress.com)

Controlling the options AIs can pursue

Joe Carlsmith · 29 Sep 2025 17:23 UTC
15 points
0 comments · 35 min read · LW link

Exponential increase is the default (assuming it increases at all) [Linkpost]

Noosphere89 · 29 Sep 2025 16:13 UTC
13 points
0 comments · 2 min read · LW link
(x.com)

[Question] How does the current AI paradigm give rise to the “superagency” that IABIED is concerned with?

jchan · 29 Sep 2025 15:23 UTC
3 points
4 comments · 1 min read · LW link

AI companies’ policy advocacy (Sep 2025)

Zach Stein-Perlman · 29 Sep 2025 15:00 UTC
43 points
0 comments · 3 min read · LW link

KYC for ChatGPT? Preventing AI Harms for Youth Should Not Mean Violating Everyone Else’s Privacy Rights

Noah Weinberger · 29 Sep 2025 14:18 UTC
7 points
0 comments · 7 min read · LW link

System Level Safety Evaluations

29 Sep 2025 13:57 UTC
14 points
0 comments · 9 min read · LW link
(equilibria1.substack.com)

I have decided to stop lying to Americans about 9/11

Lao Mein · 29 Sep 2025 13:55 UTC
86 points
24 comments · 1 min read · LW link

[Retracted] Guess I Was Wrong About AIxBio Risks

J Bostock · 29 Sep 2025 11:44 UTC
62 points
7 comments · 5 min read · LW link

If Drexler Is Wrong, He May as Well Be Right

Tomás B. · 29 Sep 2025 7:00 UTC
51 points
8 comments · 2 min read · LW link

Applied Murphyjitsu Meditation

Alice Blair · 29 Sep 2025 6:31 UTC
20 points
0 comments · 3 min read · LW link

The personal intelligence I want

Rebecca Dai · 29 Sep 2025 4:09 UTC
20 points
9 comments · 8 min read · LW link
(rebeccadai.substack.com)

Why ASI Alignment Is Hard (an overview)

Yotam · 29 Sep 2025 4:05 UTC
16 points
1 comment · 25 min read · LW link

When the AI Dam Breaks: From Surveillance to Game Theory in AI Alignment

pataphor · 29 Sep 2025 4:01 UTC
5 points
7 comments · 5 min read · LW link

Yet Another IABIED Review

PeterMcCluskey · 28 Sep 2025 21:36 UTC
15 points
0 comments · 7 min read · LW link
(bayesianinvestor.com)

A non-review of “If Anyone Builds It, Everyone Dies”

boazbarak · 28 Sep 2025 17:34 UTC
125 points
50 comments · 4 min read · LW link

Transgender Sticker Fallacy

ymeskhout · 28 Sep 2025 16:54 UTC
110 points
25 comments · 7 min read · LW link
(www.ymeskhout.com)

Solving the problem of needing to give a talk

Kaj_Sotala · 28 Sep 2025 15:34 UTC
60 points
3 comments · 8 min read · LW link

Lessons from organizing a technical AI safety bootcamp

28 Sep 2025 13:48 UTC
16 points
3 comments · 16 min read · LW link

The Risk of Human Disconnection

Priyanka Bharadwaj · 28 Sep 2025 2:14 UTC
5 points
0 comments · 3 min read · LW link

A Reply to MacAskill on “If Anyone Builds It, Everyone Dies”

Rob Bensinger · 27 Sep 2025 23:03 UTC
55 points
21 comments · 17 min read · LW link

The Sensible Way Forward for AI Alignment

Davey Morse · 27 Sep 2025 21:00 UTC
−9 points
0 comments · 3 min read · LW link

Book Review: The System

Julius · 27 Sep 2025 20:49 UTC
14 points
2 comments · 16 min read · LW link
(thegreymatter.substack.com)

Learnings from AI safety course so far

boazbarak · 27 Sep 2025 18:17 UTC
103 points
5 comments · 3 min read · LW link

My Weirdest Experience Wasn’t

Bridgett Kay · 27 Sep 2025 18:01 UTC
24 points
3 comments · 3 min read · LW link
(dxmrevealed.wordpress.com)

Making sense of parameter-space decomposition

Malmesbury · 27 Sep 2025 17:37 UTC
45 points
0 comments · 19 min read · LW link

AI Safety Field Growth Analysis 2025

Stephen McAleese · 27 Sep 2025 17:03 UTC
29 points
13 comments · 3 min read · LW link

2025 Petrov day speech

nick lacombe · 27 Sep 2025 15:07 UTC
9 points
0 comments · 1 min read · LW link
(nikthink.net)

LLMs Suck at Deep Thinking Part 3 - Trying to Prove It (fixed)

Taylor G. Lunt · 27 Sep 2025 14:54 UTC
17 points
6 comments · 15 min read · LW link

Our Beloved Monsters

Tomás B. · 27 Sep 2025 13:25 UTC
71 points
4 comments · 11 min read · LW link

Ranking the endgames of AI development

Sean Herrington · 27 Sep 2025 11:47 UTC
17 points
4 comments · 5 min read · LW link

An N=1 observational study on interpretability of Natural General Intelligence (NGI)

dr_s · 27 Sep 2025 9:28 UTC
12 points
3 comments · 6 min read · LW link

Day #14 Hunger Strike, on livestream, in protest of Superintelligent AI

samuelshadrach · 27 Sep 2025 9:16 UTC
2 points
0 comments · 2 min read · LW link