Why’s equality in logic less flexible than in category theory?

Algon · 1 Oct 2025 22:03 UTC
17 points
24 comments · 3 min read · LW link

[Linkpost] A Field Guide to Writing Styles

Linch · 1 Oct 2025 21:49 UTC
17 points
0 comments · 17 min read · LW link
(linch.substack.com)

</rant> </uncharitable> </psychologizing>

Raemon · 1 Oct 2025 21:20 UTC
53 points
11 comments · 2 min read · LW link

How I think about alignment and ethics as a cooperation protocol software

Burny · 1 Oct 2025 21:09 UTC
3 points
0 comments · 1 min read · LW link

Introducing the Mox Guest Program

1 Oct 2025 18:35 UTC
11 points
0 comments · 2 min read · LW link
(moxsf.com)

The Problem of the Concentration of Power

hazem · 1 Oct 2025 18:13 UTC
−5 points
2 comments · 2 min read · LW link

Claude Sonnet 4.5 Is A Very Good Model

Zvi · 1 Oct 2025 18:00 UTC
40 points
2 comments · 24 min read · LW link
(thezvi.wordpress.com)

My Brush with Superhuman Persuasion

Ben S. · 1 Oct 2025 17:50 UTC
18 points
13 comments · 9 min read · LW link
(thebsdetector.substack.com)

AI and Cheap Weapons

Felix C. · 1 Oct 2025 17:31 UTC
31 points
3 comments · 23 min read · LW link

But what kind of stuff can you just do?

Bastiaan · 1 Oct 2025 16:58 UTC
25 points
5 comments · 1 min read · LW link

AI Safety at the Frontier: Paper Highlights, September ’25

gasteigerjo · 1 Oct 2025 16:24 UTC
5 points
0 comments · 6 min read · LW link
(aisafetyfrontier.substack.com)

Uncertain Updates: September 2025

Gordon Seidoh Worley · 1 Oct 2025 14:50 UTC
11 points
0 comments · 1 min read · LW link
(uncertainupdates.substack.com)

[CS2881r] Optimizing Prompts with Reinforcement Learning

1 Oct 2025 14:02 UTC
2 points
0 comments · 5 min read · LW link

“Pessimization” is Just Ordinary Failure

J Bostock · 1 Oct 2025 13:48 UTC
56 points
2 comments · 6 min read · LW link

Beyond the Zombie Argument

James Diacoumis · 1 Oct 2025 13:16 UTC
7 points
23 comments · 2 min read · LW link
(jamesdiacoumis.substack.com)

Against the Inevitability of Habituation to Continuous Bliss

CanYouFeelTheBenefits · 1 Oct 2025 12:12 UTC
8 points
0 comments · 1 min read · LW link

Lectures on statistical learning theory for alignment researchers

Vanessa Kosoy · 1 Oct 2025 8:36 UTC
41 points
1 comment · 1 min read · LW link
(www.youtube.com)

Claude Sonnet 4.5: System Card and Alignment

Zvi · 30 Sep 2025 20:50 UTC
72 points
4 comments · 27 min read · LW link
(thezvi.wordpress.com)

Halfhaven virtual blogger camp

Viliam · 30 Sep 2025 20:22 UTC
87 points
6 comments · 2 min read · LW link

Masks: On the benefits and drawbacks of a society where everyone covering their face is the norm

3Nora · 30 Sep 2025 18:43 UTC
−3 points
1 comment · 3 min read · LW link

How reimagining the nature of consciousness entirely changes the AI game

Jáchym Fibír · 30 Sep 2025 18:30 UTC
−9 points
0 comments · 14 min read · LW link
(www.phiand.ai)

The Basic Case For Doom

Bentham's Bulldog · 30 Sep 2025 16:04 UTC
26 points
4 comments · 5 min read · LW link

AI Safety Research Futarchy: Using Prediction Markets to Choose Research Projects for MARS

JasonBrown · 30 Sep 2025 15:37 UTC
32 points
8 comments · 4 min read · LW link

ARENA 7.0 - Call for Applicants

30 Sep 2025 14:54 UTC
22 points
0 comments · 6 min read · LW link

The famous survivorship bias image is a “loose reconstruction” of methods used on a hypothetical dataset

Lao Mein · 30 Sep 2025 13:13 UTC
47 points
0 comments · 1 min read · LW link

[GDPval] Models Could Automate the U.S. Economy by 2027

bira · 30 Sep 2025 11:53 UTC
14 points
0 comments · 1 min read · LW link

Ethical Design Patterns

AnnaSalamon · 30 Sep 2025 11:52 UTC
210 points
39 comments · 20 min read · LW link

What is the Base Model Simulation of Human AI-Assistant Conversation?:

bodry · 30 Sep 2025 7:08 UTC
5 points
0 comments · 21 min read · LW link

Firstpost: First impressions

Shell · 30 Sep 2025 2:23 UTC
14 points
1 comment · 1 min read · LW link

Exploration of Counterfactual Importance and Attention Heads

Realmbird · 30 Sep 2025 1:17 UTC
12 points
0 comments · 6 min read · LW link

Why Corrigibility is Hard and Important (i.e. “Whence the high MIRI confidence in alignment difficulty?”)

30 Sep 2025 0:12 UTC
80 points
52 comments · 17 min read · LW link

What SB 53, California’s new AI law, does

tlevin · 29 Sep 2025 23:29 UTC
104 points
12 comments · 4 min read · LW link

Why Most Efforts Towards “Democratic AI” Fall Short

jacobhaimes · 29 Sep 2025 20:52 UTC
2 points
0 comments · 6 min read · LW link
(www.odysseaninstitute.org)

You’re probably overestimating how well you understand Dunning-Kruger

abstractapplic · 29 Sep 2025 19:27 UTC
216 points
24 comments · 4 min read · LW link

On Dwarkesh Patel’s Podcast With Richard Sutton

Zvi · 29 Sep 2025 19:20 UTC
54 points
10 comments · 23 min read · LW link
(thezvi.wordpress.com)

Controlling the options AIs can pursue

Joe Carlsmith · 29 Sep 2025 17:23 UTC
15 points
0 comments · 35 min read · LW link

Exponential increase is the default (assuming it increases at all) [Linkpost]

Noosphere89 · 29 Sep 2025 16:13 UTC
13 points
0 comments · 2 min read · LW link
(x.com)

[Question] How does the current AI paradigm give rise to the “superagency” that IABIED is concerned with?

jchan · 29 Sep 2025 15:23 UTC
3 points
4 comments · 1 min read · LW link

AI companies’ policy advocacy (Sep 2025)

Zach Stein-Perlman · 29 Sep 2025 15:00 UTC
43 points
0 comments · 3 min read · LW link

KYC for ChatGPT? Preventing AI Harms for Youth Should Not Mean Violating Everyone Else’s Privacy Rights

Noah Weinberger · 29 Sep 2025 14:18 UTC
7 points
0 comments · 7 min read · LW link

System Level Safety Evaluations

29 Sep 2025 13:57 UTC
14 points
0 comments · 9 min read · LW link
(equilibria1.substack.com)

I have decided to stop lying to Americans about 9/11

Lao Mein · 29 Sep 2025 13:55 UTC
86 points
24 comments · 1 min read · LW link

[Retracted] Guess I Was Wrong About AIxBio Risks

J Bostock · 29 Sep 2025 11:44 UTC
62 points
7 comments · 5 min read · LW link

If Drexler Is Wrong, He May as Well Be Right

Tomás B. · 29 Sep 2025 7:00 UTC
51 points
8 comments · 2 min read · LW link

Applied Murphyjitsu Meditation

Alice Blair · 29 Sep 2025 6:31 UTC
20 points
0 comments · 3 min read · LW link

The personal intelligence I want

Rebecca Dai · 29 Sep 2025 4:09 UTC
20 points
9 comments · 8 min read · LW link
(rebeccadai.substack.com)

Why ASI Alignment Is Hard (an overview)

Yotam · 29 Sep 2025 4:05 UTC
16 points
1 comment · 25 min read · LW link

When the AI Dam Breaks: From Surveillance to Game Theory in AI Alignment

pataphor · 29 Sep 2025 4:01 UTC
5 points
7 comments · 5 min read · LW link

Yet Another IABIED Review

PeterMcCluskey · 28 Sep 2025 21:36 UTC
15 points
0 comments · 7 min read · LW link
(bayesianinvestor.com)

A non-review of “If Anyone Builds It, Everyone Dies”

boazbarak · 28 Sep 2025 17:34 UTC
125 points
50 comments · 4 min read · LW link