A Workflow for System Prompted Model Organisms

michaelwaves · 3 Oct 2025 21:39 UTC
1 point
0 comments · 3 min read

Goodness is harder to achieve than competence

Joe Rogero · 3 Oct 2025 21:32 UTC
22 points
0 comments · 3 min read

Memory Decoding Journal Club: Connectomic traces of Hebbian plasticity in the entorhinal-hippocampal system

Devin Ward · 3 Oct 2025 21:24 UTC
1 point
0 comments · 1 min read

Good is a smaller target than smart

Joe Rogero · 3 Oct 2025 21:04 UTC
21 points
0 comments · 2 min read

Making Sense of Consciousness Part 6: Perceptions of Disembodiment

sarahconstantin · 3 Oct 2025 20:40 UTC
27 points
0 comments · 8 min read
(sarahconstantin.substack.com)

Recent AI Experiences

abramdemski · 3 Oct 2025 19:32 UTC
54 points
1 comment · 6 min read

Our Experience Running Independent Evaluations on LLMs: What Have We Learned?

MAlvarado · 3 Oct 2025 18:26 UTC
7 points
1 comment · 5 min read

Do One New Thing A Day To Solve Your Problems

Algon · 3 Oct 2025 17:08 UTC
102 points
5 comments · 2 min read

ENAIS is looking for an Executive Director (apply by 20th October)

3 Oct 2025 15:29 UTC
11 points
0 comments · 2 min read

Anthropic’s JumpReLU training method is really good

3 Oct 2025 15:23 UTC
22 points
0 comments · 2 min read

Sora and The Big Bright Screen Slop Machine

Zvi · 3 Oct 2025 11:40 UTC
38 points
1 comment · 35 min read
(thezvi.wordpress.com)

We’ve automated x-risk-pilling people

Mikhail Samin · 3 Oct 2025 10:26 UTC
51 points
27 comments · 1 min read
(whycare.aisgf.us)

Open Thread Autumn 2025

kave · 3 Oct 2025 5:32 UTC
16 points
29 comments · 1 min read

Memory Decoding Journal Club: Connectomic traces of Hebbian plasticity in the entorhinal-hippocampal system

Devin Ward · 3 Oct 2025 5:13 UTC
1 point
0 comments · 1 min read

Prompting Myself: Maybe it’s not a damn platitude?

CstineSublime · 3 Oct 2025 2:28 UTC
9 points
1 comment · 1 min read

IABIED and Memetic Engineering

Error · 3 Oct 2025 1:01 UTC
47 points
5 comments · 4 min read

Antisocial media: AI’s killer app?

David Scott Krueger (formerly: capybaralet) · 3 Oct 2025 0:00 UTC
35 points
8 comments · 5 min read
(therealartificialintelligence.substack.com)

Omelas Is Perfectly Misread

Tobias H · 2 Oct 2025 23:11 UTC
197 points
49 comments · 5 min read

Journalism about game theory could advance AI safety quickly

Chris Santos-Lang · 2 Oct 2025 23:05 UTC
4 points
0 comments · 3 min read
(arxiv.org)

In which the author is struck by an electric couplet

Algon · 2 Oct 2025 21:46 UTC
10 points
5 comments · 2 min read

Nice-ish, smooth takeoff (with imperfect safeguards) probably kills most “classic humans” in a few decades.

Raemon · 2 Oct 2025 21:03 UTC
143 points
19 comments · 12 min read

Eliciting secret knowledge from language models

2 Oct 2025 20:57 UTC
67 points
3 comments · 2 min read
(arxiv.org)

The Four Pillars: A Hypothesis for Countering Catastrophic Biological Risk

ASB · 2 Oct 2025 20:20 UTC
8 points
0 comments · 14 min read
(defensesindepth.bio)

AI Risk: Can We Thread the Needle? [Recorded Talk from EA Summit Vancouver ’25]

Evan R. Murphy · 2 Oct 2025 19:08 UTC
6 points
0 comments · 2 min read

Checking in on AI-2027

Baybar · 2 Oct 2025 18:46 UTC
119 points
21 comments · 4 min read

Prompt Framing Changes LLM Performance (and Safety)

Kilian Merkelbach · 2 Oct 2025 18:29 UTC
4 points
0 comments · 7 min read

No, That’s Not What the Flight Costs

Max Niederman · 2 Oct 2025 17:55 UTC
45 points
15 comments · 1 min read
(maxniederman.com)

Why the Struggle for Safe AI Must Be Political

2 Oct 2025 16:38 UTC
−6 points
0 comments · 8 min read

Why AI Caste bias is more Dangerous than you think

shanzson · 2 Oct 2025 16:36 UTC
0 points
1 comment · 6 min read

Homo sapiens and homo silicus

2 Oct 2025 16:33 UTC
6 points
0 comments · 3 min read

How to Feel More Alive

Logan Riggs · 2 Oct 2025 15:45 UTC
47 points
2 comments · 4 min read

AI and Biological Risk: Forecasting Key Capability Thresholds

Alvin Ånestrand · 2 Oct 2025 14:06 UTC
7 points
4 comments · 11 min read
(forecastingaifutures.substack.com)

AI #136: A Song and Dance

Zvi · 2 Oct 2025 13:10 UTC
34 points
3 comments · 47 min read
(thezvi.wordpress.com)

Some Biology Related Things I Found Interesting

Morpheus · 2 Oct 2025 12:18 UTC
37 points
9 comments · 2 min read

Random safe AGI idea dump

sig · 2 Oct 2025 10:16 UTC
−3 points
0 comments · 3 min read

How likely are “s-risks” (large-scale suffering outcomes) from unaligned AI compared to extinction risks?

CanYouFeelTheBenefits · 2 Oct 2025 10:02 UTC
5 points
0 comments · 1 min read

Are we an ASI thought experiment?

Amy Rose Vossberg · 2 Oct 2025 1:43 UTC
−6 points
8 comments · 1 min read

Why’s equality in logic less flexible than in category theory?

Algon · 1 Oct 2025 22:03 UTC
17 points
24 comments · 3 min read

[Linkpost] A Field Guide to Writing Styles

Linch · 1 Oct 2025 21:49 UTC
17 points
0 comments · 17 min read
(linch.substack.com)

</rant> </uncharitable> </psychologizing>

Raemon · 1 Oct 2025 21:20 UTC
53 points
11 comments · 2 min read

How I think about alignment and ethics as a cooperation protocol software

Burny · 1 Oct 2025 21:09 UTC
3 points
0 comments · 1 min read

Introducing the Mox Guest Program

1 Oct 2025 18:35 UTC
11 points
0 comments · 2 min read
(moxsf.com)

The Problem of the Concentration of Power

hazem · 1 Oct 2025 18:13 UTC
−5 points
2 comments · 2 min read

Claude Sonnet 4.5 Is A Very Good Model

Zvi · 1 Oct 2025 18:00 UTC
40 points
2 comments · 24 min read
(thezvi.wordpress.com)

My Brush with Superhuman Persuasion

Ben S. · 1 Oct 2025 17:50 UTC
18 points
13 comments · 9 min read
(thebsdetector.substack.com)

AI and Cheap Weapons

Felix C. · 1 Oct 2025 17:31 UTC
31 points
3 comments · 23 min read

But what kind of stuff can you just do?

Bastiaan · 1 Oct 2025 16:58 UTC
25 points
5 comments · 1 min read

AI Safety at the Frontier: Paper Highlights, September ’25

gasteigerjo · 1 Oct 2025 16:24 UTC
5 points
0 comments · 6 min read
(aisafetyfrontier.substack.com)

Uncertain Updates: September 2025

Gordon Seidoh Worley · 1 Oct 2025 14:50 UTC
11 points
0 comments · 1 min read
(uncertainupdates.substack.com)

[CS2881r] Optimizing Prompts with Reinforcement Learning

1 Oct 2025 14:02 UTC
2 points
0 comments · 5 min read