The quotation mark

Maxwell Peterson · 5 Oct 2025 23:23 UTC
21 points
8 comments · 13 min read · LW link

The Sadism Spectrum and How to Access It

Dawn Drescher · 5 Oct 2025 23:09 UTC
14 points
2 comments · 20 min read · LW link
(impartial-priorities.org)

Maybe social media algorithms don’t suck

Algon · 5 Oct 2025 18:47 UTC
70 points
25 comments · 3 min read · LW link

Base64Bench: How good are LLMs at base64, and why care about it?

richbc · 5 Oct 2025 18:07 UTC
39 points
10 comments · 11 min read · LW link

[Question] What can Canadians do to help end the AI arms race?

Tom938 · 5 Oct 2025 18:03 UTC
8 points
7 comments · 2 min read · LW link

17 years old, self-taught state control—looking for people who actually get this

Cornelius Caspian · 5 Oct 2025 18:02 UTC
−3 points
3 comments · 1 min read · LW link

Behavior Best-of-N achieves Near Human Performance on Computer Tasks

Baybar · 5 Oct 2025 16:53 UTC
6 points
0 comments · 3 min read · LW link

Accelerating AI Safety Progress via Technical Methods - Calling Researchers, Founders, and Funders

Martin Leitgab · 5 Oct 2025 16:40 UTC
1 point
0 comments · 1 min read · LW link

Mini-Symposium on Accelerating AI Safety Progress via Technical Methods—Hybrid In-Person and Virtual

Martin Leitgab · 5 Oct 2025 16:05 UTC
1 point
0 comments · 1 min read · LW link

[Question] How likely are “s-risks” (large-scale suffering outcomes) from unaligned AI compared to extinction risks?

CanYouFeelTheBenefits · 5 Oct 2025 14:38 UTC
15 points
2 comments · 1 min read · LW link

LLMs are badly misaligned

Joe Rogero · 5 Oct 2025 14:00 UTC
27 points
25 comments · 3 min read · LW link

The Counterfactual Quiet AGI Timeline

Davidmanheim · 5 Oct 2025 9:09 UTC
71 points
5 comments · 9 min read · LW link

AISafety.com Reading Group session 328

Søren Elverlin · 5 Oct 2025 7:51 UTC
5 points
0 comments · 1 min read · LW link

How the NanoGPT Speedrun WR dropped by 20% in 3 months

larry-dial · 5 Oct 2025 1:05 UTC
54 points
9 comments · 9 min read · LW link

a quick thought about AI alignment

foodforthought · 5 Oct 2025 0:51 UTC
10 points
4 comments · 1 min read · LW link

Making Your Pain Worse can Get You What You Want

Logan Riggs · 5 Oct 2025 0:19 UTC
87 points
5 comments · 3 min read · LW link

Markets in Democracy: What happens when you can sell your vote?

Mike Evron · 4 Oct 2025 23:59 UTC
4 points
21 comments · 3 min read · LW link

$250 bounties for the best short stories set in our near future world & Brooklyn event to select them

Ramon Gonzalez · 4 Oct 2025 22:49 UTC
10 points
0 comments · 2 min read · LW link

What I’ve Learnt About How to Sleep

Algon · 4 Oct 2025 20:52 UTC
29 points
8 comments · 2 min read · LW link

The ‘Magic’ of LLMs: The Function of Language

Joseph Banks · 4 Oct 2025 17:45 UTC
13 points
0 comments · 7 min read · LW link

Open Philanthropy’s Biosecurity and Pandemic Preparedness Team Is Hiring and Seeking New Grantees

miriam.hinthorn · 4 Oct 2025 17:42 UTC
3 points
0 comments · 1 min read · LW link

Consider Small Walks at Work

Morpheus · 4 Oct 2025 11:53 UTC
10 points
0 comments · 3 min read · LW link

Where does Sonnet 4.5’s desire to “not get too comfortable” come from?

Kaj_Sotala · 4 Oct 2025 10:19 UTC
103 points
23 comments · 64 min read · LW link

Munk Debate on AI: a few observations and opinions

[deactivated] · 4 Oct 2025 0:24 UTC
2 points
0 comments · 1 min read · LW link

A Workflow for System Prompted Model Organisms

michaelwaves · 3 Oct 2025 21:39 UTC
1 point
0 comments · 3 min read · LW link

Goodness is harder to achieve than competence

Joe Rogero · 3 Oct 2025 21:32 UTC
22 points
0 comments · 3 min read · LW link

Memory Decoding Journal Club: Connectomic traces of Hebbian plasticity in the entorhinal-hippocampal system

Devin Ward · 3 Oct 2025 21:24 UTC
1 point
0 comments · 1 min read · LW link

Good is a smaller target than smart

Joe Rogero · 3 Oct 2025 21:04 UTC
21 points
0 comments · 2 min read · LW link

Making Sense of Consciousness Part 6: Perceptions of Disembodiment

sarahconstantin · 3 Oct 2025 20:40 UTC
27 points
0 comments · 8 min read · LW link
(sarahconstantin.substack.com)

Recent AI Experiences

abramdemski · 3 Oct 2025 19:32 UTC
58 points
5 comments · 6 min read · LW link

Our Experience Running Independent Evaluations on LLMs: What Have We Learned?

MAlvarado · 3 Oct 2025 18:26 UTC
7 points
1 comment · 5 min read · LW link

Do One New Thing A Day To Solve Your Problems

Algon · 3 Oct 2025 17:08 UTC
208 points
28 comments · 2 min read · LW link

ENAIS is looking for an Executive Director (apply by 20th October)

3 Oct 2025 15:29 UTC
16 points
0 comments · 2 min read · LW link

Anthropic’s JumpReLU training method is really good

3 Oct 2025 15:23 UTC
39 points
2 comments · 2 min read · LW link

Sora and The Big Bright Screen Slop Machine

Zvi · 3 Oct 2025 11:40 UTC
42 points
1 comment · 35 min read · LW link
(thezvi.wordpress.com)

We’ve automated x-risk-pilling people

Mikhail Samin · 3 Oct 2025 10:26 UTC
51 points
34 comments · 1 min read · LW link
(whycare.aisgf.us)

Open Thread Autumn 2025

kave · 3 Oct 2025 5:32 UTC
20 points
97 comments · 1 min read · LW link

Memory Decoding Journal Club: Connectomic traces of Hebbian plasticity in the entorhinal-hippocampal system

Devin Ward · 3 Oct 2025 5:13 UTC
1 point
0 comments · 1 min read · LW link

Prompting Myself: Maybe it’s not a damn platitude?

CstineSublime · 3 Oct 2025 2:28 UTC
9 points
2 comments · 1 min read · LW link

IABIED and Memetic Engineering

Error · 3 Oct 2025 1:01 UTC
49 points
5 comments · 4 min read · LW link

Antisocial media: AI’s killer app?

David Scott Krueger (formerly: capybaralet) · 3 Oct 2025 0:00 UTC
35 points
8 comments · 5 min read · LW link
(therealartificialintelligence.substack.com)

Omelas Is Perfectly Misread

Tobias H · 2 Oct 2025 23:11 UTC
221 points
59 comments · 5 min read · LW link

Journalism about game theory could advance AI safety quickly

Chris Santos-Lang · 2 Oct 2025 23:05 UTC
8 points
0 comments · 3 min read · LW link
(arxiv.org)

In which the author is struck by an electric couplet

Algon · 2 Oct 2025 21:46 UTC
10 points
5 comments · 2 min read · LW link

Nice-ish, smooth takeoff (with imperfect safeguards) probably kills most “classic humans” in a few decades.

Raemon · 2 Oct 2025 21:03 UTC
155 points
19 comments · 12 min read · LW link

Eliciting secret knowledge from language models

2 Oct 2025 20:57 UTC
68 points
3 comments · 2 min read · LW link
(arxiv.org)

The Four Pillars: A Hypothesis for Countering Catastrophic Biological Risk

ASB · 2 Oct 2025 20:20 UTC
9 points
0 comments · 14 min read · LW link
(defensesindepth.bio)

AI Risk: Can We Thread the Needle? [Recorded Talk from EA Summit Vancouver ’25]

Evan R. Murphy · 2 Oct 2025 19:08 UTC
6 points
0 comments · 2 min read · LW link

Checking in on AI-2027

Baybar · 2 Oct 2025 18:46 UTC
128 points
22 comments · 4 min read · LW link

Prompt Framing Changes LLM Performance (and Safety)

Kilian Merkelbach · 2 Oct 2025 18:29 UTC
5 points
0 comments · 7 min read · LW link