Don’t Mock Yourself

Algon · 12 Oct 2025 22:40 UTC
163 points
18 comments · 2 min read · LW link

Experiment: Test your priors on Bernoulli processes.

joseph_c · 12 Oct 2025 22:09 UTC
20 points
15 comments · 1 min read · LW link

The Problem of Consciousness and AI as an Ethical Subject

Nicolas Villarreal · 12 Oct 2025 18:30 UTC
−5 points
0 comments · 14 min read · LW link

Dr Evil & Realpolitik

James Stephen Brown · 12 Oct 2025 17:30 UTC
16 points
0 comments · 5 min read · LW link
(nonzerosum.games)

How do we know when something is deserving of welfare?

Dom Polsinelli · 12 Oct 2025 16:27 UTC
11 points
7 comments · 4 min read · LW link

The Narcissistic Spectrum

Dawn Drescher · 12 Oct 2025 15:46 UTC
32 points
0 comments · 22 min read · LW link
(impartial-priorities.org)

Non-copyability as a security feature

tailcalled · 12 Oct 2025 9:03 UTC
16 points
4 comments · 1 min read · LW link

International Programme on AI Evaluations

PabloAMC · 12 Oct 2025 7:12 UTC
3 points
0 comments · 2 min read · LW link

The Alignment Problem Isn’t Theoretical

Austin Morrissey · 12 Oct 2025 3:49 UTC
0 points
1 comment · 14 min read · LW link

If a Lioness Could Speak

Taylor G. Lunt · 12 Oct 2025 3:43 UTC
−1 points
0 comments · 2 min read · LW link

Designing for perpetual control

Remmelt · 12 Oct 2025 2:06 UTC
1 point
11 comments · 2 min read · LW link

“Naive Consequentialism” as a Thought-Terminating cliche

Jacob Goldsmith · 12 Oct 2025 0:54 UTC
−3 points
0 comments · 3 min read · LW link

[Question] How long do AI companies have to achieve significant capability gains before funding collapses?

Hide · 11 Oct 2025 23:20 UTC
41 points
8 comments · 1 min read · LW link

I wasn’t confused by Thermodynamics

Algon · 11 Oct 2025 22:20 UTC
26 points
4 comments · 2 min read · LW link

Subscribe to my Inkhaven feed!

Alex_Altair · 11 Oct 2025 20:41 UTC
21 points
3 comments · 2 min read · LW link

The Most Common Bad Argument In These Parts

J Bostock · 11 Oct 2025 16:29 UTC
243 points
61 comments · 4 min read · LW link

Experiments With Sonnet 4.5’s Fiction

Tomás B. · 11 Oct 2025 15:17 UTC
63 points
30 comments · 5 min read · LW link

Letter to Heads of AI labs

samuelshadrach · 11 Oct 2025 7:43 UTC
−1 points
2 comments · 2 min read · LW link

Emil the Moose

Martin Sustrik · 11 Oct 2025 6:11 UTC
49 points
1 comment · 1 min read · LW link
(www.250bpm.com)

Using complex polynomials to approximate arbitrary continuous functions

Joseph Van Name · 11 Oct 2025 4:06 UTC
5 points
2 comments · 5 min read · LW link

What does it feel like to understand?

Algon · 10 Oct 2025 22:50 UTC
20 points
5 comments · 5 min read · LW link

The 5 Obstacles I Had to Overcome to Become Vegan

David Bravo · 10 Oct 2025 18:34 UTC
5 points
8 comments · 7 min read · LW link

2025 State of AI Report and Predictions

Zvi · 10 Oct 2025 17:30 UTC
28 points
4 comments · 9 min read · LW link
(thezvi.wordpress.com)

Applications Open for a Weekend Exploring Civilisational Sanity [DEADLINE EXTENDED]

10 Oct 2025 16:26 UTC
26 points
0 comments · 4 min read · LW link

Maybe Use BioLMs To Mitigate Pre-ASI Biorisk?

J Bostock · 10 Oct 2025 16:25 UTC
18 points
7 comments · 4 min read · LW link

The statement “IABIED” is true even if the book IABIED is mostly false

Ihor Kendiukhov · 10 Oct 2025 15:13 UTC
11 points
2 comments · 2 min read · LW link

Why Future AIs will Require New Alignment Methods

Alvin Ånestrand · 10 Oct 2025 14:27 UTC
17 points
7 comments · 5 min read · LW link
(forecastingaifutures.substack.com)

Iterated Development and Study of Schemers (IDSS)

ryan_greenblatt · 10 Oct 2025 14:17 UTC
41 points
1 comment · 8 min read · LW link

Materialist Semiotics and the Nature of Qualia

Nicolas Villarreal · 10 Oct 2025 13:08 UTC
−1 points
16 comments · 7 min read · LW link

Patience and Willingness to Be Slow

Morpheus · 10 Oct 2025 12:10 UTC
22 points
3 comments · 6 min read · LW link

We won’t get docile, brilliant AIs before we solve alignment

Joe Rogero · 10 Oct 2025 4:11 UTC
7 points
3 comments · 3 min read · LW link

Labs lack the tools to course-correct

Joe Rogero · 10 Oct 2025 4:10 UTC
4 points
0 comments · 3 min read · LW link

The Liberty Tractor

Taylor G. Lunt · 10 Oct 2025 0:52 UTC
−4 points
0 comments · 9 min read · LW link

Assuring Agent Safety Evaluations By Analysing Transcripts

10 Oct 2025 0:42 UTC
7 points
0 comments · 15 min read · LW link

At odds with the unavoidable meta-message

Ruby · 10 Oct 2025 0:13 UTC
58 points
22 comments · 4 min read · LW link

Stars are a rounding error

Algon · 9 Oct 2025 23:35 UTC
67 points
19 comments · 3 min read · LW link

Towards a Typology of Strange LLM Chains-of-Thought

1a3orn · 9 Oct 2025 22:02 UTC
301 points
29 comments · 9 min read · LW link

Training Qwen-1.5B with a CoT legibility penalty

Fabien Roger · 9 Oct 2025 21:33 UTC
68 points
7 comments · 4 min read · LW link

Interview with a drone expert on the future of AI warfare

9 Oct 2025 20:16 UTC
33 points
0 comments · 25 min read · LW link
(blog.sentinel-team.org)

Investigating Neural Scaling Laws Emerging from Deep Data Structure

9 Oct 2025 20:11 UTC
4 points
0 comments · 8 min read · LW link

I take antidepressants. You’re welcome

Elizabeth · 9 Oct 2025 19:30 UTC
258 points
11 comments · 3 min read · LW link
(acesounderglass.com)

Training fails to elicit subtle reasoning in current language models

9 Oct 2025 19:04 UTC
49 points
3 comments · 4 min read · LW link
(alignment.anthropic.com)

Realistic Reward Hacking Induces Different and Deeper Misalignment

Jozdien · 9 Oct 2025 18:45 UTC
143 points
2 comments · 23 min read · LW link

Why am I not currently starting a religion around AI or similar topics?

samuelshadrach · 9 Oct 2025 18:31 UTC
8 points
2 comments · 18 min read · LW link
(samuelshadrach.com)

How we’ll make all world leaders work together to make the world better (Expert-approved idea)

Wes R · 9 Oct 2025 18:30 UTC
−3 points
4 comments · 3 min read · LW link

The Underexplored Prospects of Benevolent Superintelligences—PART 1: THE WISE, THE GOOD, THE POWERFUL

Jesper L. · 9 Oct 2025 17:49 UTC
3 points
7 comments · 25 min read · LW link

“Yes, and—” Requires the Possibility of “No, Because—”

Zack_M_Davis · 9 Oct 2025 17:39 UTC
32 points
4 comments · 3 min read · LW link
(zackmdavis.net)

Four Questions to Refine Your Policy Proposal

Mass_Driver · 9 Oct 2025 16:30 UTC
10 points
2 comments · 6 min read · LW link

A Snippet On The Epistemically Hygienic Containment Of Faith-In-Reason-Itself

JenniferRM · 9 Oct 2025 16:19 UTC
10 points
0 comments · 1 min read · LW link

Alignment progress doesn’t compensate for higher capabilities

Joe Rogero · 9 Oct 2025 16:06 UTC
2 points
0 comments · 6 min read · LW link