What does it feel like to understand?

Algon · 10 Oct 2025 22:50 UTC
20 points
5 comments · 5 min read · LW link

The 5 Obstacles I Had to Overcome to Become Vegan

David Bravo · 10 Oct 2025 18:34 UTC
5 points
8 comments · 7 min read · LW link

2025 State of AI Report and Predictions

Zvi · 10 Oct 2025 17:30 UTC
28 points
4 comments · 9 min read · LW link
(thezvi.wordpress.com)

Applications Open for a Weekend Exploring Civilisational Sanity [DEADLINE EXTENDED]

10 Oct 2025 16:26 UTC
26 points
0 comments · 4 min read · LW link

Maybe Use BioLMs To Mitigate Pre-ASI Biorisk?

J Bostock · 10 Oct 2025 16:25 UTC
18 points
7 comments · 4 min read · LW link

The statement “IABIED” is true even if the book IABIED is mostly false

Ihor Kendiukhov · 10 Oct 2025 15:13 UTC
11 points
2 comments · 2 min read · LW link

Why Future AIs will Require New Alignment Methods

Alvin Ånestrand · 10 Oct 2025 14:27 UTC
17 points
7 comments · 5 min read · LW link
(forecastingaifutures.substack.com)

Iterated Development and Study of Schemers (IDSS)

ryan_greenblatt · 10 Oct 2025 14:17 UTC
41 points
1 comment · 8 min read · LW link

Materialist Semiotics and the Nature of Qualia

Nicolas Villarreal · 10 Oct 2025 13:08 UTC
−1 points
16 comments · 7 min read · LW link

Patience and Willingness to Be Slow

Morpheus · 10 Oct 2025 12:10 UTC
22 points
3 comments · 6 min read · LW link

We won’t get docile, brilliant AIs before we solve alignment

Joe Rogero · 10 Oct 2025 4:11 UTC
7 points
3 comments · 3 min read · LW link

Labs lack the tools to course-correct

Joe Rogero · 10 Oct 2025 4:10 UTC
4 points
0 comments · 3 min read · LW link

The Liberty Tractor

Taylor G. Lunt · 10 Oct 2025 0:52 UTC
−4 points
0 comments · 9 min read · LW link

Assuring Agent Safety Evaluations By Analysing Transcripts

10 Oct 2025 0:42 UTC
7 points
0 comments · 15 min read · LW link

At odds with the unavoidable meta-message

Ruby · 10 Oct 2025 0:13 UTC
58 points
22 comments · 4 min read · LW link

Stars are a rounding error

Algon · 9 Oct 2025 23:35 UTC
67 points
19 comments · 3 min read · LW link

Towards a Typology of Strange LLM Chains-of-Thought

1a3orn · 9 Oct 2025 22:02 UTC
301 points
29 comments · 9 min read · LW link

Training Qwen-1.5B with a CoT legibility penalty

Fabien Roger · 9 Oct 2025 21:33 UTC
68 points
7 comments · 4 min read · LW link

Interview with a drone expert on the future of AI warfare

9 Oct 2025 20:16 UTC
33 points
0 comments · 25 min read · LW link
(blog.sentinel-team.org)

Investigating Neural Scaling Laws Emerging from Deep Data Structure

9 Oct 2025 20:11 UTC
4 points
0 comments · 8 min read · LW link

I take antidepressants. You’re welcome

Elizabeth · 9 Oct 2025 19:30 UTC
258 points
11 comments · 3 min read · LW link
(acesounderglass.com)

Training fails to elicit subtle reasoning in current language models

9 Oct 2025 19:04 UTC
49 points
3 comments · 4 min read · LW link
(alignment.anthropic.com)

Realistic Reward Hacking Induces Different and Deeper Misalignment

Jozdien · 9 Oct 2025 18:45 UTC
143 points
2 comments · 23 min read · LW link

Why am I not currently starting a religion around AI or similar topics?

samuelshadrach · 9 Oct 2025 18:31 UTC
8 points
2 comments · 18 min read · LW link
(samuelshadrach.com)

How we’ll make all world leaders work together to make the world better (Expert-approved idea)

Wes R · 9 Oct 2025 18:30 UTC
−3 points
4 comments · 3 min read · LW link

The Underexplored Prospects of Benevolent Superintelligences—PART 1: THE WISE, THE GOOD, THE POWERFUL

Jesper L. · 9 Oct 2025 17:49 UTC
3 points
7 comments · 25 min read · LW link

“Yes, and—” Requires the Possibility of “No, Because—”

Zack_M_Davis · 9 Oct 2025 17:39 UTC
32 points
4 comments · 3 min read · LW link
(zackmdavis.net)

Four Questions to Refine Your Policy Proposal

Mass_Driver · 9 Oct 2025 16:30 UTC
10 points
2 comments · 6 min read · LW link

A Snippet On The Epistemically Hygienic Containment Of Faith-In-Reason-Itself

JenniferRM · 9 Oct 2025 16:19 UTC
10 points
0 comments · 1 min read · LW link

Alignment progress doesn’t compensate for higher capabilities

Joe Rogero · 9 Oct 2025 16:06 UTC
2 points
0 comments · 6 min read · LW link

The Thinking Machines Tinker API is good news for AI control and security

Buck · 9 Oct 2025 15:22 UTC
91 points
10 comments · 6 min read · LW link

Biouploading: Preserving My Living Neurons and Connectome as a Spatially Distributed Mesh

avturchin · 9 Oct 2025 15:19 UTC
16 points
0 comments · 3 min read · LW link

self reflections of a striver

thiccythot · 9 Oct 2025 14:59 UTC
18 points
0 comments · 8 min read · LW link

Hospitalization: A Review

Logan Riggs · 9 Oct 2025 14:36 UTC
363 points
21 comments · 9 min read · LW link

AI #137: An OpenAI App For That

Zvi · 9 Oct 2025 14:00 UTC
32 points
4 comments · 57 min read · LW link
(thezvi.wordpress.com)

CRC Follow-up Report v1.0 — OpenAI Feedback Integration Edition

Seira · 9 Oct 2025 6:12 UTC
−4 points
2 comments · 2 min read · LW link

[Question] Are We Leaving Literature To The Psychotic?

Yitz · 9 Oct 2025 6:09 UTC
11 points
4 comments · 1 min read · LW link

Lessons from the Mountains

Philipreal · 9 Oct 2025 4:10 UTC
15 points
2 comments · 3 min read · LW link

Probabilistic Societies

Benjamin_Sturisky · 9 Oct 2025 4:08 UTC
0 points
0 comments · 3 min read · LW link

Inverting the Most Forbidden Technique: What happens when we train LLMs to lie detectably?

Peter Jordan · 9 Oct 2025 0:43 UTC
20 points
4 comments · 4 min read · LW link

Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior

8 Oct 2025 22:02 UTC
156 points
37 comments · 2 min read · LW link

NEPA, Permitting and Energy Roundup #2

Zvi · 8 Oct 2025 20:20 UTC
27 points
1 comment · 28 min read · LW link
(thezvi.wordpress.com)

What shapes does reasoning take but circular?

Algon · 8 Oct 2025 20:18 UTC
9 points
2 comments · 2 min read · LW link

The Oracle’s Gift

Karthik Tadepalli · 8 Oct 2025 20:13 UTC
5 points
1 comment · 3 min read · LW link

Thinking Mathematically—Convergent Sequences

Yair Halberstadt · 8 Oct 2025 19:44 UTC
18 points
5 comments · 4 min read · LW link

The Relationship Between Social Punishment and Shared Maps

Zack_M_Davis · 8 Oct 2025 19:38 UTC
64 points
14 comments · 4 min read · LW link
(zackmdavis.net)

IABIED: Paradigm Confusion and Overconfidence

PeterMcCluskey · 8 Oct 2025 19:19 UTC
12 points
14 comments · 11 min read · LW link
(bayesianinvestor.com)

The Wise Baboon of Loyalty

Zander_Drax · 8 Oct 2025 18:48 UTC
13 points
0 comments · 4 min read · LW link

Spooky Collusion at a Distance with Superrational AI

bira · 8 Oct 2025 18:13 UTC
75 points
9 comments · 6 min read · LW link

The Architecture of the Narcissistic False Self

Dawn Drescher · 8 Oct 2025 17:39 UTC
4 points
0 comments · 12 min read · LW link
(impartial-priorities.org)