What does it feel like to understand?

Algon · 10 Oct 2025 22:50 UTC
20 points
5 comments · 5 min read · LW link

The 5 Obstacles I Had to Overcome to Become Vegan

David Bravo · 10 Oct 2025 18:34 UTC
5 points
8 comments · 7 min read · LW link

2025 State of AI Report and Predictions

Zvi · 10 Oct 2025 17:30 UTC
28 points
4 comments · 9 min read · LW link
(thezvi.wordpress.com)

Applications Open for a Weekend Exploring Civilisational Sanity [DEADLINE EXTENDED]

10 Oct 2025 16:26 UTC
26 points
0 comments · 4 min read · LW link

Maybe Use BioLMs To Mitigate Pre-ASI Biorisk?

J Bostock · 10 Oct 2025 16:25 UTC
18 points
7 comments · 4 min read · LW link

The statement “IABIED” is true even if the book IABIED is mostly false

Ihor Kendiukhov · 10 Oct 2025 15:13 UTC
11 points
2 comments · 2 min read · LW link

Why Future AIs will Require New Alignment Methods

Alvin Ånestrand · 10 Oct 2025 14:27 UTC
17 points
7 comments · 5 min read · LW link
(forecastingaifutures.substack.com)

Iterated Development and Study of Schemers (IDSS)

ryan_greenblatt · 10 Oct 2025 14:17 UTC
41 points
1 comment · 8 min read · LW link

Materialist Semiotics and the Nature of Qualia

Nicolas Villarreal · 10 Oct 2025 13:08 UTC
−1 points
16 comments · 7 min read · LW link

Patience and Willingness to Be Slow

Morpheus · 10 Oct 2025 12:10 UTC
22 points
3 comments · 6 min read · LW link

We won’t get docile, brilliant AIs before we solve alignment

Joe Rogero · 10 Oct 2025 4:11 UTC
7 points
3 comments · 3 min read · LW link

Labs lack the tools to course-correct

Joe Rogero · 10 Oct 2025 4:10 UTC
4 points
0 comments · 3 min read · LW link

The Liberty Tractor

Taylor G. Lunt · 10 Oct 2025 0:52 UTC
−4 points
0 comments · 9 min read · LW link

Assuring Agent Safety Evaluations By Analysing Transcripts

10 Oct 2025 0:42 UTC
7 points
0 comments · 15 min read · LW link

At odds with the unavoidable meta-message

Ruby · 10 Oct 2025 0:13 UTC
58 points
22 comments · 4 min read · LW link

Stars are a rounding error

Algon · 9 Oct 2025 23:35 UTC
67 points
19 comments · 3 min read · LW link

Towards a Typology of Strange LLM Chains-of-Thought

1a3orn · 9 Oct 2025 22:02 UTC
301 points
29 comments · 9 min read · LW link

Training Qwen-1.5B with a CoT legibility penalty

Fabien Roger · 9 Oct 2025 21:33 UTC
68 points
7 comments · 4 min read · LW link

Interview with a drone expert on the future of AI warfare

9 Oct 2025 20:16 UTC
33 points
0 comments · 25 min read · LW link
(blog.sentinel-team.org)

Investigating Neural Scaling Laws Emerging from Deep Data Structure

9 Oct 2025 20:11 UTC
4 points
0 comments · 8 min read · LW link

I take antidepressants. You’re welcome

Elizabeth · 9 Oct 2025 19:30 UTC
258 points
11 comments · 3 min read · LW link
(acesounderglass.com)

Training fails to elicit subtle reasoning in current language models

9 Oct 2025 19:04 UTC
49 points
3 comments · 4 min read · LW link
(alignment.anthropic.com)

Realistic Reward Hacking Induces Different and Deeper Misalignment

Jozdien · 9 Oct 2025 18:45 UTC
143 points
2 comments · 23 min read · LW link

Why am I not currently starting a religion around AI or similar topics?

samuelshadrach · 9 Oct 2025 18:31 UTC
8 points
2 comments · 18 min read · LW link
(samuelshadrach.com)

How we’ll make all world leaders work together to make the world better (Expert-approved idea)

Wes R · 9 Oct 2025 18:30 UTC
−3 points
4 comments · 3 min read · LW link

The Underexplored Prospects of Benevolent Superintelligences—PART 1: THE WISE, THE GOOD, THE POWERFUL

Jesper L. · 9 Oct 2025 17:49 UTC
3 points
7 comments · 25 min read · LW link

“Yes, and—” Requires the Possibility of “No, Because—”

Zack_M_Davis · 9 Oct 2025 17:39 UTC
32 points
4 comments · 3 min read · LW link
(zackmdavis.net)

Four Questions to Refine Your Policy Proposal

Mass_Driver · 9 Oct 2025 16:30 UTC
10 points
2 comments · 6 min read · LW link

A Snippet On The Epistemically Hygienic Containment Of Faith-In-Reason-Itself

JenniferRM · 9 Oct 2025 16:19 UTC
10 points
0 comments · 1 min read · LW link

Alignment progress doesn’t compensate for higher capabilities

Joe Rogero · 9 Oct 2025 16:06 UTC
2 points
0 comments · 6 min read · LW link

The Thinking Machines Tinker API is good news for AI control and security

Buck · 9 Oct 2025 15:22 UTC
91 points
10 comments · 6 min read · LW link

Biouploading: Preserving My Living Neurons and Connectome as a Spatially Distributed Mesh

avturchin · 9 Oct 2025 15:19 UTC
16 points
0 comments · 3 min read · LW link

self reflections of a striver

thiccythot · 9 Oct 2025 14:59 UTC
18 points
0 comments · 8 min read · LW link

Hospitalization: A Review

Logan Riggs · 9 Oct 2025 14:36 UTC
363 points
21 comments · 9 min read · LW link

AI #137: An OpenAI App For That

Zvi · 9 Oct 2025 14:00 UTC
32 points
4 comments · 57 min read · LW link
(thezvi.wordpress.com)

CRC Follow-up Report v1.0 — OpenAI Feedback Integration Edition

Seira · 9 Oct 2025 6:12 UTC
−4 points
2 comments · 2 min read · LW link

[Question] Are We Leaving Literature To The Psychotic?

Yitz · 9 Oct 2025 6:09 UTC
11 points
4 comments · 1 min read · LW link

Lessons from the Mountains

Philipreal · 9 Oct 2025 4:10 UTC
15 points
2 comments · 3 min read · LW link

Probabilistic Societies

Benjamin_Sturisky · 9 Oct 2025 4:08 UTC
0 points
0 comments · 3 min read · LW link

Inverting the Most Forbidden Technique: What happens when we train LLMs to lie detectably?

Peter Jordan · 9 Oct 2025 0:43 UTC
20 points
4 comments · 4 min read · LW link

Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior

8 Oct 2025 22:02 UTC
156 points
37 comments · 2 min read · LW link

NEPA, Permitting and Energy Roundup #2

Zvi · 8 Oct 2025 20:20 UTC
27 points
1 comment · 28 min read · LW link
(thezvi.wordpress.com)

What shapes does reasoning take but circular?

Algon · 8 Oct 2025 20:18 UTC
9 points
2 comments · 2 min read · LW link

The Oracle’s Gift

Karthik Tadepalli · 8 Oct 2025 20:13 UTC
5 points
1 comment · 3 min read · LW link

Thinking Mathematically—Convergent Sequences

Yair Halberstadt · 8 Oct 2025 19:44 UTC
18 points
5 comments · 4 min read · LW link

The Relationship Between Social Punishment and Shared Maps

Zack_M_Davis · 8 Oct 2025 19:38 UTC
64 points
14 comments · 4 min read · LW link
(zackmdavis.net)

IABIED: Paradigm Confusion and Overconfidence

PeterMcCluskey · 8 Oct 2025 19:19 UTC
12 points
14 comments · 11 min read · LW link
(bayesianinvestor.com)

The Wise Baboon of Loyalty

Zander_Drax · 8 Oct 2025 18:48 UTC
13 points
0 comments · 4 min read · LW link

Spooky Collusion at a Distance with Superrational AI

bira · 8 Oct 2025 18:13 UTC
75 points
9 comments · 6 min read · LW link

The Architecture of the Narcissistic False Self

Dawn Drescher · 8 Oct 2025 17:39 UTC
4 points
0 comments · 12 min read · LW link
(impartial-priorities.org)