Don’t Mock Yourself

Algon · 12 Oct 2025 22:40 UTC
163 points
18 comments · 2 min read · LW link

Experiment: Test your priors on Bernoulli processes.

joseph_c · 12 Oct 2025 22:09 UTC
20 points
15 comments · 1 min read · LW link

The Problem of Consciousness and AI as an Ethical Subject

Nicolas Villarreal · 12 Oct 2025 18:30 UTC
−5 points
0 comments · 14 min read · LW link

Dr Evil & Realpolitik

James Stephen Brown · 12 Oct 2025 17:30 UTC
16 points
0 comments · 5 min read · LW link
(nonzerosum.games)

How do we know when something is deserving of welfare?

Dom Polsinelli · 12 Oct 2025 16:27 UTC
11 points
7 comments · 4 min read · LW link

The Narcissistic Spectrum

Dawn Drescher · 12 Oct 2025 15:46 UTC
32 points
0 comments · 22 min read · LW link
(impartial-priorities.org)

Non-copyability as a security feature

tailcalled · 12 Oct 2025 9:03 UTC
16 points
4 comments · 1 min read · LW link

International Programme on AI Evaluations

PabloAMC · 12 Oct 2025 7:12 UTC
3 points
0 comments · 2 min read · LW link

The Alignment Problem Isn’t Theoretical

Austin Morrissey · 12 Oct 2025 3:49 UTC
0 points
1 comment · 14 min read · LW link

If a Lioness Could Speak

Taylor G. Lunt · 12 Oct 2025 3:43 UTC
−1 points
0 comments · 2 min read · LW link

Designing for perpetual control

Remmelt · 12 Oct 2025 2:06 UTC
1 point
11 comments · 2 min read · LW link

“Naive Consequentialism” as a Thought-Terminating cliche

Jacob Goldsmith · 12 Oct 2025 0:54 UTC
−3 points
0 comments · 3 min read · LW link

[Question] How long do AI companies have to achieve significant capability gains before funding collapses?

Hide · 11 Oct 2025 23:20 UTC
41 points
8 comments · 1 min read · LW link

I wasn’t confused by Thermodynamics

Algon · 11 Oct 2025 22:20 UTC
26 points
4 comments · 2 min read · LW link

Subscribe to my Inkhaven feed!

Alex_Altair · 11 Oct 2025 20:41 UTC
21 points
3 comments · 2 min read · LW link

The Most Common Bad Argument In These Parts

J Bostock · 11 Oct 2025 16:29 UTC
243 points
61 comments · 4 min read · LW link

Experiments With Sonnet 4.5’s Fiction

Tomás B. · 11 Oct 2025 15:17 UTC
63 points
30 comments · 5 min read · LW link

Letter to Heads of AI labs

samuelshadrach · 11 Oct 2025 7:43 UTC
−1 points
2 comments · 2 min read · LW link

Emil the Moose

Martin Sustrik · 11 Oct 2025 6:11 UTC
49 points
1 comment · 1 min read · LW link
(www.250bpm.com)

Using complex polynomials to approximate arbitrary continuous functions

Joseph Van Name · 11 Oct 2025 4:06 UTC
5 points
2 comments · 5 min read · LW link

What does it feel like to understand?

Algon · 10 Oct 2025 22:50 UTC
20 points
5 comments · 5 min read · LW link

The 5 Obstacles I Had to Overcome to Become Vegan

David Bravo · 10 Oct 2025 18:34 UTC
5 points
8 comments · 7 min read · LW link

2025 State of AI Report and Predictions

Zvi · 10 Oct 2025 17:30 UTC
28 points
4 comments · 9 min read · LW link
(thezvi.wordpress.com)

Applications Open for a Weekend Exploring Civilisational Sanity [DEADLINE EXTENDED]

10 Oct 2025 16:26 UTC
26 points
0 comments · 4 min read · LW link

Maybe Use BioLMs To Mitigate Pre-ASI Biorisk?

J Bostock · 10 Oct 2025 16:25 UTC
18 points
7 comments · 4 min read · LW link

The statement “IABIED” is true even if the book IABIED is mostly false

Ihor Kendiukhov · 10 Oct 2025 15:13 UTC
11 points
2 comments · 2 min read · LW link

Why Future AIs will Require New Alignment Methods

Alvin Ånestrand · 10 Oct 2025 14:27 UTC
17 points
7 comments · 5 min read · LW link
(forecastingaifutures.substack.com)

Iterated Development and Study of Schemers (IDSS)

ryan_greenblatt · 10 Oct 2025 14:17 UTC
41 points
1 comment · 8 min read · LW link

Materialist Semiotics and the Nature of Qualia

Nicolas Villarreal · 10 Oct 2025 13:08 UTC
−1 points
16 comments · 7 min read · LW link

Patience and Willingness to Be Slow

Morpheus · 10 Oct 2025 12:10 UTC
22 points
3 comments · 6 min read · LW link

We won’t get docile, brilliant AIs before we solve alignment

Joe Rogero · 10 Oct 2025 4:11 UTC
7 points
3 comments · 3 min read · LW link

Labs lack the tools to course-correct

Joe Rogero · 10 Oct 2025 4:10 UTC
4 points
0 comments · 3 min read · LW link

The Liberty Tractor

Taylor G. Lunt · 10 Oct 2025 0:52 UTC
−4 points
0 comments · 9 min read · LW link

Assuring Agent Safety Evaluations By Analysing Transcripts

10 Oct 2025 0:42 UTC
7 points
0 comments · 15 min read · LW link

At odds with the unavoidable meta-message

Ruby · 10 Oct 2025 0:13 UTC
58 points
22 comments · 4 min read · LW link

Stars are a rounding error

Algon · 9 Oct 2025 23:35 UTC
67 points
19 comments · 3 min read · LW link

Towards a Typology of Strange LLM Chains-of-Thought

1a3orn · 9 Oct 2025 22:02 UTC
301 points
29 comments · 9 min read · LW link

Training Qwen-1.5B with a CoT legibility penalty

Fabien Roger · 9 Oct 2025 21:33 UTC
68 points
7 comments · 4 min read · LW link

Interview with a drone expert on the future of AI warfare

9 Oct 2025 20:16 UTC
33 points
0 comments · 25 min read · LW link
(blog.sentinel-team.org)

Investigating Neural Scaling Laws Emerging from Deep Data Structure

9 Oct 2025 20:11 UTC
4 points
0 comments · 8 min read · LW link

I take antidepressants. You’re welcome

Elizabeth · 9 Oct 2025 19:30 UTC
258 points
11 comments · 3 min read · LW link
(acesounderglass.com)

Training fails to elicit subtle reasoning in current language models

9 Oct 2025 19:04 UTC
49 points
3 comments · 4 min read · LW link
(alignment.anthropic.com)

Realistic Reward Hacking Induces Different and Deeper Misalignment

Jozdien · 9 Oct 2025 18:45 UTC
143 points
2 comments · 23 min read · LW link

Why am I not currently starting a religion around AI or similar topics?

samuelshadrach · 9 Oct 2025 18:31 UTC
8 points
2 comments · 18 min read · LW link
(samuelshadrach.com)

How we’ll make all world leaders work together to make the world better (Expert-approved idea)

Wes R · 9 Oct 2025 18:30 UTC
−3 points
4 comments · 3 min read · LW link

The Underexplored Prospects of Benevolent Superintelligences—PART 1: THE WISE, THE GOOD, THE POWERFUL

Jesper L. · 9 Oct 2025 17:49 UTC
3 points
7 comments · 25 min read · LW link

“Yes, and—” Requires the Possibility of “No, Because—”

Zack_M_Davis · 9 Oct 2025 17:39 UTC
32 points
4 comments · 3 min read · LW link
(zackmdavis.net)

Four Questions to Refine Your Policy Proposal

Mass_Driver · 9 Oct 2025 16:30 UTC
10 points
2 comments · 6 min read · LW link

A Snippet On The Epistemically Hygienic Containment Of Faith-In-Reason-Itself

JenniferRM · 9 Oct 2025 16:19 UTC
10 points
0 comments · 1 min read · LW link

Alignment progress doesn’t compensate for higher capabilities

Joe Rogero · 9 Oct 2025 16:06 UTC
2 points
0 comments · 6 min read · LW link