Unsupervised Agent Discovery

Gunnar_Zarncke, 22 Dec 2025 22:01 UTC
24 points
0 comments, 6 min read, LW link

Announcing Gemma Scope 2

22 Dec 2025 21:56 UTC
94 points
1 comment, 2 min read, LW link

[Advanced Intro to AI Alignment] 0. Overview and Foundations

Towards_Keeperhood, 22 Dec 2025 21:20 UTC
15 points
0 comments, 5 min read, LW link

$500 Write like lsusr competition

lsusr, 22 Dec 2025 20:09 UTC
29 points
43 comments, 3 min read, LW link

Appendices: Supervised finetuning on low-harm reward hacking generalises to high-harm reward hacking

22 Dec 2025 19:33 UTC
17 points
0 comments, 1 min read, LW link

Supervised finetuning on low-harm reward hacking generalises to high-harm reward hacking

22 Dec 2025 19:32 UTC
14 points
0 comments, 30 min read, LW link

Recent LLMs can use filler tokens or problem repeats to improve (no-CoT) math performance

ryan_greenblatt, 22 Dec 2025 17:21 UTC
152 points
18 comments, 7 min read, LW link

Can we interpret latent reasoning using current mechanistic interpretability tools?

22 Dec 2025 16:56 UTC
34 points
0 comments, 9 min read, LW link

[Question] Why does Eliezer make abrasive public comments?

k64, 22 Dec 2025 16:45 UTC
96 points
65 comments, 1 min read, LW link

The Revolution of Rising Expectations

Zvi, 22 Dec 2025 13:40 UTC
71 points
6 comments, 19 min read, LW link
(thezvi.wordpress.com)

Irresponsible and Unreasonable Takes on Meetups Organizing

Screwtape, 22 Dec 2025 7:42 UTC
66 points
3 comments, 6 min read, LW link

Most successful entrepreneurship is unproductive

lc, 22 Dec 2025 6:33 UTC
41 points
27 comments, 3 min read, LW link

AIXI with general utility functions: “Value under ignorance in UAI”

Cole Wyeth, 22 Dec 2025 5:46 UTC
25 points
0 comments, 1 min read, LW link
(arxiv.org)

Update: 5 months of Retatrutide

Brendan Long, 22 Dec 2025 0:02 UTC
24 points
0 comments, 1 min read, LW link

Energy and Ingenuity

datawitch, 21 Dec 2025 22:22 UTC
9 points
0 comments, 7 min read, LW link

Small Models Can Introspect, Too

vgel, 21 Dec 2025 22:20 UTC
121 points
8 comments, 4 min read, LW link
(vgel.me)

Two Notions of a Goal: Target States vs. Success Metrics

paul_dfr, 21 Dec 2025 21:28 UTC
10 points
0 comments, 7 min read, LW link

What’s the Current Stock Market Bubble?

PeterMcCluskey, 21 Dec 2025 20:08 UTC
46 points
2 comments, 2 min read, LW link
(bayesianinvestor.com)

EA Yale Destiny Debate Discussion:

Nathan Young, 21 Dec 2025 19:10 UTC
10 points
11 comments, 1 min read, LW link
(www.youtube.com)

Can Claude teach me to make coffee?

philh, 21 Dec 2025 16:23 UTC
120 points
19 comments, 16 min read, LW link

Retrospective on Copenhagen Secular Solstice 2025

Søren Elverlin, 21 Dec 2025 15:34 UTC
7 points
0 comments, 4 min read, LW link

Google seemingly solved efficient attention

ceselder, 21 Dec 2025 13:54 UTC
26 points
4 comments, 4 min read, LW link

Witness or Wager: Enforcing ‘Show Your Work’ in Model Outputs

markacochran, 21 Dec 2025 13:12 UTC
3 points
2 comments, 1 min read, LW link

Turning 20 in the probable pre-apocalypse

Parv Mahajan, 21 Dec 2025 10:14 UTC
408 points
65 comments, 3 min read, LW link

Technoromanticism

lsusr, 21 Dec 2025 9:00 UTC
111 points
18 comments, 5 min read, LW link

Analysis of Whisper-Tiny Using Sparse Autoencoders

Omar Khursheed, 21 Dec 2025 8:44 UTC
9 points
0 comments, 4 min read, LW link

A Way to Test and Train Creativity

SebastianT, 21 Dec 2025 8:43 UTC
3 points
2 comments, 3 min read, LW link

Alignment Pretraining: AI Discourse Causes Self-Fulfilling (Mis)alignment

21 Dec 2025 0:53 UTC
184 points
23 comments, 9 min read, LW link

The unreasonable deepness of number theory

wingspan, 20 Dec 2025 22:16 UTC
65 points
6 comments, 9 min read, LW link

Digital intentionality: What’s the point?

mingyuan, 20 Dec 2025 21:46 UTC
45 points
7 comments, 3 min read, LW link
(mingyuan.substack.com)

Contradict my take on OpenPhil’s past AI beliefs

Eliezer Yudkowsky, 20 Dec 2025 21:15 UTC
194 points
92 comments, 3 min read, LW link

Why the alchemists couldn’t build rockets

Garrett Baker, 20 Dec 2025 20:25 UTC
17 points
1 comment, 2 min read, LW link

Experiments to understand Singular Learning Theory’s Free Energy & Local Learning Coefficient (LLC)

anish-lakkapragada, 20 Dec 2025 17:38 UTC
7 points
0 comments, 6 min read, LW link

Chain-of-Thought as Contextual Stabilization and Associative Retrieval

Aditya Raj, 20 Dec 2025 17:32 UTC
5 points
1 comment, 6 min read, LW link

How to game the METR plot

shash42, 20 Dec 2025 13:46 UTC
236 points
29 comments, 5 min read, LW link

No God Can Help You

Ape in the coat, 20 Dec 2025 8:32 UTC
36 points
0 comments, 3 min read, LW link
(apeinthecoat102771.substack.com)

Claude Opus 4.5 Achieves 50%-Time Horizon Of Around 4 hrs 49 Mins

Michaël Trazzi, 20 Dec 2025 7:13 UTC
89 points
14 comments, 1 min read, LW link

Show LW: Alignment Scry

Xyra Sinclair, 20 Dec 2025 2:48 UTC
16 points
4 comments, 2 min read, LW link

Opinionated Takes on Meetups Organizing

jenn, 20 Dec 2025 0:17 UTC
247 points
34 comments, 9 min read, LW link

A Full Epistemic Stack: Knowledge Commons for the 21st Century

19 Dec 2025 22:48 UTC
41 points
7 comments, 11 min read, LW link
(www.oliversourbut.net)

Opinion Fuzzing: A Proposal for Reducing & Exploring Variance in LLM Judgments Via Sampling

ozziegooen, 19 Dec 2025 21:41 UTC
11 points
0 comments, 5 min read, LW link

Progress links and short notes, 2025-12-19

jasoncrawford, 19 Dec 2025 19:44 UTC
8 points
0 comments, 6 min read, LW link
(newsletter.rootsofprogress.org)

Linch’s Top Inkhaven Posts and Reflections

Linch, 19 Dec 2025 19:40 UTC
38 points
0 comments, 9 min read, LW link
(linch.substack.com)

When Were Things The Best?

Zvi, 19 Dec 2025 18:00 UTC
62 points
16 comments, 15 min read, LW link
(thezvi.wordpress.com)

Response to Introspective Awareness research

maddi, 19 Dec 2025 17:23 UTC
6 points
0 comments, 9 min read, LW link

SPAR Spring 2026: 130+ research projects now accepting applications

agucova, 19 Dec 2025 14:23 UTC
22 points
0 comments, 2 min read, LW link

Space view

kapedalex, 19 Dec 2025 14:20 UTC
4 points
0 comments, 6 min read, LW link

Digital Minds in 2025: A Year in Review

19 Dec 2025 14:18 UTC
12 points
0 comments, 21 min read, LW link
(digitalminds.substack.com)

Scratchpad

Karthik Tadepalli, 19 Dec 2025 14:15 UTC
12 points
0 comments, 4 min read, LW link

AI Safety has a scaling problem

beyarkay, 19 Dec 2025 13:58 UTC
32 points
9 comments, 4 min read, LW link