A Full Epistemic Stack: Knowledge Commons for the 21st Century

19 Dec 2025 22:48 UTC
41 points
7 comments · 11 min read · LW link
(www.oliversourbut.net)

Opinion Fuzzing: A Proposal for Reducing & Exploring Variance in LLM Judgments Via Sampling

ozziegooen 19 Dec 2025 21:41 UTC
11 points
0 comments · 5 min read · LW link

Progress links and short notes, 2025-12-19

jasoncrawford 19 Dec 2025 19:44 UTC
8 points
0 comments · 6 min read · LW link
(newsletter.rootsofprogress.org)

Linch’s Top Inkhaven Posts and Reflections

Linch 19 Dec 2025 19:40 UTC
38 points
0 comments · 9 min read · LW link
(linch.substack.com)

When Were Things The Best?

Zvi 19 Dec 2025 18:00 UTC
62 points
16 comments · 15 min read · LW link
(thezvi.wordpress.com)

Response to Introspective Awareness research

maddi 19 Dec 2025 17:23 UTC
6 points
0 comments · 9 min read · LW link

SPAR Spring 2026: 130+ research projects now accepting applications

agucova 19 Dec 2025 14:23 UTC
22 points
0 comments · 2 min read · LW link

Space view

kapedalex 19 Dec 2025 14:20 UTC
4 points
0 comments · 6 min read · LW link

Digital Minds in 2025: A Year in Review

19 Dec 2025 14:18 UTC
12 points
0 comments · 21 min read · LW link
(digitalminds.substack.com)

Scratchpad

Karthik Tadepalli 19 Dec 2025 14:15 UTC
12 points
0 comments · 4 min read · LW link

AI Safety has a scaling problem

beyarkay 19 Dec 2025 13:58 UTC
32 points
9 comments · 4 min read · LW link

When Are Concealment Features Learned? And Does the Model Know Who’s Watching?

James Hoffend 19 Dec 2025 8:19 UTC
13 points
1 comment · 6 min read · LW link

2025-Era “Reward Hacking” Does Not Show that Reward Is the Optimization Target

TurnTrout 19 Dec 2025 6:09 UTC
45 points
9 comments · 7 min read · LW link
(turntrout.com)

Wuckles!

Raemon 19 Dec 2025 3:08 UTC
64 points
15 comments · 2 min read · LW link

Evaluation Awareness Scales Predictably in Open-Weights Large Language Models

Maheep Chaudhary 19 Dec 2025 2:47 UTC
21 points
0 comments · 6 min read · LW link

A name for the things that AI companies are building

DirectedEvolution 19 Dec 2025 2:07 UTC
28 points
9 comments · 4 min read · LW link

I made Geneguessr

Brinedew 19 Dec 2025 1:55 UTC
26 points
2 comments · 1 min read · LW link

In defence of the human agency: “Curing Cancer” is the new “Think of the Children”

Rajmohan H 19 Dec 2025 0:03 UTC
27 points
9 comments · 3 min read · LW link

Help keep AI under human control: Palisade Research 2026 fundraiser

18 Dec 2025 23:41 UTC
105 points
66 comments · 6 min read · LW link

OpenAI: Sidestepping Evaluation Awareness and Anticipating Misalignment with Production Evaluations

18 Dec 2025 22:55 UTC
25 points
0 comments · 1 min read · LW link
(alignment.openai.com)

Scalable End-to-End Interpretability

jsteinhardt 18 Dec 2025 22:37 UTC
117 points
2 comments · 3 min read · LW link

My Trip to NeurIPS 2025

Adam Newgas 18 Dec 2025 22:31 UTC
15 points
0 comments · 4 min read · LW link
(www.boristhebrave.com)

Leading by example

martinkunev 18 Dec 2025 20:30 UTC
3 points
2 comments · 3 min read · LW link

Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers

18 Dec 2025 20:21 UTC
153 points
11 comments · 8 min read · LW link
(arxiv.org)

A Study Of Instinct

LoganStrohl 18 Dec 2025 20:19 UTC
30 points
0 comments · 4 min read · LW link

Estimating The Portion of Income Consumed By Essentials Between 1985 and 2025

Mars_Will_Be_Ours 18 Dec 2025 19:19 UTC
2 points
2 comments · 3 min read · LW link
(shoutinginthedarkforest.substack.com)

Chemical (hunger) argument paraphrased

lemonhope 18 Dec 2025 18:58 UTC
10 points
7 comments · 1 min read · LW link

BashArena: A Control Setting for Highly Privileged AI Agents

18 Dec 2025 18:19 UTC
58 points
0 comments · 15 min read · LW link
(blog.redwoodresearch.org)

AI Safety Orgs Should Apply for Government Grants

DusanDNesic 18 Dec 2025 18:01 UTC
25 points
0 comments · 5 min read · LW link

Good if make prior after data instead of before

dynomight 18 Dec 2025 17:53 UTC
113 points
15 comments · 9 min read · LW link
(dynomight.net)

AI #147: Flash Forward

Zvi 18 Dec 2025 16:50 UTC
31 points
2 comments · 58 min read · LW link
(thezvi.wordpress.com)

50 Things I Know

Rebecca Dai 18 Dec 2025 16:32 UTC
6 points
8 comments · 7 min read · LW link
(rebeccadai.substack.com)

Announcing Spring 2026 AI Forecasting Benchmark

Ben Wilson 18 Dec 2025 15:43 UTC
2 points
0 comments · 4 min read · LW link
(www.metaculus.com)

Deep Learning and Precipitation Reactions: A Tale of Universality

Max Hennick 18 Dec 2025 14:34 UTC
51 points
4 comments · 18 min read · LW link

A Functional Typology of Cognitive Capabilities (Interactive Visualization)

Anurag 18 Dec 2025 14:06 UTC
2 points
0 comments · 4 min read · LW link

[Question] Why would AIs not be likely to be conscious or morally relevant?

Horosphere 18 Dec 2025 13:46 UTC
6 points
20 comments · 1 min read · LW link

The Undervalued Kleene Hierarchy

milanrosko 18 Dec 2025 11:57 UTC
10 points
2 comments · 6 min read · LW link

[Paper] Self-Transparency Failures in Expert-Persona LLMs

Alex Diep 18 Dec 2025 9:09 UTC
8 points
0 comments · 6 min read · LW link

Solstice Sundowners

teegs 18 Dec 2025 8:12 UTC
1 point
0 comments · 1 min read · LW link

A basic case for donating to the Berkeley Genomics Project

TsviBT 18 Dec 2025 1:55 UTC
85 points
5 comments · 4 min read · LW link

Apply to MATS Summer 2026!

18 Dec 2025 1:51 UTC
28 points
0 comments · 1 min read · LW link

Making Linear Probes Interpretable

ZuiderveldTimJ 18 Dec 2025 1:48 UTC
11 points
0 comments · 10 min read · LW link

A browser game about AI safety

NickSharp 17 Dec 2025 22:36 UTC
18 points
4 comments · 1 min read · LW link

What if we could grow human tissue by recapitulating embryogenesis?

Abhishaike Mahajan 17 Dec 2025 21:53 UTC
23 points
0 comments · 1 min read · LW link
(www.owlposting.com)

Transmitting Misalignment with Subliminal Learning via Paraphrasing

17 Dec 2025 19:34 UTC
38 points
0 comments · 10 min read · LW link

Shallow review of technical AI safety, 2025

17 Dec 2025 18:18 UTC
175 points
9 comments · 83 min read · LW link

Announcing RoastMyPost: LLMs Eval Blog Posts and More

ozziegooen 17 Dec 2025 18:10 UTC
110 points
17 comments · 5 min read · LW link

Alignment Fine-Tuning: Lessons from Operant Conditioning

foodforthought 17 Dec 2025 16:57 UTC
5 points
4 comments · 10 min read · LW link

Bryan Caplan on Ethical Intuitionism

vatsal_newsletter 17 Dec 2025 16:48 UTC
−5 points
0 comments · 1 min read · LW link
(www.readvatsal.com)

The Bleeding Mind

Adele Lopez 17 Dec 2025 16:27 UTC
65 points
10 comments · 6 min read · LW link