Mechanize Work’s essay on Unfalsifiable Doom

StanislavKrym · 30 Dec 2025 22:57 UTC
9 points
0 comments · 15 min read · LW link
(www.mechanize.work)

The 7 Types Of Advice (And 3 Common Failure Modes)

Linch · 30 Dec 2025 21:55 UTC
26 points
3 comments · 7 min read · LW link
(inchpin.substack.com)

Don’t Sell Stock to Donate

jefftk · 30 Dec 2025 19:50 UTC
109 points
13 comments · 2 min read · LW link
(www.jefftk.com)

The origin of rot

Abhishaike Mahajan · 30 Dec 2025 17:51 UTC
33 points
4 comments · 5 min read · LW link
(www.owlposting.com)

[Advanced Intro to AI Alignment] 1. Goal-Directed Reasoning and Why It Matters

Towards_Keeperhood · 30 Dec 2025 15:48 UTC
12 points
0 comments · 10 min read · LW link

Dating Roundup #9: Signals and Selection

Zvi · 30 Dec 2025 12:40 UTC
37 points
3 comments · 13 min read · LW link
(thezvi.wordpress.com)

Many can write faster asm than the compiler, yet don’t. Why?

faul_sname · 30 Dec 2025 8:40 UTC
73 points
18 comments · 4 min read · LW link

Exceptionally Gifted Children

John Boyle · 30 Dec 2025 6:28 UTC
24 points
2 comments · 1 min read · LW link

Chromosome identification methods

TsviBT · 30 Dec 2025 6:02 UTC
38 points
4 comments · 5 min read · LW link

CFAR’s todo list re: our workshops

AnnaSalamon · 30 Dec 2025 5:16 UTC
63 points
7 comments · 3 min read · LW link

More details on CFAR’s new workshops

AnnaSalamon · 30 Dec 2025 5:12 UTC
59 points
2 comments · 4 min read · LW link

What’s going on at CFAR? (Updates and Fundraiser)

AnnaSalamon · 30 Dec 2025 5:00 UTC
108 points
39 comments · 35 min read · LW link

End-of year donation taxes 101

GradientDissenter · 30 Dec 2025 2:16 UTC
35 points
1 comment · 3 min read · LW link

Boston Solstice 2025 Retrospective

jefftk · 30 Dec 2025 1:10 UTC
13 points
2 comments · 5 min read · LW link
(www.jefftk.com)

[Question] Does the USG have access to smarter models than the labs’?

jacob_drori · 29 Dec 2025 22:51 UTC
9 points
5 comments · 1 min read · LW link

24% of the US public is now aware of AI xrisk

otto.barten · 29 Dec 2025 22:03 UTC
30 points
3 comments · 1 min read · LW link

Steering RL Training: Benchmarking Interventions Against Reward Hacking

29 Dec 2025 21:55 UTC
47 points
10 comments · 19 min read · LW link

Awareness Jailbreaking: Revealing True Alignment in Evaluation-Aware Models

Maheep Chaudhary · 29 Dec 2025 21:29 UTC
10 points
0 comments · 4 min read · LW link

December 2025 Links

nomagicpill · 29 Dec 2025 20:20 UTC
8 points
0 comments · 7 min read · LW link
(nomagicpill.substack.com)

The Techno-Humanist Manifesto, wrapup and publishing announcement

jasoncrawford · 29 Dec 2025 18:51 UTC
12 points
1 comment · 1 min read · LW link
(newsletter.rootsofprogress.org)

Unpacking Jonah Wilberg’s Goddess of Everything Else

StanislavKrym · 29 Dec 2025 18:25 UTC
6 points
2 comments · 4 min read · LW link

[Book Review] • → 🚹 → •

artdeco · 29 Dec 2025 17:50 UTC
26 points
5 comments · 3 min read · LW link

How To Create A Lsusr Golem

M_Chouchani · 29 Dec 2025 17:50 UTC
5 points
0 comments · 2 min read · LW link

Dating Roundup #8: Tactics

Zvi · 29 Dec 2025 16:40 UTC
58 points
2 comments · 17 min read · LW link
(thezvi.wordpress.com)

Ping pong computation in superposition

Alex Gibson · 29 Dec 2025 16:31 UTC
13 points
0 comments · 3 min read · LW link

The x-risk case for exercise: to have the most impact, the world needs you at your best

KatWoods · 29 Dec 2025 15:37 UTC
16 points
1 comment · 1 min read · LW link

Bot Alexander on Hot Zombies and AI Adolescents

future_detective · 29 Dec 2025 14:52 UTC
−8 points
11 comments · 25 min read · LW link

Defeating Moloch: The view from Evolutionary Game Theory

Jonah Wilberg · 29 Dec 2025 14:37 UTC
24 points
3 comments · 8 min read · LW link

PrincInt (PIBBSS) Opportunities: Summer Fellowship, Postdoc, and Ops Role (Deadlines in January)

DusanDNesic · 29 Dec 2025 12:12 UTC
8 points
0 comments · 1 min read · LW link

The Weakest Model in the Selector

Alice Blair · 29 Dec 2025 6:55 UTC
13 points
4 comments · 1 min read · LW link

Re: “A Brief Rant on the Future of Interaction Design”

Raemon · 29 Dec 2025 6:35 UTC
54 points
3 comments · 5 min read · LW link
(worrydream.com)

Magic Words and Performative Utterances

Screwtape · 29 Dec 2025 6:21 UTC
30 points
4 comments · 4 min read · LW link

The pace of progress, 4 years later

Veedrac · 29 Dec 2025 4:16 UTC
25 points
2 comments · 6 min read · LW link

The CIA Poisoned My Dog: Two Stories About Paranoid Delusions and Damage Control

River · 29 Dec 2025 3:59 UTC
123 points
2 comments · 5 min read · LW link

How to never make a bad decision

Wes R · 28 Dec 2025 23:21 UTC
−4 points
0 comments · 3 min read · LW link

Research agenda for training aligned AIs using concave utility functions following the principles of homeostasis and diminishing returns

Roland Pihlakas · 28 Dec 2025 21:53 UTC
14 points
0 comments · 8 min read · LW link

Stratified Memes

KAP · 28 Dec 2025 21:17 UTC
11 points
0 comments · 8 min read · LW link

Training Matching Pursuit SAEs on LLMs

chanind · 28 Dec 2025 18:57 UTC
19 points
2 comments · 7 min read · LW link

Do LLMs Condition Safety Behaviour on Dialect? Preliminary Evidence

Aakash Rana · 28 Dec 2025 18:21 UTC
7 points
2 comments · 5 min read · LW link

Meditations on Suffering

MeditationsOnShrimp · 28 Dec 2025 17:39 UTC
−1 points
0 comments · 2 min read · LW link

November 2025 Links

nomagicpill · 28 Dec 2025 15:51 UTC
19 points
2 comments · 7 min read · LW link
(nomagicpill.substack.com)

Reviews I: Everyone’s Responsibility

nomagicpill · 28 Dec 2025 15:48 UTC
2 points
0 comments · 4 min read · LW link
(nomagicpill.substack.com)

Introspection via localization

Victor Godet · 28 Dec 2025 14:26 UTC
35 points
9 comments · 3 min read · LW link

Crystals in NNs: Technical Companion Piece

Jonas Hallgren · 28 Dec 2025 10:44 UTC
22 points
4 comments · 15 min read · LW link

Have You Tried Thinking About It As Crystals?

Jonas Hallgren · 28 Dec 2025 10:44 UTC
72 points
9 comments · 10 min read · LW link

Alignment Is Not One Problem: A 3D Map of AI Risk

Anurag · 28 Dec 2025 8:44 UTC
3 points
0 comments · 14 min read · LW link

Orpheus’ Basilisk

pulwat · 28 Dec 2025 0:43 UTC
21 points
1 comment · 2 min read · LW link

A Conflict Between AI Alignment and Philosophical Competence

Wei Dai · 27 Dec 2025 21:32 UTC
69 points
13 comments · 2 min read · LW link

Glucose Supplementation for Sustained Stimulant Cognition

Johannes C. Mayer · 27 Dec 2025 19:58 UTC
34 points
12 comments · 1 min read · LW link

A Brief Proof That You Are Every Conscious Thing

Jason R · 27 Dec 2025 17:16 UTC
−16 points
15 comments · 3 min read · LW link