TT Self Study Journal # 5

TristanTrim 9 Dec 2025 22:16 UTC
4 points
2 comments · 5 min read · LW link

Lorxus Does Halfhaven: 11/29, 11/30, Highlights, Postmortem

Lorxus 9 Dec 2025 21:00 UTC
6 points
0 comments · 3 min read · LW link

Tristan’s list of things to write

TristanTrim 9 Dec 2025 20:28 UTC
5 points
21 comments · 1 min read · LW link

Tate Modern 2150

GenericModel 9 Dec 2025 19:15 UTC
15 points
2 comments · 9 min read · LW link
(enrichedjamsham.substack.com)

Selling H200s to China Is Unwise and Unpopular

Zvi 9 Dec 2025 19:11 UTC
47 points
3 comments · 13 min read · LW link
(thezvi.wordpress.com)

Non-optimized beauty

Alexandre Variengien 9 Dec 2025 19:04 UTC
7 points
0 comments · 3 min read · LW link
(alexandrevariengien.com)

Auditing Games for Sandbagging [paper]

9 Dec 2025 18:37 UTC
103 points
4 comments · 10 min read · LW link

A Catalog of AI Evaluations

Anurag 9 Dec 2025 17:05 UTC
2 points
0 comments · 1 min read · LW link

Insights into Claude Opus 4.5 from Pokémon

Julian Bradshaw 9 Dec 2025 16:57 UTC
206 points
24 comments · 10 min read · LW link

Localizing Finetuned Information in Transformers with Dynamic Weight Grafting

toddknife 9 Dec 2025 16:20 UTC
6 points
0 comments · 5 min read · LW link

Gradual Disempowerment Monthly Roundup #3

Raymond Douglas 9 Dec 2025 16:02 UTC
49 points
0 comments · 4 min read · LW link

Every house has a chemistry lab

Alexandre Variengien 9 Dec 2025 14:17 UTC
5 points
0 comments · 1 min read · LW link
(alexandrevariengien.com)

Ways we can fail to answer

technicalities 9 Dec 2025 13:10 UTC
12 points
0 comments · 5 min read · LW link

[Question] Do you take joy in effective altruism?

SpectrumDT 9 Dec 2025 10:52 UTC
12 points
1 comment · 1 min read · LW link

My experience running a 100k

Alexandre Variengien 9 Dec 2025 8:30 UTC
50 points
0 comments · 6 min read · LW link
(alexandrevariengien.com)

Seriously, use text expansions

Parv Mahajan 9 Dec 2025 5:08 UTC
12 points
0 comments · 1 min read · LW link
(parvmahajan.com)

The reverse sear as a worthwhile life skill

Adam Zerner 9 Dec 2025 2:47 UTC
29 points
11 comments · 8 min read · LW link

Every point of intervention

TsviBT 9 Dec 2025 2:14 UTC
78 points
2 comments · 8 min read · LW link

D&D Sci Thanksgiving: the Festival Feast Evaluation & Ruleset

aphyer 9 Dec 2025 1:38 UTC
30 points
8 comments · 3 min read · LW link

Towards a Categorization of Adlerian Excuses

romeostevensit 8 Dec 2025 23:22 UTC
89 points
11 comments · 6 min read · LW link

A Falsifiable Causal Argument for Substrate Independence

rife 8 Dec 2025 22:47 UTC
10 points
0 comments · 5 min read · LW link

Prompting Models to Obfuscate Their CoT

8 Dec 2025 21:00 UTC
15 points
4 comments · 7 min read · LW link

Gödel’s Ontological Proof

GenericModel 8 Dec 2025 20:49 UTC
19 points
74 comments · 13 min read · LW link
(enrichedjamsham.substack.com)

High-level approaches to rigor in interpretability

David Scott Krueger (formerly: capybaralet) 8 Dec 2025 20:46 UTC
24 points
0 comments · 1 min read · LW link

If It Can Learn It, It Can Unlearn It: AI Safety as Architecture, Not Training

Timothy Danforth 8 Dec 2025 20:38 UTC
1 point
0 comments · 4 min read · LW link

Human Dignity: a review

owencb 8 Dec 2025 20:37 UTC
32 points
0 comments · 7 min read · LW link
(strangecities.substack.com)

A few quick thoughts on measuring disempowerment

David Scott Krueger (formerly: capybaralet) 8 Dec 2025 20:03 UTC
29 points
3 comments · 1 min read · LW link

How Stealth Works

Linch 8 Dec 2025 19:46 UTC
48 points
5 comments · 3 min read · LW link
(linch.substack.com)

Reward Function Design: a starter pack

Steven Byrnes 8 Dec 2025 19:15 UTC
80 points
10 comments · 16 min read · LW link

We need a field of Reward Function Design

Steven Byrnes 8 Dec 2025 19:15 UTC
118 points
12 comments · 5 min read · LW link

When circular reasoning is logical evidence

ConformalInfinity 8 Dec 2025 19:09 UTC
6 points
7 comments · 2 min read · LW link

I have hope

TristanTrim 8 Dec 2025 18:20 UTC
12 points
0 comments · 2 min read · LW link

The Possibility of an Ongoing Moral Catastrophe

Bentham's Bulldog 8 Dec 2025 16:40 UTC
10 points
6 comments · 4 min read · LW link

Building an AI Oracle

Gordon Seidoh Worley 8 Dec 2025 16:10 UTC
16 points
0 comments · 6 min read · LW link
(www.uncertainupdates.com)

[Paper] Does Self-Evaluation Enable Wireheading in Language Models?

David Africa 8 Dec 2025 16:03 UTC
25 points
2 comments · 2 min read · LW link

Algorithmic thermodynamics and three types of optimization

8 Dec 2025 15:40 UTC
11 points
0 comments · 12 min read · LW link

Little Echo

Zvi 8 Dec 2025 15:30 UTC
160 points
15 comments · 2 min read · LW link
(thezvi.wordpress.com)

From Barriers to Alignment to the First Formal Corrigibility Guarantees

Aran Nayebi 8 Dec 2025 12:31 UTC
61 points
11 comments · 11 min read · LW link

Scaling what used not to scale

Alexandre Variengien 8 Dec 2025 8:40 UTC
11 points
0 comments · 12 min read · LW link
(alexandrevariengien.com)

The effectiveness of systematic thinking

Alexandre Variengien 8 Dec 2025 8:38 UTC
12 points
0 comments · 6 min read · LW link
(alexandrevariengien.com)

I said hello and greeted 1,000 people at 5am this morning

Declan Molony 8 Dec 2025 3:35 UTC
128 points
7 comments · 2 min read · LW link

Your Digital Footprint Could Make You Unemployable

Declan Molony 7 Dec 2025 23:50 UTC
38 points
13 comments · 3 min read · LW link

2025 Unofficial LessWrong Census/Survey

Screwtape 7 Dec 2025 22:08 UTC
69 points
33 comments · 1 min read · LW link

AI in 2025: gestalt

technicalities 7 Dec 2025 21:25 UTC
246 points
44 comments · 20 min read · LW link

Thinking in Predictions

Julius 7 Dec 2025 21:11 UTC
20 points
0 comments · 8 min read · LW link
(thegreymatter.substack.com)

[Linkpost] Theory and AI Alignment (Scott Aaronson)

Oliver Daniels 7 Dec 2025 19:17 UTC
15 points
1 comment · 3 min read · LW link
(scottaaronson.blog)

About Natural & Synthetic Beings (Interactive Typology)

Anurag 7 Dec 2025 16:59 UTC
2 points
2 comments · 3 min read · LW link

Lawyers are uniquely well-placed to resist AI job automation

beyarkay 7 Dec 2025 16:28 UTC
18 points
18 comments · 2 min read · LW link
(boydkane.com)

[Question] Have there been any rational analyses of mindbody techniques for chronic pain/illness?

Liface 7 Dec 2025 16:13 UTC
4 points
5 comments · 1 min read · LW link

How a bug of AI hardware may become a feature for AI governance

Naci Cankaya 7 Dec 2025 14:55 UTC
9 points
0 comments · 1 min read · LW link
(nacicankaya.substack.com)