TT Self Study Journal # 5

TristanTrim 9 Dec 2025 22:16 UTC

4 points

2 comments, 5 min read, LW link

Lorxus Does Halfhaven: 11/29, 11/30, Highlights, Postmortem

Lorxus 9 Dec 2025 21:00 UTC

6 points

0 comments, 3 min read, LW link

(tiled-with-pentagons.blogspot.com)

Tristan’s list of things to write

TristanTrim 9 Dec 2025 20:28 UTC

5 points

21 comments, 1 min read, LW link

Tate Modern 2150

GenericModel 9 Dec 2025 19:15 UTC

15 points

2 comments, 9 min read, LW link

(enrichedjamsham.substack.com)

Selling H200s to China Is Unwise and Unpopular

Zvi 9 Dec 2025 19:11 UTC

47 points

3 comments, 13 min read, LW link

(thezvi.wordpress.com)

Non-optimized beauty

Alexandre Variengien 9 Dec 2025 19:04 UTC

7 points

0 comments, 3 min read, LW link

(alexandrevariengien.com)

Auditing Games for Sandbagging [paper]

Jordan Taylor and Joseph Bloom

9 Dec 2025 18:37 UTC

103 points

4 comments, 10 min read, LW link

A Catalog of AI Evaluations

Anurag 9 Dec 2025 17:05 UTC

2 points

0 comments, 1 min read, LW link

Insights into Claude Opus 4.5 from Pokémon

Julian Bradshaw 9 Dec 2025 16:57 UTC

222 points

24 comments, 10 min read, LW link

Localizing Finetuned Information in Transformers with Dynamic Weight Grafting

toddknife 9 Dec 2025 16:20 UTC

6 points

0 comments, 5 min read, LW link

Gradual Disempowerment Monthly Roundup #3

Raymond Douglas 9 Dec 2025 16:02 UTC

49 points

0 comments, 4 min read, LW link

Every house has a chemistry lab

Alexandre Variengien 9 Dec 2025 14:17 UTC

5 points

0 comments, 1 min read, LW link

(alexandrevariengien.com)

Ways we can fail to answer

technicalities 9 Dec 2025 13:10 UTC

13 points

0 comments, 5 min read, LW link

[Question] Do you take joy in effective altruism?

SpectrumDT 9 Dec 2025 10:52 UTC

12 points

1 comment, 1 min read, LW link

My experience running a 100k

Alexandre Variengien 9 Dec 2025 8:30 UTC

52 points

0 comments, 6 min read, LW link

(alexandrevariengien.com)

Seriously, use text expansions

Parv Mahajan 9 Dec 2025 5:08 UTC

12 points

0 comments, 1 min read, LW link

(parvmahajan.com)

The reverse sear as a worthwhile life skill

Adam Zerner 9 Dec 2025 2:47 UTC

29 points

11 comments, 8 min read, LW link

Every point of intervention

TsviBT 9 Dec 2025 2:14 UTC

92 points

2 comments, 8 min read, LW link

D&D Sci Thanksgiving: the Festival Feast Evaluation & Ruleset

aphyer 9 Dec 2025 1:38 UTC

30 points

8 comments, 3 min read, LW link

Towards a Categorization of Adlerian Excuses

romeostevensit 8 Dec 2025 23:22 UTC

90 points

12 comments, 6 min read, LW link

A Falsifiable Causal Argument for Substrate Independence

rife 8 Dec 2025 22:47 UTC

10 points

0 comments, 5 min read, LW link

Prompting Models to Obfuscate Their CoT

Josh Engels and Felix Tudose

8 Dec 2025 21:00 UTC

16 points

4 comments, 7 min read, LW link

Gödel’s Ontological Proof

GenericModel 8 Dec 2025 20:49 UTC

19 points

74 comments, 13 min read, LW link

(enrichedjamsham.substack.com)

High-level approaches to rigor in interpretability

David Scott Krueger 8 Dec 2025 20:46 UTC

24 points

0 comments, 1 min read, LW link

If It Can Learn It, It Can Unlearn It: AI Safety as Architecture, Not Training

Timothy Danforth 8 Dec 2025 20:38 UTC

1 point

0 comments, 4 min read, LW link

Human Dignity: a review

owencb 8 Dec 2025 20:37 UTC

32 points

0 comments, 7 min read, LW link

(strangecities.substack.com)

A few quick thoughts on measuring disempowerment

David Scott Krueger 8 Dec 2025 20:03 UTC

30 points

3 comments, 1 min read, LW link

How Stealth Works

Linch 8 Dec 2025 19:46 UTC

48 points

5 comments, 3 min read, LW link

(linch.substack.com)

Reward Function Design: a starter pack

Steven Byrnes 8 Dec 2025 19:15 UTC

82 points

13 comments, 3 min read, LW link

We need a field of Reward Function Design

Steven Byrnes 8 Dec 2025 19:15 UTC

118 points

12 comments, 5 min read, LW link

I have hope

TristanTrim 8 Dec 2025 18:20 UTC

12 points

0 comments, 2 min read, LW link

The Possibility of an Ongoing Moral Catastrophe

Bentham's Bulldog 8 Dec 2025 16:40 UTC

8 points

6 comments, 4 min read, LW link

Building an AI Oracle

Gordon Seidoh Worley 8 Dec 2025 16:10 UTC

16 points

0 comments, 6 min read, LW link

(www.uncertainupdates.com)

[Paper] Does Self-Evaluation Enable Wireheading in Language Models?

David Africa 8 Dec 2025 16:03 UTC

25 points

2 comments, 2 min read, LW link

Algorithmic thermodynamics and three types of optimization

Daniel C and Aram Ebtekar

8 Dec 2025 15:40 UTC

11 points

0 comments, 9 min read, LW link

Little Echo

Zvi 8 Dec 2025 15:30 UTC

161 points

15 comments, 2 min read, LW link

(thezvi.wordpress.com)

From Barriers to Alignment to the First Formal Corrigibility Guarantees

Aran Nayebi 8 Dec 2025 12:31 UTC

64 points

11 comments, 11 min read, LW link

Scaling what used not to scale

Alexandre Variengien 8 Dec 2025 8:40 UTC

11 points

0 comments, 12 min read, LW link

(alexandrevariengien.com)

The effectiveness of systematic thinking

Alexandre Variengien 8 Dec 2025 8:38 UTC

12 points

0 comments, 6 min read, LW link

(alexandrevariengien.com)

I said hello and greeted 1,000 people at 5am this morning

Declan Molony 8 Dec 2025 3:35 UTC

141 points

7 comments, 2 min read, LW link

Your Digital Footprint Could Make You Unemployable

Declan Molony 7 Dec 2025 23:50 UTC

32 points

15 comments, 3 min read, LW link

2025 Unofficial LessWrong Census/Survey

Screwtape 7 Dec 2025 22:08 UTC

70 points

33 comments, 1 min read, LW link

AI in 2025: gestalt

technicalities 7 Dec 2025 21:25 UTC

248 points

44 comments, 20 min read, LW link

Thinking in Predictions

Julius 7 Dec 2025 21:11 UTC

20 points

0 comments, 8 min read, LW link

(thegreymatter.substack.com)

[Linkpost] Theory and AI Alignment (Scott Aaronson)

Oliver Daniels 7 Dec 2025 19:17 UTC

15 points

1 comment, 3 min read, LW link

(scottaaronson.blog)

About Natural & Synthetic Beings (Interactive Typology)

Anurag 7 Dec 2025 16:59 UTC

2 points

2 comments, 3 min read, LW link

Lawyers are uniquely well-placed to resist AI job automation

beyarkay (Boyd Kane) 7 Dec 2025 16:28 UTC

18 points

19 comments, 2 min read, LW link

(boydkane.com)

[Question] Have there been any rational analyses of mindbody techniques for chronic pain/illness?

Liface 7 Dec 2025 16:13 UTC

7 points

8 comments, 1 min read, LW link

How a bug of AI hardware may become a feature for AI governance

Naci Cankaya 7 Dec 2025 14:55 UTC

9 points

0 comments, 1 min read, LW link

(nacicankaya.substack.com)

Karlsruhe—If Anyone Builds It, Everyone Dies

wilm 7 Dec 2025 14:49 UTC

2 points

0 comments, 1 min read, LW link