Why I Transitioned: A Case Study

Fiora Starlight · 1 Nov 2025 22:58 UTC
324 points
80 comments · 10 min read

Economics and Transformative AI (by Tom Cunningham)

reallyeli · 1 Nov 2025 22:42 UTC
19 points
0 comments · 1 min read
(tecunningham.github.io)

Decision theory when you can’t make decisions

Nina Panickssery · 1 Nov 2025 22:36 UTC
11 points
25 comments · 7 min read
(blog.ninapanickssery.com)

You’re always stressed, your mind is always busy, you never have enough time

mingyuan · 1 Nov 2025 22:07 UTC
226 points
6 comments · 3 min read
(mingyuan.substack.com)

Re-rolling environment

Raemon · 1 Nov 2025 21:46 UTC
140 points
2 comments · 2 min read

Why Is Printing So Bad?

johnswentworth · 1 Nov 2025 21:37 UTC
50 points
25 comments · 2 min read

Some Good Meetups (2025 Q2)

jenn · 1 Nov 2025 18:28 UTC
11 points
0 comments · 6 min read

[Question] Shouldn’t taking over the world be easier than recursively self-improving, as an AI?

KvmanThinking · 1 Nov 2025 17:26 UTC
6 points
18 comments · 1 min read

ACX Atlanta November Meetup

Steve French · 1 Nov 2025 16:32 UTC
2 points
0 comments · 1 min read

Seattle Secular Solstice 2025 – Dec 20th

datawitch · 1 Nov 2025 16:03 UTC
10 points
0 comments · 2 min read

Fermi Paradox, Ethics and Astronomical waste

StanislavKrym · 1 Nov 2025 15:24 UTC
6 points
0 comments · 1 min read

LLM-generated text is not testimony

TsviBT · 1 Nov 2025 14:47 UTC
104 points
89 comments · 11 min read

Apply to the Cooperative AI PhD Fellowship by November 16th!

Lewis Hammond · 1 Nov 2025 12:15 UTC
7 points
0 comments · 1 min read

Vaccination against ASI

dscft · 1 Nov 2025 10:58 UTC
−21 points
3 comments · 1 min read

What’s So Good About Ender’s Game?

Joy2 · 1 Nov 2025 9:34 UTC
2 points
3 comments · 1 min read

Automated Circuit Interpretation via Probe Prompting

Giuseppe Birardi · 1 Nov 2025 7:57 UTC
18 points
0 comments · 27 min read

Strategy-Stealing Argument Against AI Dealmaking

Cleo Nardo · 1 Nov 2025 4:39 UTC
17 points
3 comments · 2 min read

Evidence on language model consciousness

dsj · 1 Nov 2025 4:01 UTC
19 points
0 comments · 2 min read
(thedavidsj.substack.com)

Asking AI What Writing Advice Paul Fussell Would Give

Taylor G. Lunt · 1 Nov 2025 3:37 UTC
7 points
2 comments · 8 min read

Freewriting in my head, and overcoming the “twinge of starting”

ParrotRobot · 1 Nov 2025 1:12 UTC
23 points
1 comment · 6 min read

2025 NYC Secular Solstice & East Coast Rationalist Megameetup

Screwtape · 1 Nov 2025 1:06 UTC
13 points
0 comments · 1 min read

Supervillain Monologues Are Unrealistic

Algon · 31 Oct 2025 23:58 UTC
82 points
18 comments · 2 min read

Secretly Loyal AIs: Threat Vectors and Mitigation Strategies

Dave Banerjee · 31 Oct 2025 23:31 UTC
8 points
0 comments · 19 min read
(substack.com)

Ink without haven

Dentosal · 31 Oct 2025 22:50 UTC
4 points
0 comments · 2 min read

Apply to the Cambridge ERA:AI Winter 2026 Fellowship

Kyle O’Brien · 31 Oct 2025 22:26 UTC
5 points
3 comments · 1 min read

FAQ: Expert Survey on Progress in AI methodology

KatjaGrace · 31 Oct 2025 16:51 UTC
14 points
0 comments · 19 min read
(blog.aiimpacts.org)

Social media feeds ‘misaligned’ when viewed through AI safety framework, show researchers

Mordechai Rorvig · 31 Oct 2025 16:40 UTC
13 points
3 comments · 1 min read
(www.foommagazine.org)

Crossword Halloween 2025: Manmade Horrors

jchan · 31 Oct 2025 16:19 UTC
7 points
0 comments · 1 min read

Debugging Despair ~> A bet about Satisfaction and Values

P. João · 31 Oct 2025 14:00 UTC
2 points
0 comments · 2 min read

Halfhaven Digest #3

Taylor G. Lunt · 31 Oct 2025 13:41 UTC
7 points
0 comments · 2 min read

OpenAI Moves To Complete Potentially The Largest Theft In Human History

Zvi · 31 Oct 2025 13:20 UTC
76 points
12 comments · 19 min read
(thezvi.wordpress.com)

A (bad) Definition of AGI

spookyuser · 31 Oct 2025 7:55 UTC
4 points
0 comments · 5 min read

Modelling, Measuring, and Intervening on Goal-directed Behaviour in AI Systems

31 Oct 2025 1:28 UTC
14 points
0 comments · 8 min read

Resampling Conserves Redundancy & Mediation (Approximately) Under the Jensen-Shannon Divergence

David Lorell · 31 Oct 2025 1:07 UTC
41 points
7 comments · 4 min read

Centralization begets stagnation

Algon · 30 Oct 2025 23:49 UTC
6 points
0 comments · 2 min read

Summary and Comments on Anthropic’s Pilot Sabotage Risk Report

GradientDissenter · 30 Oct 2025 20:19 UTC
29 points
0 comments · 5 min read

Critical Fallibilism and Theory of Constraints in One Analyzed Paragraph

Elliot Temple · 30 Oct 2025 20:06 UTC
2 points
0 comments · 28 min read

AI #140: Trying To Hold The Line

Zvi · 30 Oct 2025 18:30 UTC
26 points
1 comment · 52 min read
(thezvi.wordpress.com)

Anthropic’s Pilot Sabotage Risk Report

dmz · 30 Oct 2025 17:50 UTC
32 points
2 comments · 3 min read
(alignment.anthropic.com)

AISLE discovered three new OpenSSL vulnerabilities

Jan_Kulveit · 30 Oct 2025 16:32 UTC
64 points
7 comments · 1 min read
(aisle.com)

Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals

30 Oct 2025 15:34 UTC
144 points
21 comments · 14 min read

Steering Evaluation-Aware Models to Act Like They Are Deployed

30 Oct 2025 15:03 UTC
61 points
12 comments · 18 min read

On The Conservation of Rights

Roman Maksimovich · 30 Oct 2025 13:48 UTC
−2 points
2 comments · 8 min read

When “HDMI-1” Lies To You

Gunnar_Zarncke · 30 Oct 2025 12:23 UTC
18 points
0 comments · 1 min read

[Question] Why there is still one instance of Eliezer Yudkowsky?

RomanS · 30 Oct 2025 12:00 UTC
−9 points
8 comments · 1 min read

Interview on the Hengshui Model High School

L.M.Sherlock · 30 Oct 2025 10:26 UTC
21 points
2 comments · 30 min read
(lmsherlock.substack.com)

Transcendental Argumentation and the Epistemics of Discourse

0xA · 30 Oct 2025 6:37 UTC
1 point
2 comments · 3 min read

Emergent Introspective Awareness in Large Language Models

Drake Thomas · 30 Oct 2025 4:42 UTC
130 points
19 comments · 1 min read
(transformer-circuits.pub)

Introducing Aeonisk: an Open Source Game and Dataset with Graded Outcome Tiers of Counterfactual Reasoning

threeriversainexus · 30 Oct 2025 3:02 UTC
1 point
0 comments · 4 min read

ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents

Ziqian Zhong · 30 Oct 2025 2:52 UTC
60 points
5 comments · 3 min read
(arxiv.org)