Centralization begets stagnation

Algon · 30 Oct 2025 23:49 UTC
6 points
0 comments · 2 min read · LW link

Summary and Comments on Anthropic's Pilot Sabotage Risk Report

GradientDissenter · 30 Oct 2025 20:19 UTC
29 points
0 comments · 5 min read · LW link

Critical Fallibilism and Theory of Constraints in One Analyzed Paragraph

Elliot Temple · 30 Oct 2025 20:06 UTC
2 points
0 comments · 28 min read · LW link

AI #140: Trying To Hold The Line

Zvi · 30 Oct 2025 18:30 UTC
26 points
1 comment · 52 min read · LW link
(thezvi.wordpress.com)

Anthropic's Pilot Sabotage Risk Report

dmz · 30 Oct 2025 17:50 UTC
32 points
2 comments · 3 min read · LW link
(alignment.anthropic.com)

AISLE discovered three new OpenSSL vulnerabilities

Jan_Kulveit · 30 Oct 2025 16:32 UTC
63 points
7 comments · 1 min read · LW link
(aisle.com)

Sonnet 4.5's eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals

30 Oct 2025 15:34 UTC
143 points
21 comments · 14 min read · LW link

Steering Evaluation-Aware Models to Act Like They Are Deployed

30 Oct 2025 15:03 UTC
61 points
12 comments · 16 min read · LW link

On The Conservation of Rights

Roman Maksimovich · 30 Oct 2025 13:48 UTC
−2 points
2 comments · 8 min read · LW link

When “HDMI-1” Lies To You

Gunnar_Zarncke · 30 Oct 2025 12:23 UTC
18 points
0 comments · 1 min read · LW link

[Question] Why there is still one instance of Eliezer Yudkowsky?

RomanS · 30 Oct 2025 12:00 UTC
−9 points
8 comments · 1 min read · LW link

Interview on the Hengshui Model High School

L.M.Sherlock · 30 Oct 2025 10:26 UTC
21 points
2 comments · 30 min read · LW link
(lmsherlock.substack.com)

Transcendental Argumentation and the Epistemics of Discourse

0xA · 30 Oct 2025 6:37 UTC
1 point
2 comments · 3 min read · LW link

Emergent Introspective Awareness in Large Language Models

Drake Thomas · 30 Oct 2025 4:42 UTC
129 points
19 comments · 1 min read · LW link
(transformer-circuits.pub)

Introducing Aeonisk: an Open Source Game and Dataset with Graded Outcome Tiers of Counterfactual Reasoning

threeriversainexus · 30 Oct 2025 3:02 UTC
1 point
0 comments · 4 min read · LW link

ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents

Ziqian Zhong · 30 Oct 2025 2:52 UTC
60 points
5 comments · 3 min read · LW link
(arxiv.org)

LLM Hallucinations: An Internal Tug of War

violazhong · 30 Oct 2025 1:21 UTC
9 points
0 comments · 3 min read · LW link

Genius is Not About Genius

Algon · 30 Oct 2025 0:00 UTC
14 points
1 comment · 2 min read · LW link

Quotes on OpenAI's timelines to automated research, safety research, and safety collaborations before recursive self improvement

TheManxLoiner · 29 Oct 2025 21:47 UTC
15 points
0 comments · 3 min read · LW link

An Opinionated Guide to Privacy Despite Authoritarianism

TurnTrout · 29 Oct 2025 20:32 UTC
179 points
27 comments · 4 min read · LW link
(turntrout.com)

Unsureism: The Rational Approach to Religious Uncertainty

Taylor G. Lunt · 29 Oct 2025 19:45 UTC
−7 points
3 comments · 5 min read · LW link

Why you shouldn't eat meat if you hate factory farming

ceselder · 29 Oct 2025 17:00 UTC
5 points
4 comments · 4 min read · LW link

The End of OpenAI's Nonprofit Era

garrison · 29 Oct 2025 16:28 UTC
41 points
0 comments · 9 min read · LW link
(www.obsolete.pub)

An intro to the Tensor Economics blog

harsimony · 29 Oct 2025 16:24 UTC
15 points
0 comments · 12 min read · LW link
(splittinginfinity.substack.com)

Uncertain Updates: October 2025

Gordon Seidoh Worley · 29 Oct 2025 16:10 UTC
3 points
0 comments · 1 min read · LW link
(www.uncertainupdates.com)

AI Doomers Should Raise Hell

James_Miller · 29 Oct 2025 16:10 UTC
−2 points
9 comments · 6 min read · LW link

AISN #65: Measuring Automation and Superintelligence Moratorium Letter

29 Oct 2025 16:05 UTC
5 points
0 comments · 3 min read · LW link
(newsletter.safe.ai)

TBC Episode with Max Harms - Red Heart and If Anyone Builds It, Everyone Dies

Steven K Zuber · 29 Oct 2025 15:49 UTC
13 points
0 comments · 1 min read · LW link
(www.thebayesianconspiracy.com)

[Question] Thresholds for Pascal's Mugging?

MattAlexander · 29 Oct 2025 14:54 UTC
22 points
12 comments · 8 min read · LW link

Please Do Not Sell B30A Chips to China

Zvi · 29 Oct 2025 14:50 UTC
62 points
6 comments · 7 min read · LW link
(thezvi.wordpress.com)

Why Civilizations Are Unstable (And What This Means for AI Alignment)

Elias_Kunnas · 29 Oct 2025 12:27 UTC
10 points
6 comments · 5 min read · LW link

What can we learn from parent-child-alignment for AI?

Karl von Wendt · 29 Oct 2025 8:02 UTC
16 points
4 comments · 3 min read · LW link

Some data from LeelaPieceOdds

Jeremy Gillen · 29 Oct 2025 4:27 UTC
66 points
21 comments · 3 min read · LW link

How Do We Evaluate the Quality of LLMs' Mathematical Responses?

Miguel Angel · 29 Oct 2025 1:37 UTC
5 points
0 comments · 13 min read · LW link

Visualizing a Platform for Live World Models

Kuil · 29 Oct 2025 1:24 UTC
16 points
0 comments · 14 min read · LW link

[Question] Why Would we get Inner Misalignment by Default?

Coil · 29 Oct 2025 1:23 UTC
3 points
0 comments · 2 min read · LW link

A Very Simple Model of AI Dealmaking

Cleo Nardo · 29 Oct 2025 0:33 UTC
18 points
0 comments · 9 min read · LW link

Upcoming Workshop on Post-AGI Economics, Culture, and Governance

28 Oct 2025 21:55 UTC
37 points
1 comment · 2 min read · LW link

AI Craziness Mitigation Efforts

Zvi · 28 Oct 2025 19:00 UTC
37 points
5 comments · 11 min read · LW link
(thezvi.wordpress.com)

When Will AI Transform the Economy?

Andre.Infante · 28 Oct 2025 18:55 UTC
60 points
2 comments · 8 min read · LW link

Introducing the Epoch Capabilities Index (ECI)

28 Oct 2025 18:23 UTC
65 points
9 comments · 1 min read · LW link
(epoch.ai)

Mottes and Baileys in AI discourse

Raemon · 28 Oct 2025 17:50 UTC
51 points
9 comments · 9 min read · LW link

Temporarily Losing My Ego

Logan Riggs · 28 Oct 2025 16:41 UTC
21 points
4 comments · 3 min read · LW link

The Memetics of AI Successionism

Jan_Kulveit · 28 Oct 2025 15:04 UTC
212 points
54 comments · 9 min read · LW link

New 80,000 Hours problem profile on the risks of power-seeking AI

Zershaaneh Qureshi · 28 Oct 2025 14:37 UTC
7 points
0 comments · 2 min read · LW link

LLM robots can't pass butter (and they are having an existential crisis about it)

Lukas Petersson · 28 Oct 2025 14:14 UTC
105 points
7 comments · 4 min read · LW link

Call for mentors from AI Safety and academia. Sci.STEPS mentorship program

Valentin2026 · 28 Oct 2025 13:41 UTC
7 points
0 comments · 2 min read · LW link

Heuristics for assessing how much of a bubble AI is in/will be

Remmelt · 28 Oct 2025 8:08 UTC
8 points
2 comments · 2 min read · LW link
(www.wired.com)

Q2 AI Benchmark Results: Pros Maintain Clear Lead

28 Oct 2025 5:40 UTC
14 points
0 comments · 24 min read · LW link
(www.metaculus.com)

A Sketch of Helpfulness Theory With Equivocal Principals

Lorxus · 28 Oct 2025 4:11 UTC
7 points
1 comment · 6 min read · LW link
(tiled-with-pentagons.blogspot.com)