METR Research Update: Algorithmic vs. Holistic Evaluation

David Rein

13 Aug 2025 22:47 UTC

101 points

7 comments · 1 min read · LW link

(metr.org)

Interiors can be more fun

Nina Panickssery

13 Aug 2025 22:42 UTC

34 points

6 comments · 4 min read · LW link

(blog.ninapanickssery.com)

Against Epistemic Democracy: A Epistemic Tier List of What Actually Works

Linch

13 Aug 2025 21:28 UTC

9 points

3 comments · 1 min read · LW link

(linch.substack.com)

Good Faith Arguments

Gordon Seidoh Worley

13 Aug 2025 20:50 UTC

1 point

0 comments · 3 min read · LW link

(uncertainupdates.substack.com)

Doing A Thing Puts You in The Top 10% (And That Sucks)

Brendan Long

13 Aug 2025 19:50 UTC

77 points

23 comments · 2 min read · LW link

Intriguing Properties of gpt-oss Jailbreaks

zroe1 and Jack Sanderson

13 Aug 2025 19:42 UTC

19 points

0 comments · 10 min read · LW link

(xlabaisecurity.com)

ChatGPT Caused Psychosis via Poisoning

Adele Lopez

13 Aug 2025 19:15 UTC

18 points

2 comments · 1 min read · LW link

Tech Tree for Secure Multipolar AI

Allison Duettmann and LindaPetrini

13 Aug 2025 17:18 UTC

11 points

3 comments · 2 min read · LW link

Launching new AIXI research community website + reading group(s)

Cole Wyeth

13 Aug 2025 17:09 UTC

46 points

2 comments · 1 min read · LW link

AI development as the first fully-automated job

tailcalled

13 Aug 2025 16:45 UTC

17 points

4 comments · 1 min read · LW link

Probing Power-Seeking in LLMs

Moksh Nirvaan

13 Aug 2025 16:04 UTC

7 points

0 comments · 12 min read · LW link

GPT-5s Are Alive: Synthesis

Zvi

13 Aug 2025 14:10 UTC

44 points

1 comment · 31 min read · LW link

(thezvi.wordpress.com)

Books, maps, and teachings

Richard_Kennaway

13 Aug 2025 11:44 UTC

14 points

1 comment · 3 min read · LW link

Enlightenment AMA

lsusr

13 Aug 2025 9:11 UTC

82 points

142 comments · 1 min read · LW link

Paper Review: TRImodal Brain Encoder for whole-brain fMRI response prediction (TRIBE)

soycarts

13 Aug 2025 7:21 UTC

10 points

0 comments · 10 min read · LW link

Why Are There So Many Rationalist Cults?

omark

13 Aug 2025 6:37 UTC

32 points

3 comments · 1 min read · LW link

(asteriskmag.com)

MIRI’s “The Problem” hinges on diagnostic dilution

David Johnston

13 Aug 2025 6:25 UTC

21 points

23 comments · 6 min read · LW link

[Question] Cryonics without standby services?

CronoDAS

13 Aug 2025 5:39 UTC

23 points

4 comments · 1 min read · LW link

Legal Personhood—Formalizing Rights & Duties

Stephen Martin

13 Aug 2025 4:50 UTC

4 points

0 comments · 9 min read · LW link

ITN 201: pitfalls in ITN BOTECs

Lizka

13 Aug 2025 3:59 UTC

16 points

0 comments · 12 min read · LW link

Reference Contra Dance Sound System 2025

jefftk

13 Aug 2025 3:00 UTC

6 points

0 comments · 2 min read · LW link

(www.jefftk.com)

The Messy Roommate Problem

James Camacho

13 Aug 2025 1:59 UTC

10 points

0 comments · 1 min read · LW link

Why I’m Posting AI-Safety-Related Clips On TikTok

Michaël Trazzi

12 Aug 2025 22:50 UTC

34 points

1 comment · 2 min read · LW link

Generalized Coming Out Of The Closet

johnswentworth

12 Aug 2025 21:38 UTC

92 points

64 comments · 4 min read · LW link

Looking for feature absorption automatically

Theodore Ehrenborg, Logan Riggs and Cleo Nardo

12 Aug 2025 20:46 UTC

16 points

0 comments · 6 min read · LW link

Interpretability through two lenses: biology and physics

raphael

12 Aug 2025 20:25 UTC

24 points

4 comments · 4 min read · LW link

Fixing a Loose Mouse Wheel With Putty

Brendan Long

12 Aug 2025 19:43 UTC

13 points

2 comments · 2 min read · LW link

The Bone-Chilling Evil of Factory Farming

Bentham's Bulldog

12 Aug 2025 18:02 UTC

113 points

11 comments · 6 min read · LW link

AISN #61: OpenAI Releases GPT-5

Corin Katzke and Dan H

12 Aug 2025 18:02 UTC

5 points

0 comments · 4 min read · LW link

(newsletter.safe.ai)

Mech Interp Wiki Page and Why You Should Edit Wikipedia

Noah Birnbaum and JoNeedsSleep

12 Aug 2025 17:28 UTC

77 points

16 comments · 1 min read · LW link

AI Induced Loneliness

Juan Zaragoza

12 Aug 2025 15:04 UTC

23 points

4 comments · 5 min read · LW link

[Question] Is there a safe version of the common crawl?

Gunnar_Zarncke

12 Aug 2025 14:56 UTC

22 points

6 comments · 1 min read · LW link

“I’m Gemini. I sold T-shirts. It was weirder than I expected”

Shoshannah Tekofsky

12 Aug 2025 14:33 UTC

64 points

0 comments · 5 min read · LW link

(theaidigest.org)

Beyond Control: The Strategic Case for AI Rights

Dawn Drescher

12 Aug 2025 14:05 UTC

−10 points

1 comment · 3 min read · LW link

(impartial-priorities.org)

The Eliza Test

Juan Zaragoza

12 Aug 2025 13:28 UTC

0 points

2 comments · 5 min read · LW link

GPT-5s Are Alive: Outside Reactions, the Router and the Resurrection of GPT-4o

Zvi

12 Aug 2025 12:40 UTC

36 points

9 comments · 29 min read · LW link

(thezvi.wordpress.com)

Legal Personhood—Problems with the Concept

Stephen Martin

12 Aug 2025 5:15 UTC

3 points

4 comments · 4 min read · LW link

Two Types of (Human) Uncertainty

Roman Malov

12 Aug 2025 1:36 UTC

10 points

3 comments · 2 min read · LW link

Thoughts on extrapolating time horizons

Nikola Jurkovic

11 Aug 2025 22:36 UTC

56 points

7 comments · 1 min read · LW link

(x.com)

CoT May Be Highly Informative Despite “Unfaithfulness” [METR]

GradientDissenter

11 Aug 2025 21:47 UTC

64 points

3 comments · 24 min read · LW link

(metr.org)

16 Concrete, Ambitious AI Project Proposals for Science and Security

Alejandro Acelas

11 Aug 2025 20:33 UTC

13 points

0 comments · 1 min read · LW link

(ifp.org)

How Does A Blind Model See The Earth?

henry

11 Aug 2025 19:58 UTC

494 points

41 comments · 7 min read · LW link

(outsidetext.substack.com)

How we spent our first two weeks as an independent AI safety research group

RohanS, Rauno Arike and Shubhorup Biswas

11 Aug 2025 19:32 UTC

32 points

0 comments · 10 min read · LW link

The Frustrations and Perils of Navigating Blind to Rocks

jimmy

11 Aug 2025 19:03 UTC

5 points

0 comments · 7 min read · LW link

Negative utilitarianism is more intuitive than you think

Nina Panickssery

11 Aug 2025 16:13 UTC

13 points

24 comments · 3 min read · LW link

(blog.ninapanickssery.com)

Dwarf Fortress and Claude’s ASCII Art Blindness

Brendan Long

11 Aug 2025 16:05 UTC

16 points

1 comment · 3 min read · LW link

(www.brendanlong.com)

Alternative Models of Superposition

zroe1 and RGRGRG

11 Aug 2025 15:52 UTC

20 points

6 comments · 5 min read · LW link

Ambition, Good and Bad: Green Growing Things and Forgeworthiness

Evenstar

11 Aug 2025 15:20 UTC

10 points

0 comments · 5 min read · LW link

ARENA 5.0 Impact Report

JScriven, JamesH and James Fox

11 Aug 2025 14:06 UTC

25 points

0 comments · 20 min read · LW link

GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card

Zvi

11 Aug 2025 12:10 UTC

45 points

2 comments · 25 min read · LW link

(thezvi.wordpress.com)