The Lizardman and the Black Hat Bobcat

Screwtape

Apr 6, 2025, 7:02 PM

107 points

15 comments, 9 min read

How training-gamers might function (and win)

Vivek Hebbar

Apr 11, 2025, 9:26 PM

107 points

5 comments, 13 min read

Attribution-based parameter decomposition

Lucius Bushnaq, Dan Braun, StefanHex, jake_mendel and Lee Sharkey

Jan 25, 2025, 1:12 PM

107 points

21 comments, 4 min read

(publications.apolloresearch.ai)

We’re Not Advertising Enough (Post 3 of 6 on AI Governance)

Mass_Driver

May 22, 2025, 5:05 PM

107 points

10 comments, 28 min read

My supervillain origin story

Dmitry Vaintrob

Jan 27, 2025, 12:20 PM

106 points

2 comments, 5 min read

How do you deal w/ Super Stimuli?

Logan Riggs

Jan 14, 2025, 3:14 PM

106 points

25 comments, 3 min read

AI 2027: Responses

Zvi

Apr 8, 2025, 12:50 PM

106 points

3 comments, 30 min read

(thezvi.wordpress.com)

Prioritizing Work

jefftk

May 1, 2025, 2:00 AM

106 points

11 comments, 1 min read

(www.jefftk.com)

AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions

peterbarnett and Aaron_Scher

May 1, 2025, 10:46 PM

105 points

7 comments, 8 min read

(techgov.intelligence.org)

Steering Gemini with BiDPO

TurnTrout

Jan 31, 2025, 2:37 AM

104 points

5 comments, 1 min read

(turntrout.com)

My model of what is going on with LLMs

Cole Wyeth

Feb 13, 2025, 3:43 AM

104 points

49 comments, 7 min read

Show, not tell: GPT-4o is more opinionated in images than in text

Daniel Tan and eggsyntax

Apr 2, 2025, 8:51 AM

103 points

41 comments, 3 min read

A short course on AGI safety from the GDM Alignment team

Vika and Rohin Shah

Feb 14, 2025, 3:43 PM

103 points

2 comments, 1 min read

(deepmindsafetyresearch.medium.com)

Comment on “Death and the Gorgon”

Zack_M_Davis

Jan 1, 2025, 5:47 AM

103 points

33 comments, 8 min read

Judgements: Merging Prediction & Evidence

abramdemski

Feb 23, 2025, 7:35 PM

103 points

5 comments, 6 min read

AGI Safety & Alignment @ Google DeepMind is hiring

Rohin Shah

Feb 17, 2025, 9:11 PM

102 points

19 comments, 10 min read

How I talk to those above me

Maxwell Peterson

Mar 30, 2025, 6:54 AM

102 points

16 comments, 8 min read

Detecting Strategic Deception Using Linear Probes

Nicholas Goldowsky-Dill, bilalchughtai, StefanHex and Marius Hobbhahn

Feb 6, 2025, 3:46 PM

102 points

9 comments, 2 min read

(arxiv.org)

RA x ControlAI video: What if AI just keeps getting smarter?

Writer

May 2, 2025, 2:19 PM

100 points

17 comments, 9 min read

Reasons for and against working on technical AI safety at a frontier AI lab

bilalchughtai

Jan 5, 2025, 2:49 PM

100 points

12 comments, 12 min read

C’mon guys, Deliberate Practice is Real

Raemon

Feb 5, 2025, 10:33 PM

99 points

25 comments, 9 min read

Generating the Funniest Joke with RL (according to GPT-4.1)

agg

May 16, 2025, 5:09 AM

99 points

22 comments, 4 min read

Association taxes are collusion subsidies

KatjaGrace

May 27, 2025, 6:50 AM

99 points

7 comments, 1 min read

(worldspiritsockpuppet.com)

Timaeus in 2024

Jesse Hoogland, Stan van Wingerden, Alexander Gietelink Oldenziel and Daniel Murfet

Feb 20, 2025, 11:54 PM

99 points

1 comment, 8 min read

Third-wave AI safety needs sociopolitical thinking

Richard_Ngo

Mar 27, 2025, 12:55 AM

99 points

23 comments, 26 min read

The Ukraine War and the Kill Market

Martin Sustrik

May 4, 2025, 7:50 AM

98 points

13 comments, 5 min read

(250bpm.substack.com)

The purposeful drunkard

Dmitry Vaintrob

Jan 12, 2025, 12:27 PM

98 points

13 comments, 6 min read

AI Control May Increase Existential Risk

Jan_Kulveit

Mar 11, 2025, 2:30 PM

98 points

13 comments, 1 min read

What the Headlines Miss About the Latest Decision in the Musk vs. OpenAI Lawsuit

garrison

Mar 6, 2025, 7:49 PM

98 points

0 comments

(garrisonlovely.substack.com)

Vacuum Decay: Expert Survey Results

JessRiedel

Mar 13, 2025, 6:31 PM

96 points

26 comments

Reviewing LessWrong: Screwtape’s Basic Answer

Screwtape

Feb 5, 2025, 4:30 AM

96 points

18 comments, 6 min read

Towards a scale-free theory of intelligent agency

Richard_Ngo

Mar 21, 2025, 1:39 AM

96 points

44 comments, 13 min read

(www.mindthefuture.info)

How to Build a Third Place on Focusmate

Parker Conley

Apr 28, 2025, 11:46 PM

96 points

10 comments, 5 min read

(parconley.com)

The Sweet Lesson: AI Safety Should Scale With Compute

Jesse Hoogland

May 5, 2025, 7:03 PM

95 points

3 comments, 3 min read

The subset parity learning problem: much more than you wanted to know

Dmitry Vaintrob

Jan 3, 2025, 9:13 AM

94 points

18 comments, 11 min read

Tips and Code for Empirical Research Workflows

John Hughes and Ethan Perez

Jan 20, 2025, 10:31 PM

94 points

14 comments, 20 min read

On Eating the Sun

jessicata

Jan 8, 2025, 4:57 AM

94 points

96 comments, 3 min read

(unstablerontology.substack.com)

We probably won’t just play status games with each other after AGI

Matthew Barnett

Jan 15, 2025, 4:56 AM

93 points

21 comments, 4 min read

Implications of the inference scaling paradigm for AI safety

Ryan Kidd

Jan 14, 2025, 2:14 AM

93 points

70 comments, 5 min read

Five Recent AI Tutoring Studies

Arjun Panickssery

Jan 19, 2025, 3:53 AM

93 points

0 comments, 2 min read

(arjunpanickssery.substack.com)

Elite Coordination via the Consensus of Power

Richard_Ngo

Mar 19, 2025, 6:56 AM

92 points

15 comments, 12 min read

(www.mindthefuture.info)

The Rising Sea

Jesse Hoogland

Jan 25, 2025, 8:48 PM

92 points

2 comments, 2 min read

a confusion about preference orderings

nostalgebraist

May 11, 2025, 7:30 PM

92 points

38 comments, 11 min read

Introducing Squiggle AI

ozziegooen

Jan 3, 2025, 5:53 PM

92 points

15 comments

ASI existential risk: Reconsidering Alignment as a Goal

habryka

Apr 15, 2025, 7:57 PM

91 points

14 comments, 19 min read

(michaelnotebook.com)

How I force LLMs to generate correct code

claudio

Mar 21, 2025, 2:40 PM

91 points

7 comments, 5 min read

Thoughts on the conservative assumptions in AI control

Buck

Jan 17, 2025, 7:23 PM

91 points

5 comments, 13 min read

Slow corporations as an intuition pump for AI R&D automation

ryan_greenblatt and elifland

May 9, 2025, 2:49 PM

91 points

23 comments, 9 min read

Six Thoughts on AI Safety

boazbarak

Jan 24, 2025, 10:20 PM

91 points

55 comments, 15 min read

Tips On Empirical Research Slides

James Chua, John Hughes, Ethan Perez and Owain_Evans

Jan 8, 2025, 5:06 AM

90 points

4 comments, 6 min read