«Boundaries», Part 1: a key missing concept from utility theory

Andrew_Critch · Jul 26, 2022, 11:03 PM
158 points
33 comments · 7 min read · LW link

Why all the fuss about recursive self-improvement?

So8res · Jun 12, 2022, 8:53 PM
158 points
62 comments · 7 min read · LW link · 1 review

Limits to Legibility

Jan_Kulveit · Jun 29, 2022, 5:42 PM
157 points
11 comments · 5 min read · LW link · 1 review

Your posts should be on arXiv

JanB · Aug 25, 2022, 10:35 AM
156 points
44 comments · 3 min read · LW link

Nonprofit Boards are Weird

HoldenKarnofsky · Jun 23, 2022, 2:40 PM
156 points
26 comments · 20 min read · LW link · 1 review
(www.cold-takes.com)

What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth · Aug 15, 2022, 10:48 PM
156 points
18 comments · 10 min read · LW link

Nate Soares’ Life Advice

CatGoddess · Aug 23, 2022, 2:46 AM
155 points
41 comments · 3 min read · LW link

LessWrong Has Agree/Disagree Voting On All New Comment Threads

Ben Pace · Jun 24, 2022, 12:43 AM
154 points
217 comments · 2 min read · LW link · 1 review

Emotionally Confronting a Probably-Doomed World: Against Motivation Via Dignity Points

TurnTrout · Apr 10, 2022, 6:45 PM
154 points
7 comments · 9 min read · LW link

Staying Split: Sabatini and Social Justice

Duncan Sabien (Deactivated) · Jun 8, 2022, 8:32 AM
153 points
28 comments · 21 min read · LW link

Learning By Writing

HoldenKarnofsky · Feb 22, 2022, 3:50 PM
151 points
25 comments · 10 min read · LW link · 3 reviews
(www.cold-takes.com)

[Interim research report] Taking features out of superposition with sparse autoencoders

Dec 13, 2022, 3:41 PM
150 points
23 comments · 22 min read · LW link · 2 reviews

Prizes for ELK proposals

paulfchristiano · Jan 3, 2022, 8:23 PM
150 points
152 comments · 7 min read · LW link

DeepMind is hiring for the Scalable Alignment and Alignment Teams

May 13, 2022, 12:17 PM
150 points
34 comments · 9 min read · LW link

Alignment research exercises

Richard_Ngo · Feb 21, 2022, 8:24 PM
150 points
17 comments · 8 min read · LW link

Shard Theory in Nine Theses: a Distillation and Critical Appraisal

LawrenceC · Dec 19, 2022, 10:52 PM
150 points
30 comments · 18 min read · LW link

The metaphor you want is “color blindness,” not “blind spot.”

Duncan Sabien (Deactivated) · Feb 14, 2022, 12:28 AM
150 points
17 comments · 3 min read · LW link · 2 reviews

Steam

abramdemski · Jun 20, 2022, 5:38 PM
149 points
13 comments · 5 min read · LW link · 1 review

Inner and outer alignment decompose one hard problem into two extremely hard problems

TurnTrout · Dec 2, 2022, 2:43 AM
149 points
22 comments · 47 min read · LW link · 3 reviews

Public-facing Censorship Is Safety Theater, Causing Reputational Damage

Yitz · Sep 23, 2022, 5:08 AM
149 points
42 comments · 6 min read · LW link

Use Normal Predictions

Jan Christian Refsgaard · Jan 9, 2022, 3:01 PM
148 points
67 comments · 6 min read · LW link

A Year of AI Increasing AI Progress

TW123 · Dec 30, 2022, 2:09 AM
148 points
3 comments · 2 min read · LW link

AI coordination needs clear wins

evhub · Sep 1, 2022, 11:41 PM
147 points
16 comments · 2 min read · LW link · 1 review

Reshaping the AI Industry

Thane Ruthenis · May 29, 2022, 10:54 PM
147 points
35 comments · 21 min read · LW link

K-complexity is silly; use cross-entropy instead

So8res · Dec 20, 2022, 11:06 PM
147 points
54 comments · 14 min read · LW link · 2 reviews

[Question] why assume AGIs will optimize for fixed goals?

nostalgebraist · Jun 10, 2022, 1:28 AM
147 points
60 comments · 4 min read · LW link · 2 reviews

We’re already in AI takeoff

Valentine · Mar 8, 2022, 11:09 PM
146 points
119 comments · 7 min read · LW link

Updating my AI timelines

Matthew Barnett · Dec 5, 2022, 8:46 PM
145 points
50 comments · 2 min read · LW link

Supervise Process, not Outcomes

Apr 5, 2022, 10:18 PM
145 points
9 comments · 10 min read · LW link

Interpretability/Tool-ness/Alignment/Corrigibility are not Composable

johnswentworth · Aug 8, 2022, 6:05 PM
144 points
13 comments · 3 min read · LW link

Public beliefs vs. Private beliefs

Eli Tyre · Jun 1, 2022, 9:33 PM
144 points
30 comments · 5 min read · LW link

Interpreting Neural Networks through the Polytope Lens

Sep 23, 2022, 5:58 PM
144 points
29 comments · 33 min read · LW link

Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi · Apr 15, 2022, 8:57 AM
144 points
13 comments · 4 min read · LW link

Takeaways from our robust injury classifier project [Redwood Research]

dmz · Sep 17, 2022, 3:55 AM
143 points
12 comments · 6 min read · LW link · 1 review

Twitter thread on postrationalists

Eli Tyre · Feb 17, 2022, 9:02 AM
143 points
32 comments · 5 min read · LW link

High-stakes alignment via adversarial training [Redwood Research report]

May 5, 2022, 12:59 AM
142 points
29 comments · 9 min read · LW link

Age changes what you care about

Dentin · Oct 16, 2022, 3:36 PM
141 points
37 comments · 2 min read · LW link

[Question] How to Convince my Son that Drugs are Bad

concerned_dad · Dec 17, 2022, 6:47 PM
140 points
84 comments · 2 min read · LW link

The Parable of the Boy Who Cried 5% Chance of Wolf

KatWoods · Aug 15, 2022, 2:33 PM
140 points
24 comments · 2 min read · LW link

How might we align transformative AI if it’s developed very soon?

HoldenKarnofsky · Aug 29, 2022, 3:42 PM
140 points
55 comments · 45 min read · LW link · 1 review

Understanding Infra-Bayesianism: A Beginner-Friendly Video Series

Sep 22, 2022, 1:25 PM
140 points
6 comments · 2 min read · LW link

More Is Different for AI

jsteinhardt · Jan 4, 2022, 7:30 PM
140 points
24 comments · 3 min read · LW link · 1 review
(bounded-regret.ghost.io)

Resolve Cycles

CFAR!Duncan · Jul 16, 2022, 11:17 PM
140 points
8 comments · 10 min read · LW link

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

Andrew_Critch · Apr 19, 2022, 8:25 PM
139 points
55 comments · 7 min read · LW link · 1 review

Takeoff speeds have a huge effect on what it means to work on AI x-risk

Buck · Apr 13, 2022, 5:38 PM
139 points
27 comments · 2 min read · LW link · 2 reviews

A descriptive, not prescriptive, overview of current AI Alignment Research

Jun 6, 2022, 9:59 PM
139 points
21 comments · 7 min read · LW link

ELK prize results

Mar 9, 2022, 12:01 AM
138 points
50 comments · 21 min read · LW link

Mechanistic anomaly detection and ELK

paulfchristiano · Nov 25, 2022, 6:50 PM
138 points
22 comments · 21 min read · LW link
(ai-alignment.com)

AI Timelines via Cumulative Optimization Power: Less Long, More Short

jacob_cannell · Oct 6, 2022, 12:21 AM
138 points
33 comments · 6 min read · LW link

Contra EY: Can AGI destroy us without trial & error?

nsokolsky · Jun 13, 2022, 6:26 PM
137 points
72 comments · 15 min read · LW link