re: Yudkowsky on biological materials

bhauth · Dec 11, 2023, 1:28 PM
182 points
30 comments · 5 min read · LW link

A report about LessWrong karma volatility from a different universe

Ben Pace · Apr 1, 2023, 9:48 PM
181 points
7 comments · 1 min read · LW link

There should be more AI safety orgs

Marius Hobbhahn · Sep 21, 2023, 2:53 PM
181 points
25 comments · 17 min read · LW link

Neural networks generalize because of this one weird trick

Jesse Hoogland · Jan 18, 2023, 12:10 AM
181 points
34 comments · 15 min read · LW link · 1 review
(www.jessehoogland.com)

ChatGPT (and now GPT4) is very easily distracted from its rules

dmcs · Mar 15, 2023, 5:55 PM
180 points
42 comments · 1 min read · LW link

[Link] A community alert about Ziz

DanielFilan · Feb 24, 2023, 12:06 AM
180 points
166 comments · 2 min read · LW link · 4 reviews
(medium.com)

Talking publicly about AI risk

Jan_Kulveit · Apr 21, 2023, 11:28 AM
180 points
9 comments · 6 min read · LW link

I still think it’s very unlikely we’re observing alien aircraft

dynomight · Jun 15, 2023, 1:01 PM
180 points
70 comments · 5 min read · LW link
(dynomight.net)

When is Goodhart catastrophic?

May 9, 2023, 3:59 AM
180 points
29 comments · 8 min read · LW link · 1 review

Alexander and Yudkowsky on AGI goals

Jan 24, 2023, 9:09 PM
178 points
53 comments · 26 min read · LW link · 1 review

LLMs Sometimes Generate Purely Negatively-Reinforced Text

Fabien Roger · Jun 16, 2023, 4:31 PM
177 points
11 comments · 7 min read · LW link

The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

Dec 18, 2023, 8:35 PM
177 points
23 comments · 12 min read · LW link · 1 review

Critical review of Christiano’s disagreements with Yudkowsky

Vanessa Kosoy · Dec 27, 2023, 4:02 PM
176 points
40 comments · 15 min read · LW link

A rough and incomplete review of some of John Wentworth’s research

So8res · Mar 28, 2023, 6:52 PM
175 points
18 comments · 18 min read · LW link

[Linkpost] Introducing Superalignment

beren · Jul 5, 2023, 6:23 PM
175 points
69 comments · 1 min read · LW link
(openai.com)

Decision Theory with the Magic Parts Highlighted

moridinamael · May 16, 2023, 5:39 PM
175 points
24 comments · 5 min read · LW link

Defunding My Mistake

ymeskhout · Sep 4, 2023, 2:43 PM
175 points
41 comments · 6 min read · LW link

Thomas Kwa’s MIRI research experience

Oct 2, 2023, 4:42 PM
173 points
53 comments · 1 min read · LW link

Tuning your Cognitive Strategies

Apr 27, 2023, 8:32 PM
173 points
59 comments · 9 min read · LW link · 1 review
(bewelltuned.com)

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds · Mar 9, 2023, 4:55 PM
172 points
39 comments · 2 min read · LW link
(www.anthropic.com)

Why Are Bacteria So Simple?

aysja · Feb 6, 2023, 3:00 AM
172 points
33 comments · 10 min read · LW link

Parametrically retargetable decision-makers tend to seek power

TurnTrout · Feb 18, 2023, 6:41 PM
172 points
10 comments · 2 min read · LW link
(arxiv.org)

AI #1: Sydney and Bing

Zvi · Feb 21, 2023, 2:00 PM
171 points
45 comments · 61 min read · LW link · 1 review
(thezvi.wordpress.com)

What I mean by “alignment is in large part about making cognition aimable at all”

So8res · Jan 30, 2023, 3:22 PM
171 points
25 comments · 2 min read · LW link

President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence

Tristan Williams · Oct 30, 2023, 11:15 AM
171 points
39 comments · LW link
(www.whitehouse.gov)

How to (hopefully ethically) make money off of AGI

Nov 6, 2023, 11:35 PM
171 points
95 comments · 32 min read · LW link · 1 review

[April Fools’] Definitive confirmation of shard theory

TurnTrout · Apr 1, 2023, 7:27 AM
170 points
8 comments · 2 min read · LW link

Will the growing deer prion epidemic spread to humans? Why not?

eukaryote · Jun 25, 2023, 4:31 AM
170 points
33 comments · 13 min read · LW link
(eukaryotewritesblog.com)

Architects of Our Own Demise: We Should Stop Developing AI Carelessly

Roko · Oct 26, 2023, 12:36 AM
170 points
75 comments · 3 min read · LW link

Rationality !== Winning

Raemon · Jul 24, 2023, 2:53 AM
170 points
51 comments · 4 min read · LW link

Thoughts on the AI Safety Summit company policy requests and responses

So8res · Oct 31, 2023, 11:54 PM
169 points
14 comments · 10 min read · LW link

2023 Unofficial LessWrong Census/Survey

Screwtape · Dec 2, 2023, 4:41 AM
169 points
81 comments · 1 min read · LW link

A stylized dialogue on John Wentworth’s claims about markets and optimization

So8res · Mar 25, 2023, 10:32 PM
169 points
22 comments · 8 min read · LW link

Davidad’s Bold Plan for Alignment: An In-Depth Explanation

Apr 19, 2023, 4:09 PM
168 points
40 comments · 21 min read · LW link · 2 reviews

How useful is mechanistic interpretability?

Dec 1, 2023, 2:54 AM
167 points
54 comments · 25 min read · LW link

Why it’s so hard to talk about Consciousness

Rafael Harth · Jul 2, 2023, 3:56 PM
167 points
215 comments · 9 min read · LW link · 3 reviews

The Brain is Not Close to Thermodynamic Limits on Computation

DaemonicSigil · Apr 24, 2023, 8:21 AM
167 points
58 comments · 5 min read · LW link

You can just spontaneously call people you haven’t met in years

lc · Nov 13, 2023, 5:21 AM
167 points
21 comments · 1 min read · LW link

My understanding of Anthropic strategy

Swimmer963 (Miranda Dixon-Luinenburg) · Feb 15, 2023, 1:56 AM
166 points
31 comments · 4 min read · LW link

When can we trust model evaluations?

evhub · Jul 28, 2023, 7:42 PM
166 points
10 comments · 10 min read · LW link · 1 review

What Discovering Latent Knowledge Did and Did Not Find

Fabien Roger · Mar 13, 2023, 7:29 PM
166 points
17 comments · 11 min read · LW link

A list of core AI safety problems and how I hope to solve them

davidad · Aug 26, 2023, 3:12 PM
165 points
29 comments · 5 min read · LW link

$20 Million in NSF Grants for Safety Research

Dan H · Feb 28, 2023, 4:44 AM
165 points
12 comments · 1 min read · LW link

Loudly Give Up, Don’t Quietly Fade

Screwtape · Nov 13, 2023, 11:30 PM
165 points
12 comments · 6 min read · LW link · 1 review

Gradient hacking is extremely difficult

beren · Jan 24, 2023, 3:45 PM
164 points
22 comments · 5 min read · LW link

Towards understanding-based safety evaluations

evhub · Mar 15, 2023, 6:18 PM
164 points
16 comments · 5 min read · LW link

Prizes for matrix completion problems

paulfchristiano · May 3, 2023, 11:30 PM
164 points
52 comments · 1 min read · LW link
(www.alignment.org)

RSPs are pauses done right

evhub · Oct 14, 2023, 4:06 AM
164 points
73 comments · 7 min read · LW link · 1 review

Holly Elmore and Rob Miles dialogue on AI Safety Advocacy

Oct 20, 2023, 9:04 PM
162 points
30 comments · 27 min read · LW link

The Dial of Progress

Zvi · Jun 13, 2023, 1:40 PM
161 points
119 comments · 11 min read · LW link
(thezvi.wordpress.com)