
How it feels to have your mind hacked by an AI

blaked 12 Jan 2023 0:33 UTC

377 points

222 comments 17 min read LW link

On not getting contaminated by the wrong obesity ideas

Natália 28 Jan 2023 20:18 UTC

310 points

69 comments 30 min read LW link

Basics of Rationalist Discourse

Duncan Sabien (Inactive) 27 Jan 2023 2:40 UTC

288 points

193 comments 31 min read LW link 4 reviews

We don’t trade with ants

KatjaGrace 10 Jan 2023 23:50 UTC

283 points

110 comments 7 min read LW link 1 review

(worldspiritsockpuppet.com)

My Model Of EA Burnout

LoganStrohl 25 Jan 2023 17:52 UTC

282 points

50 comments 5 min read LW link 1 review

Thoughts on the impact of RLHF research

paulfchristiano 25 Jan 2023 17:23 UTC

256 points

102 comments 9 min read LW link

Recursive Middle Manager Hell

Raemon 1 Jan 2023 4:33 UTC

226 points

46 comments 11 min read LW link 1 review

Neural networks generalize because of this one weird trick

Jesse Hoogland 18 Jan 2023 0:10 UTC

215 points

35 comments 15 min read LW link 1 review

(www.jessehoogland.com)

What a compute-centric framework says about AI takeoff speeds

Tom Davidson 23 Jan 2023 4:02 UTC

188 points

30 comments 16 min read LW link 1 review

Alexander and Yudkowsky on AGI goals

Scott Alexander and Eliezer Yudkowsky

24 Jan 2023 21:09 UTC

179 points

53 comments 26 min read LW link 1 review

What I mean by “alignment is in large part about making cognition aimable at all”

So8res 30 Jan 2023 15:22 UTC

176 points

25 comments 2 min read LW link

Gradient hacking is extremely difficult

beren 24 Jan 2023 15:45 UTC

175 points

23 comments 5 min read LW link

Sapir-Whorf for Rationalists

Duncan Sabien (Inactive) 25 Jan 2023 7:58 UTC

165 points

49 comments 20 min read LW link

Why didn’t we get the four-hour workday?

jasoncrawford 6 Jan 2023 21:29 UTC

147 points

34 comments 6 min read LW link

(rootsofprogress.org)

“Heretical Thoughts on AI” by Eli Dourado

DragonGod 19 Jan 2023 16:11 UTC

146 points

38 comments 3 min read LW link

(www.elidourado.com)

How to slow down scientific progress, according to Leo Szilard

jasoncrawford 5 Jan 2023 18:26 UTC

141 points

18 comments 2 min read LW link

(rootsofprogress.org)

Induction heads—illustrated

CallumMcDougall 2 Jan 2023 15:35 UTC

138 points

14 comments 3 min read LW link

Wolf Incident Postmortem

jefftk 9 Jan 2023 3:20 UTC

138 points

13 comments 1 min read LW link

(www.jefftk.com)

Basic Facts about Language Model Internals

beren and Eric Winsor

4 Jan 2023 13:01 UTC

134 points

19 comments 9 min read LW link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

basil.halperin, J. Zachary Mazlish and tmychow

10 Jan 2023 16:06 UTC

127 points

45 comments 26 min read LW link

Compendium of problems with RLHF

Charbel-Raphaël 29 Jan 2023 11:40 UTC

123 points

16 comments 10 min read LW link

Soft optimization makes the value target bigger

Jeremy Gillen 2 Jan 2023 16:06 UTC

123 points

20 comments 12 min read LW link

How to Bounded Distrust

Zvi 9 Jan 2023 13:10 UTC

122 points

17 comments 4 min read LW link 1 review

(thezvi.wordpress.com)

Transcript of Sam Altman’s interview touching on AI safety

Andy_McKenzie 20 Jan 2023 16:14 UTC

121 points

42 comments 10 min read LW link

Touch reality as soon as possible (when doing machine learning research)

LawrenceC 3 Jan 2023 19:11 UTC

121 points

9 comments 8 min read LW link 1 review

Why I’m joining Anthropic

evhub 5 Jan 2023 1:12 UTC

120 points

4 comments 2 min read LW link

Running by Default

jefftk 5 Jan 2023 13:50 UTC

117 points

40 comments 1 min read LW link

(www.jefftk.com)

The Fountain of Health: a First Principles Guide to Rejuvenation

PhilJackson 7 Jan 2023 18:34 UTC

115 points

39 comments 41 min read LW link

Iron deficiencies are very bad and you should treat them

Elizabeth 12 Jan 2023 9:10 UTC

109 points

34 comments 11 min read LW link 1 review

(acesounderglass.com)

Vegan Nutrition Testing Project: Interim Report

Elizabeth 20 Jan 2023 5:50 UTC

105 points

37 comments 8 min read LW link

(acesounderglass.com)

Large language models learn to represent the world

gjm 22 Jan 2023 13:10 UTC

102 points

20 comments 3 min read LW link 1 review

2022 was the year AGI arrived (Just don’t call it that)

Logan Zoellner 4 Jan 2023 15:19 UTC

101 points

60 comments 3 min read LW link

Parameter Scaling Comes for RL, Maybe

1a3orn 24 Jan 2023 13:55 UTC

100 points

3 comments 14 min read LW link

2022 Unofficial LessWrong General Census

Screwtape 30 Jan 2023 18:36 UTC

98 points

33 comments 2 min read LW link

Concrete Reasons for Hope about AI

Zac Hatfield-Dodds 14 Jan 2023 1:22 UTC

94 points

13 comments 1 min read LW link

Categorizing failures as “outer” or “inner” misalignment is often confused

Rohin Shah 6 Jan 2023 15:48 UTC

93 points

21 comments 8 min read LW link

Disentangling Shard Theory into Atomic Claims

Leon Lang 13 Jan 2023 4:23 UTC

86 points

6 comments 18 min read LW link

Book Review: Worlds of Flow

remember 16 Jan 2023 20:17 UTC

85 points

3 comments 9 min read LW link

Review AI Alignment posts to help figure out how to make a proper AI Alignment review

habryka and Raemon

10 Jan 2023 0:19 UTC

85 points

31 comments 2 min read LW link

“Endgame safety” for AGI

Steven Byrnes 24 Jan 2023 14:15 UTC

85 points

10 comments 6 min read LW link

Childhood Roundup #1

Zvi 6 Jan 2023 13:00 UTC

84 points

27 comments 8 min read LW link

(thezvi.wordpress.com)

The Alignment Problem from a Deep Learning Perspective (major rewrite)

SoerenMind, Richard_Ngo and LawrenceC

10 Jan 2023 16:06 UTC

84 points

9 comments 39 min read LW link

(arxiv.org)

Thoughts on hardware / compute requirements for AGI

Steven Byrnes 24 Jan 2023 14:03 UTC

83 points

33 comments 25 min read LW link

Simulacra Levels Summary

Zvi 30 Jan 2023 13:40 UTC

81 points

14 comments 7 min read LW link

(thezvi.wordpress.com)

On AI and Interest Rates

Zvi 17 Jan 2023 15:00 UTC

80 points

13 comments 8 min read LW link

(thezvi.wordpress.com)

Confusing the ideal for the necessary

adamShimi 16 Jan 2023 17:29 UTC

80 points

6 comments 1 min read LW link

(epistemologicalvigilance.substack.com)

Compounding Resource X

Raemon 11 Jan 2023 3:14 UTC

77 points

6 comments 9 min read LW link

Against Boltzmann mesaoptimizers

porby 30 Jan 2023 2:55 UTC

77 points

6 comments 4 min read LW link

Some Thoughts on AI Art

abramdemski 25 Jan 2023 14:18 UTC

75 points

20 comments 7 min read LW link

Infohazards vs Fork Hazards

jimrandomh 5 Jan 2023 9:45 UTC

75 points

16 comments 1 min read LW link