How it feels to have your mind hacked by an AI

blaked · 12 Jan 2023 0:33 UTC
355 points
219 comments · 17 min read · LW link

On not getting contaminated by the wrong obesity ideas

Natália · 28 Jan 2023 20:18 UTC
308 points
67 comments · 30 min read · LW link

We don’t trade with ants

KatjaGrace · 10 Jan 2023 23:50 UTC
264 points
108 comments · 7 min read · LW link
(worldspiritsockpuppet.com)

Basics of Rationalist Discourse

[DEACTIVATED] Duncan Sabien · 27 Jan 2023 2:40 UTC
260 points
180 comments · 36 min read · LW link

My Model Of EA Burnout

LoganStrohl · 25 Jan 2023 17:52 UTC
237 points
49 comments · 5 min read · LW link

Thoughts on the impact of RLHF research

paulfchristiano · 25 Jan 2023 17:23 UTC
236 points
101 comments · 9 min read · LW link

Recursive Middle Manager Hell

Raemon · 1 Jan 2023 4:33 UTC
218 points
45 comments · 11 min read · LW link

What a compute-centric framework says about AI takeoff speeds

Tom Davidson · 23 Jan 2023 4:02 UTC
179 points
29 comments · 16 min read · LW link

Alexander and Yudkowsky on AGI goals

24 Jan 2023 21:09 UTC
174 points
52 comments · 26 min read · LW link

What I mean by “alignment is in large part about making cognition aimable at all”

So8res · 30 Jan 2023 15:22 UTC
167 points
25 comments · 2 min read · LW link

Neural networks generalize because of this one weird trick

Jesse Hoogland · 18 Jan 2023 0:10 UTC
166 points
28 comments · 53 min read · LW link
(www.jessehoogland.com)

Gradient hacking is extremely difficult

beren · 24 Jan 2023 15:45 UTC
161 points
22 comments · 5 min read · LW link

Sapir-Whorf for Rationalists

[DEACTIVATED] Duncan Sabien · 25 Jan 2023 7:58 UTC
147 points
48 comments · 19 min read · LW link

“Heretical Thoughts on AI” by Eli Dourado

DragonGod · 19 Jan 2023 16:11 UTC
145 points
38 comments · 3 min read · LW link
(www.elidourado.com)

Why didn’t we get the four-hour workday?

jasoncrawford · 6 Jan 2023 21:29 UTC
137 points
34 comments · 6 min read · LW link
(rootsofprogress.org)

How to slow down scientific progress, according to Leo Szilard

jasoncrawford · 5 Jan 2023 18:26 UTC
134 points
18 comments · 2 min read · LW link
(rootsofprogress.org)

Basic Facts about Language Model Internals

4 Jan 2023 13:01 UTC
130 points
18 comments · 9 min read · LW link

Wolf Incident Postmortem

jefftk · 9 Jan 2023 3:20 UTC
129 points
13 comments · 1 min read · LW link
(www.jefftk.com)

Why I’m joining Anthropic

evhub · 5 Jan 2023 1:12 UTC
121 points
4 comments · 1 min read · LW link

Compendium of problems with RLHF

Charbel-Raphaël · 29 Jan 2023 11:40 UTC
120 points
16 comments · 10 min read · LW link

How to Bounded Distrust

Zvi · 9 Jan 2023 13:10 UTC
119 points
15 comments · 4 min read · LW link
(thezvi.wordpress.com)

Soft optimization makes the value target bigger

Jeremy Gillen · 2 Jan 2023 16:06 UTC
117 points
20 comments · 12 min read · LW link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

10 Jan 2023 16:06 UTC
117 points
44 comments · 26 min read · LW link

Transcript of Sam Altman’s interview touching on AI safety

Andy_McKenzie · 20 Jan 2023 16:14 UTC
115 points
41 comments · 10 min read · LW link

The Fountain of Health: a First Principles Guide to Rejuvenation

PhilJackson · 7 Jan 2023 18:34 UTC
114 points
38 comments · 41 min read · LW link

Touch reality as soon as possible (when doing machine learning research)

LawrenceC · 3 Jan 2023 19:11 UTC
107 points
7 comments · 8 min read · LW link

Running by Default

jefftk · 5 Jan 2023 13:50 UTC
104 points
39 comments · 1 min read · LW link
(www.jefftk.com)

Large language models learn to represent the world

gjm · 22 Jan 2023 13:10 UTC
102 points
19 comments · 3 min read · LW link

Vegan Nutrition Testing Project: Interim Report

Elizabeth · 20 Jan 2023 5:50 UTC
102 points
37 comments · 8 min read · LW link
(acesounderglass.com)

2022 was the year AGI arrived (Just don’t call it that)

Logan Zoellner · 4 Jan 2023 15:19 UTC
101 points
59 comments · 3 min read · LW link

Concrete Reasons for Hope about AI

Zac Hatfield-Dodds · 14 Jan 2023 1:22 UTC
101 points
13 comments · 1 min read · LW link

Parameter Scaling Comes for RL, Maybe

1a3orn · 24 Jan 2023 13:55 UTC
98 points
3 comments · 14 min read · LW link

2022 Unofficial LessWrong General Census

Screwtape · 30 Jan 2023 18:36 UTC
97 points
33 comments · 2 min read · LW link

Induction heads—illustrated

CallumMcDougall · 2 Jan 2023 15:35 UTC
93 points
8 comments · 3 min read · LW link

Iron deficiencies are very bad and you should treat them

Elizabeth · 12 Jan 2023 9:10 UTC
93 points
29 comments · 11 min read · LW link
(acesounderglass.com)

Categorizing failures as “outer” or “inner” misalignment is often confused

Rohin Shah · 6 Jan 2023 15:48 UTC
86 points
21 comments · 8 min read · LW link

Disentangling Shard Theory into Atomic Claims

Leon Lang · 13 Jan 2023 4:23 UTC
85 points
6 comments · 18 min read · LW link

“Endgame safety” for AGI

Steven Byrnes · 24 Jan 2023 14:15 UTC
84 points
10 comments · 6 min read · LW link

Review AI Alignment posts to help figure out how to make a proper AI Alignment review

10 Jan 2023 0:19 UTC
84 points
31 comments · 2 min read · LW link

The Alignment Problem from a Deep Learning Perspective (major rewrite)

10 Jan 2023 16:06 UTC
83 points
8 comments · 39 min read · LW link
(arxiv.org)

Book Review: Worlds of Flow

remember · 16 Jan 2023 20:17 UTC
83 points
3 comments · 9 min read · LW link

Childhood Roundup #1

Zvi · 6 Jan 2023 13:00 UTC
83 points
27 comments · 8 min read · LW link
(thezvi.wordpress.com)

Confusing the ideal for the necessary

adamShimi · 16 Jan 2023 17:29 UTC
79 points
6 comments · 1 min read · LW link
(epistemologicalvigilance.substack.com)

On AI and Interest Rates

Zvi · 17 Jan 2023 15:00 UTC
79 points
13 comments · 8 min read · LW link
(thezvi.wordpress.com)

Against Boltzmann mesaoptimizers

porby · 30 Jan 2023 2:55 UTC
76 points
6 comments · 4 min read · LW link

Spreading messages to help with the most important century

HoldenKarnofsky · 25 Jan 2023 18:20 UTC
75 points
4 comments · 18 min read · LW link
(www.cold-takes.com)

Some Thoughts on AI Art

abramdemski · 25 Jan 2023 14:18 UTC
74 points
20 comments · 7 min read · LW link

Wentworth and Larsen on buying time

9 Jan 2023 21:31 UTC
73 points
6 comments · 12 min read · LW link

Pessimistic Shard Theory

Garrett Baker · 25 Jan 2023 0:59 UTC
72 points
13 comments · 3 min read · LW link

Simulacra Levels Summary

Zvi · 30 Jan 2023 13:40 UTC
71 points
12 comments · 7 min read · LW link
(thezvi.wordpress.com)