LessWrong’s (first) album: I Have Been A Good Bing

1 Apr 2024 7:33 UTC
517 points
156 comments 11 min read LW link

There is way too much serendipity

Malmesbury 19 Jan 2024 19:37 UTC
349 points
56 comments 7 min read LW link

[April Fools’ Day] Introducing Open Asteroid Impact

Linch 1 Apr 2024 8:14 UTC
321 points
29 comments 1 min read LW link
(openasteroidimpact.org)

Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai 16 Apr 2024 21:16 UTC
303 points
63 comments 12 min read LW link

The Best Tacit Knowledge Videos on Every Subject

Parker Conley 31 Mar 2024 17:14 UTC
302 points
123 comments 14 min read LW link

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

12 Jan 2024 19:51 UTC
291 points
94 comments 3 min read LW link
(arxiv.org)

Gentleness and the artificial Other

Joe Carlsmith 2 Jan 2024 18:21 UTC
265 points
33 comments 11 min read LW link

Express interest in an “FHI of the West”

habryka 18 Apr 2024 3:32 UTC
262 points
39 comments 3 min read LW link

Scale Was All We Needed, At First

Gabriel Mukobi 14 Feb 2024 1:49 UTC
262 points
31 comments 8 min read LW link
(aiacumen.substack.com)

Thoughts on seed oil

dynomight 20 Apr 2024 12:29 UTC
257 points
79 comments 17 min read LW link
(dynomight.net)

On green

Joe Carlsmith 21 Mar 2024 17:38 UTC
257 points
33 comments 31 min read LW link

Paul Christiano named as US AI Safety Institute Head of AI Safety

Joel Burget 16 Apr 2024 16:22 UTC
250 points
55 comments 1 min read LW link
(www.commerce.gov)

My PhD thesis: Algorithmic Bayesian Epistemology

Eric Neyman 16 Mar 2024 22:56 UTC
248 points
14 comments 7 min read LW link
(arxiv.org)

The case for ensuring that powerful AIs are controlled

24 Jan 2024 16:11 UTC
243 points
66 comments 28 min read LW link

Failures in Kindness

silentbob 26 Mar 2024 21:30 UTC
242 points
26 comments 9 min read LW link

“No-one in my org puts money in their pension”

Tobes 16 Feb 2024 18:33 UTC
241 points
7 comments 9 min read LW link
(seekingtobejolly.substack.com)

My Clients, The Liars

ymeskhout 5 Mar 2024 21:06 UTC
229 points
85 comments 7 min read LW link

Brute Force Manufactured Consensus is Hiding the Crime of the Century

Roko 3 Feb 2024 20:36 UTC
220 points
156 comments 9 min read LW link

MIRI 2024 Mission and Strategy Update

Malo 5 Jan 2024 0:20 UTC
216 points
44 comments 8 min read LW link

CFAR Takeaways: Andrew Critch

Raemon 14 Feb 2024 1:37 UTC
213 points
62 comments 5 min read LW link

ChatGPT can learn indirect control

Raymond D 21 Mar 2024 21:11 UTC
212 points
23 comments 1 min read LW link

Believing In

AnnaSalamon 8 Feb 2024 7:06 UTC
208 points
49 comments 13 min read LW link

“How could I have thought that faster?”

mesaoptimizer 11 Mar 2024 10:56 UTC
200 points
30 comments 2 min read LW link
(twitter.com)

Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy

garrison 10 Feb 2024 19:52 UTC
198 points
52 comments 1 min read LW link
(garrisonlovely.substack.com)

Modern Transformers are AGI, and Human-Level

abramdemski 26 Mar 2024 17:46 UTC
197 points
89 comments 5 min read LW link

My Interview With Cade Metz on His Reporting About Slate Star Codex

Zack_M_Davis 26 Mar 2024 17:18 UTC
188 points
186 comments 6 min read LW link

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen 22 Feb 2024 23:56 UTC
184 points
5 comments 4 min read LW link
(bayesshammai.substack.com)

Daniel Kahneman has died

DanielFilan 27 Mar 2024 15:59 UTC
183 points
11 comments 1 min read LW link
(www.washingtonpost.com)

Toward A Mathematical Framework for Computation in Superposition

18 Jan 2024 21:06 UTC
182 points
16 comments 73 min read LW link

The impossible problem of due process

mingyuan 16 Jan 2024 5:18 UTC
180 points
63 comments 14 min read LW link

This might be the last AI Safety Camp

24 Jan 2024 9:33 UTC
180 points
33 comments 1 min read LW link

Funny Anecdote of Eliezer From His Sister

Daniel Birnbaum 22 Apr 2024 22:05 UTC
180 points
4 comments 2 min read LW link

Introducing Alignment Stress-Testing at Anthropic

evhub 12 Jan 2024 23:51 UTC
179 points
23 comments 2 min read LW link

OMMC Announces RIP

1 Apr 2024 23:20 UTC
178 points
5 comments 2 min read LW link

Every “Every Bay Area House Party” Bay Area House Party

Richard_Ngo 16 Feb 2024 18:53 UTC
174 points
6 comments 4 min read LW link

FHI (Future of Humanity Institute) has shut down (2005–2024)

gwern 17 Apr 2024 13:54 UTC
173 points
21 comments 1 min read LW link
(www.futureofhumanityinstitute.org)

Toward a Broader Conception of Adverse Selection

Ricki Heicklen 14 Mar 2024 22:40 UTC
171 points
61 comments 13 min read LW link
(bayesshammai.substack.com)

Timaeus’s First Four Months

28 Feb 2024 17:01 UTC
166 points
6 comments 6 min read LW link

Reconsider the anti-cavity bacteria if you are Asian

Lao Mein 15 Apr 2024 7:02 UTC
165 points
40 comments 4 min read LW link

‘Empiricism!’ as Anti-Epistemology

Eliezer Yudkowsky 14 Mar 2024 2:02 UTC
161 points
84 comments 25 min read LW link

Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI

26 Jan 2024 7:22 UTC
159 points
60 comments 57 min read LW link

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer

18 Apr 2024 0:27 UTC
157 points
17 comments 7 min read LW link

What’s up with LLMs representing XORs of arbitrary features?

Sam Marks 3 Jan 2024 19:44 UTC
154 points
61 comments 16 min read LW link

Many arguments for AI x-risk are wrong

TurnTrout 5 Mar 2024 2:31 UTC
153 points
76 comments 12 min read LW link

Apologizing is a Core Rationalist Skill

johnswentworth 2 Jan 2024 17:47 UTC
152 points
42 comments 5 min read LW link

[Question] Examples of Highly Counterfactual Discoveries?

johnswentworth 23 Apr 2024 22:19 UTC
152 points
80 comments 1 min read LW link

Making every researcher seek grants is a broken model

jasoncrawford 26 Jan 2024 16:06 UTC
150 points
41 comments 4 min read LW link
(rootsofprogress.org)

2023 Survey Results

Screwtape 16 Feb 2024 22:24 UTC
150 points
26 comments 44 min read LW link

Vernor Vinge, who coined the term “Technological Singularity”, dies at 79

Kaj_Sotala 21 Mar 2024 22:14 UTC
148 points
24 comments 1 min read LW link
(arstechnica.com)

On Devin

Zvi 18 Mar 2024 13:20 UTC
147 points
30 comments 11 min read LW link
(thezvi.wordpress.com)