AGI Ruin: A List of Lethalities
Eliezer Yudkowsky · 5 Jun 2022 22:05 UTC
901 points · 690 comments · 30 min read · LW link · 3 reviews

Where I agree and disagree with Eliezer
paulfchristiano · 19 Jun 2022 19:15 UTC
875 points · 219 comments · 18 min read · LW link · 2 reviews

It’s Probably Not Lithium
Natália · 28 Jun 2022 21:24 UTC
442 points · 186 comments · 28 min read · LW link · 1 review

Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
elspood · 21 Jun 2022 23:55 UTC
361 points · 42 comments · 7 min read · LW link · 1 review

A central AI alignment problem: capabilities generalization, and the sharp left turn
So8res · 15 Jun 2022 13:10 UTC
279 points · 53 comments · 10 min read · LW link · 1 review

What Are You Tracking In Your Head?
johnswentworth · 28 Jun 2022 19:30 UTC
276 points · 81 comments · 4 min read · LW link · 1 review

Humans are very reliable agents
alyssavance · 16 Jun 2022 22:02 UTC
266 points · 35 comments · 3 min read · LW link

Comment reply: my low-quality thoughts on why CFAR didn’t get farther with a “real/efficacious art of rationality”
AnnaSalamon · 9 Jun 2022 2:12 UTC
253 points · 62 comments · 17 min read · LW link · 1 review

Slow motion videos as AI risk intuition pumps
Andrew_Critch · 14 Jun 2022 19:31 UTC
237 points · 41 comments · 2 min read · LW link · 1 review

Contra Hofstadter on GPT-3 Nonsense
rictic · 15 Jun 2022 21:53 UTC
236 points · 24 comments · 2 min read · LW link

AGI Safety FAQ / all-dumb-questions-allowed thread
Aryeh Englander · 7 Jun 2022 5:47 UTC
226 points · 526 comments · 4 min read · LW link

The inordinately slow spread of good AGI conversations in ML
Rob Bensinger · 21 Jun 2022 16:09 UTC
173 points · 62 comments · 8 min read · LW link

AI Could Defeat All Of Us Combined
HoldenKarnofsky · 9 Jun 2022 15:50 UTC
170 points · 42 comments · 17 min read · LW link
(www.cold-takes.com)

Announcing the Inverse Scaling Prize ($250k Prize Pool)
27 Jun 2022 15:58 UTC
169 points · 14 comments · 7 min read · LW link

The prototypical catastrophic AI action is getting root access to its datacenter
Buck · 2 Jun 2022 23:46 UTC
164 points · 13 comments · 2 min read · LW link · 1 review

A transparency and interpretability tech tree
evhub · 16 Jun 2022 23:44 UTC
163 points · 11 comments · 18 min read · LW link · 1 review

On A List of Lethalities
Zvi · 13 Jun 2022 12:30 UTC
161 points · 49 comments · 54 min read · LW link · 1 review
(thezvi.wordpress.com)

Why all the fuss about recursive self-improvement?
So8res · 12 Jun 2022 20:53 UTC
158 points · 62 comments · 7 min read · LW link · 1 review

Nonprofit Boards are Weird
HoldenKarnofsky · 23 Jun 2022 14:40 UTC
154 points · 26 comments · 20 min read · LW link · 1 review
(www.cold-takes.com)

LessWrong Has Agree/Disagree Voting On All New Comment Threads
Ben Pace · 24 Jun 2022 0:43 UTC
151 points · 217 comments · 2 min read · LW link · 1 review

Staying Split: Sabatini and Social Justice
[DEACTIVATED] Duncan Sabien · 8 Jun 2022 8:32 UTC
151 points · 28 comments · 21 min read · LW link

Godzilla Strategies
johnswentworth · 11 Jun 2022 15:44 UTC
146 points · 71 comments · 3 min read · LW link

Public beliefs vs. Private beliefs
Eli Tyre · 1 Jun 2022 21:33 UTC
143 points · 30 comments · 5 min read · LW link

[Question] why assume AGIs will optimize for fixed goals?
nostalgebraist · 10 Jun 2022 1:28 UTC
143 points · 55 comments · 4 min read · LW link · 2 reviews

Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc
johnswentworth · 4 Jun 2022 5:41 UTC
142 points · 53 comments · 2 min read · LW link · 1 review

A descriptive, not prescriptive, overview of current AI Alignment Research
6 Jun 2022 21:59 UTC
138 points · 21 comments · 7 min read · LW link

Announcing the LessWrong Curated Podcast
22 Jun 2022 22:16 UTC
137 points · 27 comments · 1 min read · LW link

Limits to Legibility
Jan_Kulveit · 29 Jun 2022 17:42 UTC
137 points · 11 comments · 5 min read · LW link · 1 review

AI-Written Critiques Help Humans Notice Flaws
paulfchristiano · 25 Jun 2022 17:22 UTC
137 points · 5 comments · 3 min read · LW link
(openai.com)

Contra EY: Can AGI destroy us without trial & error?
Nikita Sokolsky · 13 Jun 2022 18:26 UTC
136 points · 72 comments · 15 min read · LW link

Steam
abramdemski · 20 Jun 2022 17:38 UTC
134 points · 13 comments · 5 min read · LW link · 1 review

Confused why a “capabilities research is good for alignment progress” position isn’t discussed more
Kaj_Sotala · 2 Jun 2022 21:41 UTC
129 points · 27 comments · 4 min read · LW link

Intergenerational trauma impeding cooperative existential safety efforts
Andrew_Critch · 3 Jun 2022 8:13 UTC
128 points · 29 comments · 3 min read · LW link

“Pivotal Acts” means something specific
Raemon · 7 Jun 2022 21:56 UTC
127 points · 23 comments · 2 min read · LW link

Let’s See You Write That Corrigibility Tag
Eliezer Yudkowsky · 19 Jun 2022 21:11 UTC
125 points · 69 comments · 1 min read · LW link

Will Capabilities Generalise More?
Ramana Kumar · 29 Jun 2022 17:12 UTC
122 points · 39 comments · 4 min read · LW link

Conversation with Eliezer: What do you want the system to do?
Akash · 25 Jun 2022 17:36 UTC
120 points · 38 comments · 2 min read · LW link

Scott Aaronson is joining OpenAI to work on AI safety
peterbarnett · 18 Jun 2022 4:06 UTC
117 points · 31 comments · 1 min read · LW link
(scottaaronson.blog)

Leaving Google, Joining the Nucleic Acid Observatory
jefftk · 10 Jun 2022 17:00 UTC
114 points · 4 comments · 3 min read · LW link
(www.jefftk.com)

Who models the models that model models? An exploration of GPT-3’s in-context model fitting ability
Lovre · 7 Jun 2022 19:37 UTC
112 points · 16 comments · 9 min read · LW link

CFAR Handbook: Introduction
CFAR!Duncan · 28 Jun 2022 16:53 UTC
109 points · 12 comments · 1 min read · LW link

wrapper-minds are the enemy
nostalgebraist · 17 Jun 2022 1:58 UTC
104 points · 41 comments · 8 min read · LW link

Yes, AI research will be substantially curtailed if a lab causes a major disaster
lc · 14 Jun 2022 22:17 UTC
103 points · 31 comments · 2 min read · LW link

Relationship Advice Repository
Ruby · 20 Jun 2022 14:39 UTC
102 points · 36 comments · 39 min read · LW link

Announcing Epoch: A research organization investigating the road to Transformative AI
27 Jun 2022 13:55 UTC
97 points · 2 comments · 2 min read · LW link
(epochai.org)

Contest: An Alien Message
DaemonicSigil · 27 Jun 2022 5:54 UTC
95 points · 100 comments · 1 min read · LW link

The Mountain Troll
lsusr · 11 Jun 2022 9:14 UTC
95 points · 25 comments · 2 min read · LW link

Pivotal outcomes and pivotal processes
Andrew_Critch · 17 Jun 2022 23:43 UTC
95 points · 31 comments · 4 min read · LW link

Units of Exchange
CFAR!Duncan · 28 Jun 2022 16:53 UTC
95 points · 28 comments · 11 min read · LW link

Announcing the Alignment of Complex Systems Research Group
4 Jun 2022 4:10 UTC
91 points · 20 comments · 5 min read · LW link