[Linkpost] Existential Risk Analysis in Empirical Research Papers

Dan Hendrycks
2 Jul 2022 0:09 UTC
30 points
0 comments
1 min read
LW link
(arxiv.org)

Where I agree and disagree with Eliezer

paulfchristiano
19 Jun 2022 19:15 UTC
684 points
191 comments
20 min read
LW link

Announcing the Inverse Scaling Prize ($250k Prize Pool)

27 Jun 2022 15:58 UTC
157 points
12 comments
7 min read
LW link

What Is The True Name of Modularity?

1 Jul 2022 14:55 UTC
19 points
3 comments
12 min read
LW link

AGI Ruin: A List of Lethalities

Eliezer Yudkowsky
5 Jun 2022 22:05 UTC
666 points
629 comments
30 min read
LW link

Will Capabilities Generalise More?

Ramana Kumar
29 Jun 2022 17:12 UTC
52 points
10 comments
4 min read
LW link

Announcing Epoch: A research organization investigating the road to Transformative AI

27 Jun 2022 13:55 UTC
92 points
2 comments
2 min read
LW link
(epochai.org)

AI-Written Critiques Help Humans Notice Flaws

paulfchristiano
25 Jun 2022 17:22 UTC
129 points
5 comments
3 min read
LW link
(openai.com)

AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving

DanielFilan
1 Jul 2022 22:20 UTC
11 points
0 comments
37 min read
LW link

Formal Philosophy and Alignment Possible Projects

Whispermute
30 Jun 2022 10:42 UTC
31 points
5 comments
8 min read
LW link

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res
15 Jun 2022 13:10 UTC
200 points
36 comments
10 min read
LW link

Let’s See You Write That Corrigibility Tag

Eliezer Yudkowsky
19 Jun 2022 21:11 UTC
106 points
65 comments
1 min read
LW link

Gradient hacking: definitions and examples

Richard_Ngo
29 Jun 2022 21:35 UTC
19 points
0 comments
5 min read
LW link

Latent Adversarial Training

Adam Jermyn
29 Jun 2022 20:04 UTC
18 points
3 comments
5 min read
LW link

Six Dimensions of Operational Adequacy in AGI Projects

Eliezer Yudkowsky
30 May 2022 17:00 UTC
255 points
65 comments
13 min read
LW link

A transparency and interpretability tech tree

evhub
16 Jun 2022 23:44 UTC
111 points
9 comments
19 min read
LW link

Causal confusion as an argument against the scaling hypothesis

20 Jun 2022 10:54 UTC
80 points
26 comments
18 min read
LW link

Godzilla Strategies

johnswentworth
11 Jun 2022 15:44 UTC
134 points
61 comments
3 min read
LW link

What success looks like

28 Jun 2022 14:38 UTC
19 points
4 comments
1 min read
LW link
(forum.effectivealtruism.org)

Pivotal outcomes and pivotal processes

Andrew_Critch
17 Jun 2022 23:43 UTC
72 points
32 comments
4 min read
LW link