Please help us communicate AI xrisk. It could save the world.

otto.barten · 4 Jul 2022 21:47 UTC
4 points
7 comments · 2 min read · LW link

Benchmark for successful concept extrapolation/avoiding goal misgeneralization

Stuart_Armstrong · 4 Jul 2022 20:48 UTC
82 points
12 comments · 4 min read · LW link

Procedural Executive Function, Part 1

DaystarEld · 4 Jul 2022 18:51 UTC
33 points
2 comments · 13 min read · LW link
(daystareld.com)

Anthropic’s SoLU (Softmax Linear Unit)

Joel Burget · 4 Jul 2022 18:38 UTC
21 points
1 comment · 4 min read · LW link
(transformer-circuits.pub)

Book Review: The Righteous Mind

ErnestScribbler · 4 Jul 2022 17:45 UTC
33 points
8 comments · 35 min read · LW link

My Most Likely Reason to Die Young is AI X-Risk

AISafetyIsNotLongtermist · 4 Jul 2022 17:08 UTC
61 points
24 comments · 4 min read · LW link
(forum.effectivealtruism.org)

Is General Intelligence “Compact”?

DragonGod · 4 Jul 2022 13:27 UTC
27 points
6 comments · 22 min read · LW link

Remaking EfficientZero (as best I can)

Hoagy · 4 Jul 2022 11:03 UTC
36 points
9 comments · 22 min read · LW link

We Need a Consolidated List of Bad AI Alignment Solutions

Double · 4 Jul 2022 6:54 UTC
9 points
14 comments · 1 min read · LW link

AI Forecasting: One Year In

jsteinhardt · 4 Jul 2022 5:10 UTC
132 points
12 comments · 6 min read · LW link
(bounded-regret.ghost.io)

A compressed take on recent disagreements

kman · 4 Jul 2022 4:39 UTC
33 points
9 comments · 1 min read · LW link

New US Senate Bill on X-Risk Mitigation [Linkpost]

Evan R. Murphy · 4 Jul 2022 1:25 UTC
35 points
12 comments · 1 min read · LW link
(www.hsgac.senate.gov)

Monthly Shorts 6/22

Celer · 3 Jul 2022 23:40 UTC
5 points
2 comments · 5 min read · LW link
(keller.substack.com)

Decision theory and dynamic inconsistency

paulfchristiano · 3 Jul 2022 22:20 UTC
79 points
33 comments · 10 min read · LW link
(sideways-view.com)

Five routes of access to scientific literature

DirectedEvolution · 3 Jul 2022 20:53 UTC
13 points
4 comments · 6 min read · LW link

Toni Kurz and the Insanity of Climbing Mountains

GeneSmith · 3 Jul 2022 20:51 UTC
268 points
67 comments · 11 min read · LW link · 2 reviews

Wonder and The Golden AI Rule

JeffreyK · 3 Jul 2022 18:21 UTC
0 points
4 comments · 6 min read · LW link

Evolution Doesn’t Have Feelings

UtilityMonster · 3 Jul 2022 17:13 UTC
−1 points
0 comments · 1 min read · LW link

Nature abhors an immutable replicator… usually

MSRayne · 3 Jul 2022 15:08 UTC
28 points
10 comments · 3 min read · LW link

Post hoc justifications as Compression Algorithm

Johannes C. Mayer · 3 Jul 2022 5:02 UTC
8 points
0 comments · 1 min read · LW link

SOMA—A story about Consciousness

Johannes C. Mayer · 3 Jul 2022 4:46 UTC
10 points
0 comments · 1 min read · LW link
(www.youtube.com)

Sexual self-acceptance

Johannes C. Mayer · 3 Jul 2022 4:26 UTC
11 points
6 comments · 1 min read · LW link

Donohue, Levitt, Roe, and Wade: T-minus 20 years to a massive crime wave?

Paul Logan · 3 Jul 2022 3:03 UTC
−24 points
6 comments · 3 min read · LW link
(laulpogan.substack.com)

Can we achieve AGI Alignment by balancing multiple human objectives?

Ben Smith · 3 Jul 2022 2:51 UTC
11 points
1 comment · 4 min read · LW link

Trigger-Action Planning

CFAR!Duncan · 3 Jul 2022 1:42 UTC
81 points
14 comments · 13 min read · LW link · 2 reviews

[Question] Which one of these two academic routes should I take to end up in AI Safety?

Martín Soto · 3 Jul 2022 1:05 UTC
5 points
2 comments · 1 min read · LW link

Naive Hypotheses on AI Alignment

Shoshannah Tekofsky · 2 Jul 2022 19:03 UTC
98 points
29 comments · 5 min read · LW link

The Tree of Life: Stanford AI Alignment Theory of Change

Gabriel Mukobi · 2 Jul 2022 18:36 UTC
24 points
0 comments · 14 min read · LW link

Follow along with Columbia EA’s Advanced AI Safety Fellowship!

RohanS · 2 Jul 2022 17:45 UTC
3 points
0 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Welcome to Analogia! (Chapter 7)

Justin Bullock · 2 Jul 2022 17:04 UTC
5 points
0 comments · 11 min read · LW link

[Question] What about transhumans and beyond?

AlignmentMirror · 2 Jul 2022 13:58 UTC
7 points
6 comments · 1 min read · LW link

Goal-directedness: tackling complexity

Morgan_Rogers · 2 Jul 2022 13:51 UTC
8 points
0 comments · 38 min read · LW link

Literature recommendations July 2022

ChristianKl · 2 Jul 2022 9:14 UTC
17 points
9 comments · 1 min read · LW link

Deontological Evil

lsusr · 2 Jul 2022 6:57 UTC
38 points
4 comments · 2 min read · LW link

Could an AI Alignment Sandbox be useful?

Michael Soareverix · 2 Jul 2022 5:06 UTC
2 points
1 comment · 1 min read · LW link

Five views of Bayes’ Theorem

Adam Scherlis · 2 Jul 2022 2:25 UTC
38 points
4 comments · 1 min read · LW link

[Linkpost] Existential Risk Analysis in Empirical Research Papers

Dan H · 2 Jul 2022 0:09 UTC
40 points
0 comments · 1 min read · LW link
(arxiv.org)

Agenty AGI – How Tempting?

PeterMcCluskey · 1 Jul 2022 23:40 UTC
22 points
3 comments · 5 min read · LW link
(www.bayesianinvestor.com)

AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving

DanielFilan · 1 Jul 2022 22:20 UTC
20 points
0 comments · 37 min read · LW link

[Question] Examples of practical implications of Judea Pearl’s Causality work

ChristianKl · 1 Jul 2022 20:58 UTC
23 points
6 comments · 1 min read · LW link

Minerva

Algon · 1 Jul 2022 20:06 UTC
36 points
6 comments · 2 min read · LW link
(ai.googleblog.com)

Disarming status

sano · 1 Jul 2022 20:00 UTC
−4 points
1 comment · 6 min read · LW link

Paper: Forecasting world events with neural nets

1 Jul 2022 19:40 UTC
39 points
3 comments · 4 min read · LW link

Reframing the AI Risk

Thane Ruthenis · 1 Jul 2022 18:44 UTC
26 points
7 comments · 6 min read · LW link

Who is this MSRayne person anyway?

MSRayne · 1 Jul 2022 17:32 UTC
32 points
30 comments · 11 min read · LW link

Limerence Messes Up Your Rationality Real Bad, Yo

Raemon · 1 Jul 2022 16:53 UTC
121 points
41 comments · 3 min read · LW link · 2 reviews

[Link] On the paradox of tolerance in relation to fascism and online content moderation – Unstable Ontology

Kenny · 1 Jul 2022 16:43 UTC
5 points
0 comments · 1 min read · LW link

Trends in GPU price-performance

1 Jul 2022 15:51 UTC
85 points
12 comments · 1 min read · LW link · 1 review
(epochai.org)

[Question] How to deal with non-schedulable one-off stimulus-response-pair-like situations when planning/organising projects?

mikbp · 1 Jul 2022 15:22 UTC
2 points
3 comments · 1 min read · LW link

What Is The True Name of Modularity?

1 Jul 2022 14:55 UTC
38 points
10 comments · 12 min read · LW link