A transparency and interpretability tech tree

evhub · 16 Jun 2022 23:44 UTC
163 points
11 comments · 18 min read · LW link · 1 review

BBC Future covers progress studies

jasoncrawford · 16 Jun 2022 22:44 UTC
21 points
6 comments · 3 min read · LW link
(rootsofprogress.org)

Humans are very reliable agents

alyssavance · 16 Jun 2022 22:02 UTC
266 points
35 comments · 3 min read · LW link

Towards Gears-Level Understanding of Agency

Thane Ruthenis · 16 Jun 2022 22:00 UTC
23 points
4 comments · 18 min read · LW link

A possible AI-inoculation due to early “robot uprising”

shminux · 16 Jun 2022 21:21 UTC
16 points
2 comments · 1 min read · LW link

AI Risk, as Seen on Snapchat

dkirmani · 16 Jun 2022 19:31 UTC
23 points
8 comments · 1 min read · LW link

[Link] “The madness of reduced medical diagnostics” by Dynomight

Kenny · 16 Jun 2022 19:20 UTC
16 points
25 comments · 1 min read · LW link

Breaking Down Goal-Directed Behaviour

Oliver Sourbut · 16 Jun 2022 18:45 UTC
11 points
1 comment · 2 min read · LW link

Perils of optimizing in social contexts

owencb · 16 Jun 2022 17:40 UTC
50 points
1 comment · 2 min read · LW link

Don’t Over-Optimize Things

owencb · 16 Jun 2022 16:33 UTC
27 points
6 comments · 4 min read · LW link

[Question] Security analysis of ‘cloud chemistry labs’?

Kenny · 16 Jun 2022 16:06 UTC
6 points
2 comments · 1 min read · LW link

Covid 6/16/22: Do Not Hand it to Them

Zvi · 16 Jun 2022 14:40 UTC
29 points
5 comments · 7 min read · LW link
(thezvi.wordpress.com)

[Question] Is there a worked example of Georgian taxes?

Dagon · 16 Jun 2022 14:07 UTC
8 points
12 comments · 1 min read · LW link

Against Active Shooter Drills

Zvi · 16 Jun 2022 13:40 UTC
91 points
30 comments · 7 min read · LW link
(thezvi.wordpress.com)

Ten experiments in modularity, which we’d like you to run!

16 Jun 2022 9:17 UTC
62 points
3 comments · 9 min read · LW link

[Question] What if LaMDA is indeed sentient / self-aware / worth having rights?

RomanS · 16 Jun 2022 9:10 UTC
22 points
13 comments · 1 min read · LW link

Lifeguards

Akash · 15 Jun 2022 23:03 UTC
12 points
3 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Rationality Vienna Hike

Laszlo_Treszkai · 15 Jun 2022 22:11 UTC
3 points
0 comments · 1 min read · LW link

Contra Hofstadter on GPT-3 Nonsense

rictic · 15 Jun 2022 21:53 UTC
236 points
24 comments · 2 min read · LW link

Progress links and tweets, 2022-06-13

jasoncrawford · 15 Jun 2022 19:47 UTC
12 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

I applied for a MIRI job in 2020. Here’s what happened next.

ViktoriaMalyasova · 15 Jun 2022 19:37 UTC
82 points
17 comments · 7 min read · LW link

Contextual Evil

ACrackedPot · 15 Jun 2022 19:32 UTC
1 point
12 comments · 2 min read · LW link

Multigate Priors

Adam Jermyn · 15 Jun 2022 19:30 UTC
4 points
0 comments · 3 min read · LW link

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee · 15 Jun 2022 18:08 UTC
42 points
15 comments · 2 min read · LW link

[Question] What are all the AI Alignment and AI Safety Communication Hubs?

Gunnar_Zarncke · 15 Jun 2022 16:16 UTC
27 points
5 comments · 1 min read · LW link

Georgism, in theory

Stuart_Armstrong · 15 Jun 2022 15:20 UTC
40 points
22 comments · 4 min read · LW link

Berlin AI Safety Open Meetup June 2022

pranomostro · 15 Jun 2022 14:33 UTC
12 points
0 comments · 1 min read · LW link

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res · 15 Jun 2022 13:10 UTC
279 points
53 comments · 10 min read · LW link · 1 review

Our mental building blocks are more different than I thought

Marius Hobbhahn · 15 Jun 2022 11:07 UTC
44 points
11 comments · 14 min read · LW link

[Question] Has there been any work on attempting to use Pascal’s Mugging to make an AGI behave?

Chris_Leong · 15 Jun 2022 8:33 UTC
7 points
17 comments · 1 min read · LW link

Alignment Risk Doesn’t Require Superintelligence

JustisMills · 15 Jun 2022 3:12 UTC
35 points
4 comments · 2 min read · LW link

A Butterfly’s View of Probability

Gabriel Wu · 15 Jun 2022 2:14 UTC
29 points
17 comments · 11 min read · LW link

[Question] Favourite new AI productivity tools?

Gabe M · 15 Jun 2022 1:08 UTC
14 points
5 comments · 1 min read · LW link

Will vague “AI sentience” concerns do more for AI safety than anything else we might do?

Aryeh Englander · 14 Jun 2022 23:53 UTC
15 points
2 comments · 1 min read · LW link

Yes, AI research will be substantially curtailed if a lab causes a major disaster

lc · 14 Jun 2022 22:17 UTC
103 points
31 comments · 2 min read · LW link

Slow motion videos as AI risk intuition pumps

Andrew_Critch · 14 Jun 2022 19:31 UTC
237 points
41 comments · 2 min read · LW link · 1 review

Cryptographic Life: How to transcend in a sub-lightspeed world via Homomorphic encryption

Golol · 14 Jun 2022 19:22 UTC
1 point
0 comments · 3 min read · LW link

Blake Richards on Why he is Skeptical of Existential Risk from AI

Michaël Trazzi · 14 Jun 2022 19:09 UTC
41 points
12 comments · 4 min read · LW link
(theinsideview.ai)

[Question] How Do You Quantify [Physics Interfacing] Real World Capabilities?

DragonGod · 14 Jun 2022 14:49 UTC
17 points
1 comment · 4 min read · LW link

Was the Industrial Revolution The Industrial Revolution?

Davis Kedrosky · 14 Jun 2022 14:48 UTC
29 points
0 comments · 12 min read · LW link
(daviskedrosky.substack.com)

Investigating causal understanding in LLMs

14 Jun 2022 13:57 UTC
28 points
6 comments · 13 min read · LW link

Why multi-agent safety is important

Akbir Khan · 14 Jun 2022 9:23 UTC
10 points
2 comments · 10 min read · LW link

[Question] Was Eliezer Yudkowsky right to give himself 10% to succeed with HPMoR in 2010?

momom2 · 14 Jun 2022 7:00 UTC
2 points
2 comments · 1 min read · LW link

Resources I send to AI researchers about AI safety

Vael Gates · 14 Jun 2022 2:24 UTC
69 points
12 comments · 1 min read · LW link

Vael Gates: Risks from Advanced AI (June 2022)

Vael Gates · 14 Jun 2022 0:54 UTC
38 points
2 comments · 30 min read · LW link

Cambridge LW Meetup: Personal Finance

Tony Wang · 14 Jun 2022 0:12 UTC
3 points
0 comments · 1 min read · LW link

OpenAI: GPT-based LLMs show an ability to discriminate between their own wrong answers, but an inability to explain how/why they make that discrimination, even as models scale

Aditya Jain · 13 Jun 2022 23:33 UTC
14 points
5 comments · 1 min read · LW link
(openai.com)

[Question] Who said something like “The fact that putting 2 apples next to 2 other apples leads to there being 4 apples there has nothing to do with the fact that 2 + 2 = 4”?

hunterglenn · 13 Jun 2022 22:23 UTC
1 point
2 comments · 1 min read · LW link

Continuity Assumptions

Jan_Kulveit · 13 Jun 2022 21:31 UTC
35 points
13 comments · 4 min read · LW link

Crypto-fed Computation

aaguirre · 13 Jun 2022 21:20 UTC
23 points
7 comments · 7 min read · LW link