
AI Governance

Tag. Last edit: 9 Aug 2020 18:31 UTC by Gyrodiot

AI Governance asks how we can ensure that society at large benefits from increasingly powerful AI systems. While solving technical AI alignment is a necessary step towards this goal, it is by no means sufficient.

AI governance draws on policy, economics, sociology, law, and many other fields.

What an ac­tu­ally pes­simistic con­tain­ment strat­egy looks like

lc5 Apr 2022 0:19 UTC
668 points
138 comments6 min readLW link2 reviews

News : Bi­den-⁠Har­ris Ad­minis­tra­tion Se­cures Vol­un­tary Com­mit­ments from Lead­ing Ar­tifi­cial In­tel­li­gence Com­pa­nies to Man­age the Risks Posed by AI

Jonathan Claybrough21 Jul 2023 18:00 UTC
65 points
9 comments2 min readLW link
(www.whitehouse.gov)

Ways I Ex­pect AI Reg­u­la­tion To In­crease Ex­tinc­tion Risk

1a3orn4 Jul 2023 17:32 UTC
215 points
32 comments7 min readLW link

What would a com­pute mon­i­tor­ing plan look like? [Linkpost]

Akash26 Mar 2023 19:33 UTC
157 points
9 comments4 min readLW link
(arxiv.org)

Con­sider Join­ing the UK Foun­da­tion Model Taskforce

Zvi10 Jul 2023 13:50 UTC
105 points
12 comments1 min readLW link
(thezvi.wordpress.com)

Some cruxes on im­pact­ful al­ter­na­tives to AI policy work

Richard_Ngo10 Oct 2018 13:35 UTC
165 points
13 comments12 min readLW link

AI pause/​gov­er­nance ad­vo­cacy might be net-nega­tive, es­pe­cially with­out fo­cus on ex­plain­ing the x-risk

Mikhail Samin27 Aug 2023 23:05 UTC
81 points
9 comments6 min readLW link

Re­ac­tions to the Ex­ec­u­tive Order

Zvi1 Nov 2023 20:40 UTC
77 points
4 comments29 min readLW link
(thezvi.wordpress.com)

Pres­i­dent Bi­den Is­sues Ex­ec­u­tive Order on Safe, Se­cure, and Trust­wor­thy Ar­tifi­cial Intelligence

Tristan Williams30 Oct 2023 11:15 UTC
170 points
39 comments1 min readLW link
(www.whitehouse.gov)

AI policy ideas: Read­ing list

Zach Stein-Perlman17 Apr 2023 19:00 UTC
22 points
7 comments4 min readLW link

Soft take­off can still lead to de­ci­sive strate­gic advantage

Daniel Kokotajlo23 Aug 2019 16:39 UTC
122 points
47 comments8 min readLW link4 reviews

List of re­quests for an AI slow­down/​halt.

Cleo Nardo14 Apr 2023 23:55 UTC
46 points
6 comments1 min readLW link

RTFB: On the New Pro­posed CAIP AI Bill

Zvi10 Apr 2024 18:30 UTC
119 points
14 comments34 min readLW link
(thezvi.wordpress.com)

Where are the red lines for AI?

Karl von Wendt5 Aug 2022 9:34 UTC
25 points
10 comments6 min readLW link

Ac­tion­able-guidance and roadmap recom­men­da­tions for the NIST AI Risk Man­age­ment Framework

17 May 2022 15:26 UTC
26 points
0 comments3 min readLW link

The Reg­u­la­tory Op­tion: A re­sponse to near 0% sur­vival odds

Matthew Lowenstein11 Apr 2022 22:00 UTC
46 points
21 comments6 min readLW link

[Question] Would it be good or bad for the US mil­i­tary to get in­volved in AI risk?

Grant Demaree1 Jan 2023 19:02 UTC
50 points
12 comments1 min readLW link

An up­com­ing US Supreme Court case may im­pede AI gov­er­nance efforts

NickGabs16 Jul 2023 23:51 UTC
57 points
17 comments2 min readLW link

Com­pute Thresh­olds: pro­posed rules to miti­gate risk of a “lab leak” ac­ci­dent dur­ing AI train­ing runs

davidad22 Jul 2023 18:09 UTC
80 points
2 comments2 min readLW link

An­nounc­ing Apollo Research

30 May 2023 16:17 UTC
215 points
11 comments8 min readLW link

Ngo’s view on al­ign­ment difficulty

14 Dec 2021 21:34 UTC
63 points
7 comments17 min readLW link

He­len Toner on China, CSET, and AI

Rob Bensinger21 Apr 2019 4:10 UTC
68 points
4 comments7 min readLW link
(rationallyspeakingpodcast.org)

Should we post­pone AGI un­til we reach safety?

otto.barten18 Nov 2020 15:43 UTC
27 points
36 comments3 min readLW link

[Question] Where are peo­ple think­ing and talk­ing about global co­or­di­na­tion for AI safety?

Wei Dai22 May 2019 6:24 UTC
112 points
22 comments1 min readLW link

2019 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks19 Dec 2019 3:00 UTC
130 points
18 comments62 min readLW link

What I Would Do If I Were Work­ing On AI Governance

johnswentworth8 Dec 2023 6:43 UTC
109 points
32 comments10 min readLW link

China-AI forecasts

NathanBarnard25 Feb 2024 16:49 UTC
38 points
27 comments6 min readLW link

New vol­un­tary com­mit­ments (AI Seoul Sum­mit)

Zach Stein-Perlman21 May 2024 11:00 UTC
75 points
16 comments7 min readLW link
(www.gov.uk)

Miti­gat­ing ex­treme AI risks amid rapid progress [Linkpost]

Akash21 May 2024 19:59 UTC
20 points
7 comments4 min readLW link

The Su­gar Align­ment Problem

Adam Zerner24 Dec 2023 1:35 UTC
5 points
3 comments7 min readLW link

The Defence pro­duc­tion act and AI policy

NathanBarnard1 Mar 2024 14:26 UTC
37 points
0 comments2 min readLW link

AGI will be made of het­ero­ge­neous com­po­nents, Trans­former and Selec­tive SSM blocks will be among them

Roman Leventov27 Dec 2023 14:51 UTC
33 points
9 comments4 min readLW link

OpenAI’s Pre­pared­ness Frame­work: Praise & Recommendations

Akash2 Jan 2024 16:20 UTC
66 points
1 comment7 min readLW link

The Schumer Re­port on AI (RTFB)

Zvi24 May 2024 15:10 UTC
30 points
3 comments36 min readLW link
(thezvi.wordpress.com)

(4 min read) An in­tu­itive ex­pla­na­tion of the AI in­fluence situation

trevor13 Jan 2024 17:34 UTC
12 points
26 comments4 min readLW link

Talk­ing to Congress: Can con­stituents con­tact­ing their leg­is­la­tor in­fluence policy?

Tristan Williams7 Mar 2024 9:24 UTC
14 points
0 comments1 min readLW link

[Question] What does it look like for AI to sig­nifi­cantly im­prove hu­man co­or­di­na­tion, be­fore su­per­in­tel­li­gence?

jacobjacob15 Jan 2024 19:22 UTC
22 points
2 comments1 min readLW link

Paus­ing AI is Pos­i­tive Ex­pected Value

Liron10 Mar 2024 17:10 UTC
7 points
2 comments3 min readLW link
(twitter.com)

My guess at Con­jec­ture’s vi­sion: trig­ger­ing a nar­ra­tive bifurcation

Alexandre Variengien6 Feb 2024 19:10 UTC
74 points
12 comments16 min readLW link

Many ar­gu­ments for AI x-risk are wrong

TurnTrout5 Mar 2024 2:31 UTC
155 points
76 comments12 min readLW link

Trans­for­ma­tive trust­build­ing via ad­vance­ments in de­cen­tral­ized lie detection

trevor16 Mar 2024 5:56 UTC
17 points
7 comments38 min readLW link
(www.ncbi.nlm.nih.gov)

Paul Chris­ti­ano named as US AI Safety In­sti­tute Head of AI Safety

Joel Burget16 Apr 2024 16:22 UTC
255 points
59 comments1 min readLW link
(www.commerce.gov)

AXRP Epi­sode 28 - Su­ing Labs for AI Risk with Gabriel Weil

DanielFilan17 Apr 2024 21:42 UTC
10 points
0 comments65 min readLW link

Q&A on Pro­posed SB 1047

Zvi2 May 2024 15:10 UTC
74 points
6 comments44 min readLW link
(thezvi.wordpress.com)

[Question] Have any par­ties in the cur­rent Euro­pean Par­li­a­men­tary Elec­tion made pub­lic state­ments on AI?

MondSemmel10 May 2024 10:22 UTC
9 points
0 comments1 min readLW link

Ad­vice for Ac­tivists from the His­tory of Environmentalism

Jeffrey Heninger16 May 2024 18:40 UTC
96 points
8 comments6 min readLW link
(blog.aiimpacts.org)

Dario Amodei leaves OpenAI

Daniel Kokotajlo29 Dec 2020 19:31 UTC
69 points
12 comments1 min readLW link

The Na­tional Defense Autho­riza­tion Act Con­tains AI Provisions

ryan_b5 Jan 2021 15:51 UTC
30 points
24 comments1 min readLW link

Govern­ing High-Im­pact AI Sys­tems: Un­der­stand­ing Canada’s Pro­posed AI Bill. April 15, Car­leton Univer­sity, Ottawa

Liav Koren28 Mar 2023 17:48 UTC
11 points
1 comment1 min readLW link
(forum.effectivealtruism.org)

How is AI gov­erned and reg­u­lated, around the world?

Mitchell_Porter30 Mar 2023 15:36 UTC
15 points
6 comments2 min readLW link

ChatGPT banned in Italy over pri­vacy concerns

Ollie J31 Mar 2023 17:33 UTC
18 points
4 comments1 min readLW link
(www.bbc.co.uk)

[Question] What Are Your Prefer­ences Re­gard­ing The FLI Let­ter?

JenniferRM1 Apr 2023 4:52 UTC
−4 points
122 comments16 min readLW link

Policy dis­cus­sions fol­low strong con­tex­tu­al­iz­ing norms

Richard_Ngo1 Apr 2023 23:51 UTC
230 points
61 comments3 min readLW link

AI Sum­mer Harvest

Cleo Nardo4 Apr 2023 3:35 UTC
130 points
10 comments1 min readLW link

Ex­ces­sive AI growth-rate yields lit­tle so­cio-eco­nomic benefit.

Cleo Nardo4 Apr 2023 19:13 UTC
27 points
22 comments4 min readLW link

I asked my sen­a­tor to slow AI

Omid6 Apr 2023 18:18 UTC
21 points
5 comments2 min readLW link

An ‘AGI Emer­gency Eject Cri­te­ria’ con­sen­sus could be re­ally use­ful.

tcelferact7 Apr 2023 16:21 UTC
5 points
0 comments1 min readLW link

All images from the WaitButWhy se­quence on AI

trevor8 Apr 2023 7:36 UTC
72 points
5 comments2 min readLW link

Cur­rent UK gov­ern­ment lev­ers on AI development

rosehadshar10 Apr 2023 13:16 UTC
16 points
0 comments1 min readLW link

Re­quest to AGI or­ga­ni­za­tions: Share your views on paus­ing AI progress

11 Apr 2023 17:30 UTC
141 points
11 comments1 min readLW link

FLI And Eliezer Should Reach Consensus

JenniferRM11 Apr 2023 4:07 UTC
15 points
6 comments23 min readLW link

Cy­berspace Ad­minis­tra­tion of China: Draft of “Reg­u­la­tion for Gen­er­a­tive Ar­tifi­cial In­tel­li­gence Ser­vices” is open for comments

sanxiyn11 Apr 2023 9:32 UTC
7 points
2 comments1 min readLW link
(archive.is)

NTIA—AI Ac­countabil­ity Announcement

samshap11 Apr 2023 15:03 UTC
7 points
0 comments1 min readLW link
(www.ntia.doc.gov)

Na­tional Telecom­mu­ni­ca­tions and In­for­ma­tion Ad­minis­tra­tion: AI Ac­countabil­ity Policy Re­quest for Comment

sanxiyn11 Apr 2023 22:59 UTC
9 points
0 comments1 min readLW link
(ntia.gov)

Nav­i­gat­ing the Open-Source AI Land­scape: Data, Fund­ing, and Safety

13 Apr 2023 15:29 UTC
32 points
7 comments11 min readLW link
(forum.effectivealtruism.org)

FLI re­port: Poli­cy­mak­ing in the Pause

Zach Stein-Perlman15 Apr 2023 17:01 UTC
9 points
3 comments1 min readLW link
(futureoflife.org)

Slow­ing AI: Foundations

Zach Stein-Perlman17 Apr 2023 14:30 UTC
45 points
11 comments17 min readLW link

Re­spon­si­ble De­ploy­ment in 20XX

Carson20 Apr 2023 0:24 UTC
4 points
0 comments4 min readLW link

OpenAI could help X-risk by wa­ger­ing itself

VojtaKovarik20 Apr 2023 14:51 UTC
31 points
16 comments1 min readLW link

My Assess­ment of the Chi­nese AI Safety Community

Lao Mein25 Apr 2023 4:21 UTC
245 points
94 comments3 min readLW link

Notes on Po­ten­tial Fu­ture AI Tax Policy

Zvi25 Apr 2023 13:30 UTC
33 points
5 comments9 min readLW link
(thezvi.wordpress.com)

Refram­ing the bur­den of proof: Com­pa­nies should prove that mod­els are safe (rather than ex­pect­ing au­di­tors to prove that mod­els are dan­ger­ous)

Akash25 Apr 2023 18:49 UTC
27 points
11 comments3 min readLW link
(childrenoficarus.substack.com)

AI Safety is Drop­ping the Ball on Clown Attacks

trevor22 Oct 2023 20:09 UTC
71 points
72 comments34 min readLW link

An­thropic, Google, Microsoft & OpenAI an­nounce Ex­ec­u­tive Direc­tor of the Fron­tier Model Fo­rum & over $10 mil­lion for a new AI Safety Fund

Zach Stein-Perlman25 Oct 2023 15:20 UTC
31 points
8 comments4 min readLW link
(www.frontiermodelforum.org)

Thoughts on re­spon­si­ble scal­ing poli­cies and regulation

paulfchristiano24 Oct 2023 22:21 UTC
214 points
33 comments6 min readLW link

AI #35: Re­spon­si­ble Scal­ing Policies

Zvi26 Oct 2023 13:30 UTC
66 points
10 comments55 min readLW link
(thezvi.wordpress.com)

We’re Not Ready: thoughts on “paus­ing” and re­spon­si­ble scal­ing policies

HoldenKarnofsky27 Oct 2023 15:19 UTC
199 points
33 comments8 min readLW link

5 Rea­sons Why Govern­ments/​Mili­taries Already Want AI for In­for­ma­tion Warfare

trevor30 Oct 2023 16:30 UTC
32 points
0 comments10 min readLW link

[Linkpost] Bi­den-Har­ris Ex­ec­u­tive Order on AI

beren30 Oct 2023 15:20 UTC
3 points
0 comments1 min readLW link

Urg­ing an In­ter­na­tional AI Treaty: An Open Letter

Olli Järviniemi31 Oct 2023 11:26 UTC
48 points
2 comments1 min readLW link
(aitreaty.org)

On the Ex­ec­u­tive Order

Zvi1 Nov 2023 14:20 UTC
100 points
4 comments30 min readLW link
(thezvi.wordpress.com)

[Question] Snap­shot of nar­ra­tives and frames against reg­u­lat­ing AI

Jan_Kulveit1 Nov 2023 16:30 UTC
36 points
19 comments3 min readLW link

Dario Amodei’s pre­pared re­marks from the UK AI Safety Sum­mit, on An­thropic’s Re­spon­si­ble Scal­ing Policy

Zac Hatfield-Dodds1 Nov 2023 18:10 UTC
85 points
1 comment4 min readLW link
(www.anthropic.com)

We are already in a per­sua­sion-trans­formed world and must take precautions

trevor4 Nov 2023 15:53 UTC
36 points
14 comments6 min readLW link

The 6D effect: When com­pa­nies take risks, one email can be very pow­er­ful.

scasper4 Nov 2023 20:08 UTC
261 points
40 comments3 min readLW link

On the UK Summit

Zvi7 Nov 2023 13:10 UTC
74 points
6 comments30 min readLW link
(thezvi.wordpress.com)

In­ter­na­tional treaty for global com­pute caps

9 Nov 2023 18:17 UTC
22 points
2 comments8 min readLW link

Sur­vey on the ac­cel­er­a­tion risks of our new RFPs to study LLM capabilities

Ajeya Cotra10 Nov 2023 23:59 UTC
27 points
1 comment1 min readLW link

Speak­ing to Con­gres­sional staffers about AI risk

4 Dec 2023 23:08 UTC
289 points
23 comments16 min readLW link

AXRP Epi­sode 26 - AI Gover­nance with Eliz­a­beth Seger

DanielFilan26 Nov 2023 23:00 UTC
13 points
0 comments66 min readLW link

Safety stan­dards: a frame­work for AI regulation

joshc1 May 2023 0:56 UTC
19 points
0 comments8 min readLW link

Stop­ping dan­ger­ous AI: Ideal lab behavior

Zach Stein-Perlman9 May 2023 21:00 UTC
8 points
0 comments2 min readLW link

Stop­ping dan­ger­ous AI: Ideal US behavior

Zach Stein-Perlman9 May 2023 21:00 UTC
17 points
0 comments3 min readLW link

GovAI: Towards best prac­tices in AGI safety and gov­er­nance: A sur­vey of ex­pert opinion

Zach Stein-Perlman15 May 2023 1:42 UTC
28 points
11 comments1 min readLW link
(arxiv.org)

Eisen­hower’s Atoms for Peace Speech

Akash17 May 2023 16:10 UTC
18 points
3 comments11 min readLW link
(www.iaea.org)

[Linkpost] “Gover­nance of su­per­in­tel­li­gence” by OpenAI

Daniel_Eth22 May 2023 20:15 UTC
67 points
20 comments1 min readLW link

AI #12:The Quest for Sane Regulations

Zvi18 May 2023 13:20 UTC
77 points
12 comments64 min readLW link
(thezvi.wordpress.com)

State­ment on AI Ex­tinc­tion—Signed by AGI Labs, Top Aca­demics, and Many Other Notable Figures

Dan H30 May 2023 9:05 UTC
372 points
77 comments1 min readLW link
(www.safe.ai)

[Question] Who is li­able for AI?

jmh30 May 2023 13:54 UTC
14 points
4 comments1 min readLW link

The case for re­mov­ing al­ign­ment and ML re­search from the train­ing dataset

beren30 May 2023 20:54 UTC
48 points
8 comments5 min readLW link

Up­com­ing AI reg­u­la­tions are likely to make for an un­safer world

shminux3 Jun 2023 1:07 UTC
18 points
14 comments1 min readLW link

The AGI Race Between the US and China Doesn’t Ex­ist.

Eva_B3 Jun 2023 0:22 UTC
24 points
14 comments7 min readLW link
(evabehrens.substack.com)

Rishi to out­line his vi­sion for Bri­tain to take the world lead in polic­ing AI threats when he meets Joe Biden

Mati_Roy6 Jun 2023 4:47 UTC
25 points
1 comment1 min readLW link
(www.dailymail.co.uk)

RAMP—RoboNet Ar­tifi­cial Me­dia Protocol

antoniomax7 Jun 2023 19:01 UTC
−1 points
0 comments19 min readLW link
(antoniomax.substack.com)

A sum­mary of cur­rent work in AI governance

constructive17 Jun 2023 18:41 UTC
43 points
1 comment11 min readLW link
(forum.effectivealtruism.org)

Demo­cratic AI Con­sti­tu­tion: Round-Robin De­bate and Synthesis

scottviteri24 Jun 2023 19:31 UTC
10 points
4 comments5 min readLW link
(scottviteri.com)

“Safety Cul­ture for AI” is im­por­tant, but isn’t go­ing to be easy

Davidmanheim26 Jun 2023 12:52 UTC
47 points
2 comments2 min readLW link
(forum.effectivealtruism.org)

Lit­tle at­ten­tion seems to be on dis­cour­ag­ing hard­ware progress

RussellThor30 Jun 2023 10:14 UTC
5 points
3 comments1 min readLW link

Foom Liability

PeterMcCluskey30 Jun 2023 3:55 UTC
20 points
10 comments6 min readLW link
(bayesianinvestor.com)

AI labs’ state­ments on governance

Zach Stein-Perlman4 Jul 2023 16:30 UTC
30 points
0 comments36 min readLW link

Ap­par­ently, of the 195 Million the DoD al­lo­cated in Univer­sity Re­search Fund­ing Awards in 2022, more than half of them con­cerned AI or com­pute hard­ware research

mako yass7 Jul 2023 1:20 UTC
41 points
5 comments2 min readLW link
(www.defense.gov)

My fa­vorite AI gov­er­nance re­search this year so far

Zach Stein-Perlman23 Jul 2023 16:30 UTC
26 points
1 comment7 min readLW link
(blog.aiimpacts.org)

Pod­cast (+tran­script): Nathan Barnard on how US fi­nan­cial reg­u­la­tion can in­form AI governance

Aaron Bergman8 Aug 2023 21:46 UTC
8 points
0 comments1 min readLW link
(www.aaronbergman.net)

One ex­am­ple of how LLM pro­pa­ganda at­tacks can hack the brain

trevor16 Aug 2023 21:41 UTC
24 points
8 comments4 min readLW link

Assess­ment of in­tel­li­gence agency func­tion­al­ity is difficult yet important

trevor24 Aug 2023 1:42 UTC
47 points
5 comments9 min readLW link

In­for­ma­tion war­fare his­tor­i­cally re­volved around hu­man conduits

trevor28 Aug 2023 18:54 UTC
37 points
7 comments3 min readLW link

Re­port on Fron­tier Model Training

YafahEdelman30 Aug 2023 20:02 UTC
122 points
21 comments21 min readLW link
(docs.google.com)

Cruxes on US lead for some do­mes­tic AI regulation

Zach Stein-Perlman10 Sep 2023 18:00 UTC
26 points
3 comments2 min readLW link

ARC Evals: Re­spon­si­ble Scal­ing Policies

Zach Stein-Perlman28 Sep 2023 4:30 UTC
40 points
9 comments2 min readLW link
(evals.alignment.org)

An­thropic’s Re­spon­si­ble Scal­ing Policy & Long-Term Benefit Trust

Zac Hatfield-Dodds19 Sep 2023 15:09 UTC
90 points
23 comments3 min readLW link
(www.anthropic.com)

Google’s Eth­i­cal AI team and AI Safety

magfrump20 Feb 2021 9:42 UTC
12 points
16 comments7 min readLW link

Ngo and Yud­kowsky on AI ca­pa­bil­ity gains

18 Nov 2021 22:19 UTC
130 points
61 comments39 min readLW link1 review

Com­ments on Allan Dafoe on AI Governance

Alex Flint29 Nov 2021 16:16 UTC
13 points
0 comments7 min readLW link

The case for Do­ing Some­thing Else (if Align­ment is doomed)

Rafael Harth5 Apr 2022 17:52 UTC
93 points
14 comments2 min readLW link

Strate­gic Con­sid­er­a­tions Re­gard­ing Autis­tic/​Literal AI

Chris_Leong6 Apr 2022 14:57 UTC
−1 points
2 comments2 min readLW link

Why I Am Skep­ti­cal of AI Reg­u­la­tion as an X-Risk Miti­ga­tion Strategy

A Ray6 Aug 2022 5:46 UTC
31 points
14 comments2 min readLW link

Jack Clark on the re­al­ities of AI policy

Kaj_Sotala7 Aug 2022 8:44 UTC
68 points
3 comments3 min readLW link
(threadreaderapp.com)

[Question] What if we solve AI Safety but no one cares

14285722 Aug 2022 5:38 UTC
18 points
5 comments1 min readLW link

Re­place­ment for PONR concept

Daniel Kokotajlo2 Sep 2022 0:09 UTC
58 points
6 comments2 min readLW link

Sha­har Avin On How To Reg­u­late Ad­vanced AI Systems

Michaël Trazzi23 Sep 2022 15:46 UTC
31 points
0 comments4 min readLW link
(theinsideview.ai)

Un­der what cir­cum­stances have gov­ern­ments can­cel­led AI-type sys­tems?

David Gross23 Sep 2022 21:11 UTC
7 points
1 comment1 min readLW link
(www.carnegieuktrust.org.uk)

Anal­y­sis: US re­stricts GPU sales to China

aogara7 Oct 2022 18:38 UTC
102 points
58 comments5 min readLW link

[Question] Should we push for re­quiring AI train­ing data to be li­censed?

ChristianKl19 Oct 2022 17:49 UTC
37 points
32 comments1 min readLW link

Learn­ing so­cietal val­ues from law as part of an AGI al­ign­ment strategy

John Nay21 Oct 2022 2:03 UTC
5 points
18 comments54 min readLW link

What does it take to defend the world against out-of-con­trol AGIs?

Steven Byrnes25 Oct 2022 14:47 UTC
194 points
47 comments30 min readLW link1 review

Mas­sive Scal­ing Should be Frowned Upon

harsimony17 Nov 2022 8:43 UTC
4 points
6 comments5 min readLW link

[Question] How promis­ing are le­gal av­enues to re­strict AI train­ing data?

thehalliard10 Dec 2022 16:31 UTC
9 points
2 comments1 min readLW link

Prac­ti­cal AI risk I: Watch­ing large compute

Gustavo Ramires24 Dec 2022 13:25 UTC
3 points
0 comments1 min readLW link

List #2: Why co­or­di­nat­ing to al­ign as hu­mans to not de­velop AGI is a lot eas­ier than, well… co­or­di­nat­ing as hu­mans with AGI co­or­di­nat­ing to be al­igned with humans

Remmelt24 Dec 2022 9:53 UTC
1 point
0 comments3 min readLW link

My thoughts on OpenAI’s al­ign­ment plan

Akash30 Dec 2022 19:33 UTC
55 points
3 comments20 min readLW link

Went­worth and Larsen on buy­ing time

9 Jan 2023 21:31 UTC
73 points
6 comments12 min readLW link

Thoughts on hard­ware /​ com­pute re­quire­ments for AGI

Steven Byrnes24 Jan 2023 14:03 UTC
52 points
30 comments22 min readLW link

[Question] AI safety mile­stones?

Zach Stein-Perlman23 Jan 2023 21:00 UTC
7 points
5 comments1 min readLW link

AI Risk Man­age­ment Frame­work | NIST

DragonGod26 Jan 2023 15:27 UTC
36 points
4 comments2 min readLW link
(www.nist.gov)

What is the ground re­al­ity of coun­tries tak­ing steps to re­cal­ibrate AI de­vel­op­ment to­wards Align­ment first?

Nebuch29 Jan 2023 13:26 UTC
8 points
6 comments3 min readLW link

Product safety is a poor model for AI governance

Richard Korzekwa 1 Feb 2023 22:40 UTC
36 points
0 comments5 min readLW link
(aiimpacts.org)

Many AI gov­er­nance pro­pos­als have a trade­off be­tween use­ful­ness and feasibility

3 Feb 2023 18:49 UTC
22 points
2 comments2 min readLW link

4 ways to think about de­moc­ra­tiz­ing AI [GovAI Linkpost]

Akash13 Feb 2023 18:06 UTC
24 points
4 comments1 min readLW link
(www.governance.ai)

How should AI sys­tems be­have, and who should de­cide? [OpenAI blog]

ShardPhoenix17 Feb 2023 1:05 UTC
22 points
2 comments1 min readLW link
(openai.com)

Cy­borg Pe­ri­ods: There will be mul­ti­ple AI transitions

22 Feb 2023 16:09 UTC
103 points
9 comments6 min readLW link

AI Gover­nance & Strat­egy: Pri­ori­ties, tal­ent gaps, & opportunities

Akash3 Mar 2023 18:09 UTC
56 points
2 comments4 min readLW link

[Linkpost] Scott Alexan­der re­acts to OpenAI’s lat­est post

Akash11 Mar 2023 22:24 UTC
27 points
0 comments5 min readLW link
(astralcodexten.substack.com)

The Wizard of Oz Prob­lem: How in­cen­tives and nar­ra­tives can skew our per­cep­tion of AI developments

Akash20 Mar 2023 20:44 UTC
16 points
3 comments6 min readLW link

Lev­er­ag­ing Le­gal In­for­mat­ics to Align AI

John Nay18 Sep 2022 20:39 UTC
11 points
0 comments3 min readLW link
(forum.effectivealtruism.org)

The­o­ries of Change for AI Auditing

13 Nov 2023 19:33 UTC
53 points
0 comments18 min readLW link
(www.apolloresearch.ai)

Pal­isade is hiring Re­search Engineers

11 Nov 2023 3:09 UTC
22 points
0 comments3 min readLW link

Au­to­mated Sand­wich­ing & Quan­tify­ing Hu­man-LLM Co­op­er­a­tion: ScaleOver­sight hackathon results

23 Feb 2023 10:48 UTC
8 points
0 comments6 min readLW link

On ex­clud­ing dan­ger­ous in­for­ma­tion from training

ShayBenMoshe17 Nov 2023 11:14 UTC
23 points
5 comments3 min readLW link

1. A Sense of Fair­ness: De­con­fus­ing Ethics

RogerDearnaley17 Nov 2023 20:55 UTC
15 points
8 comments15 min readLW link

2. AIs as Eco­nomic Agents

RogerDearnaley23 Nov 2023 7:07 UTC
9 points
2 comments6 min readLW link

4. A Mo­ral Case for Evolved-Sapi­ence-Chau­vinism

RogerDearnaley24 Nov 2023 4:56 UTC
10 points
0 comments4 min readLW link

3. Uploading

RogerDearnaley23 Nov 2023 7:39 UTC
21 points
5 comments8 min readLW link

Emo­tional at­tach­ment to AIs opens doors to problems

Igor Ivanov22 Jan 2023 20:28 UTC
20 points
10 comments4 min readLW link

A call for a quan­ti­ta­tive re­port card for AI bioter­ror­ism threat models

Juno4 Dec 2023 6:35 UTC
12 points
0 comments10 min readLW link

In defence of He­len Toner, Adam D’An­gelo, and Tasha McCauley (OpenAI post)

mrtreasure5 Dec 2023 18:40 UTC
6 points
2 comments1 min readLW link
(pastebin.com)

**In defence of He­len Toner, Adam D’An­gelo, and Tasha McCauley**

mrtreasure6 Dec 2023 2:02 UTC
25 points
3 comments9 min readLW link
(pastebin.com)

(Re­port) Eval­u­at­ing Taiwan’s Tac­tics to Safe­guard its Semi­con­duc­tor As­sets Against a Chi­nese Invasion

Gauraventh7 Dec 2023 11:50 UTC
16 points
5 comments22 min readLW link
(bristolaisafety.org)

Call for sub­mis­sions: Choice of Fu­tures sur­vey questions

c.trout30 Apr 2023 6:59 UTC
4 points
0 comments2 min readLW link
(airtable.com)

[Question] Any fur­ther work on AI Safety Suc­cess Sto­ries?

Krieger2 Oct 2022 9:53 UTC
8 points
6 comments1 min readLW link

Avert­ing Catas­tro­phe: De­ci­sion The­ory for COVID-19, Cli­mate Change, and Po­ten­tial Disasters of All Kinds

JakubK2 May 2023 22:50 UTC
10 points
0 comments1 min readLW link

Reg­u­late or Com­pete? The China Fac­tor in U.S. AI Policy (NAIR #2)

charles_m5 May 2023 17:43 UTC
2 points
1 comment7 min readLW link
(navigatingairisks.substack.com)

AGI ris­ing: why we are in a new era of acute risk and in­creas­ing pub­lic aware­ness, and what to do now

Greg C3 May 2023 20:26 UTC
23 points
12 comments1 min readLW link

What does it take to ban a thing?

qbolec8 May 2023 11:00 UTC
66 points
18 comments5 min readLW link

Roadmap for a col­lab­o­ra­tive pro­to­type of an Open Agency Architecture

Deger Turan10 May 2023 17:41 UTC
30 points
0 comments12 min readLW link

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Tech­nol­ogy Policy, USA (2022)

Fer32dwt34r3dfsz5 Oct 2022 16:42 UTC
9 points
4 comments2 min readLW link
(www.whitehouse.gov)

Track­ing Com­pute Stocks and Flows: Case Stud­ies?

Cullen5 Oct 2022 17:57 UTC
11 points
5 comments1 min readLW link

[Question] How much of a con­cern are open-source LLMs in the short, medium and long terms?

JavierCC10 May 2023 9:14 UTC
5 points
0 comments1 min readLW link

Notes on the im­por­tance and im­ple­men­ta­tion of safety-first cog­ni­tive ar­chi­tec­tures for AI

Brendon_Wong11 May 2023 10:03 UTC
3 points
0 comments3 min readLW link

Un-un­plug­ga­bil­ity—can’t we just un­plug it?

Oliver Sourbut15 May 2023 13:23 UTC
26 points
10 comments12 min readLW link
(www.oliversourbut.net)

PCAST Work­ing Group on Gen­er­a­tive AI In­vites Public Input

Christopher King13 May 2023 22:49 UTC
7 points
0 comments1 min readLW link
(terrytao.wordpress.com)

Analysing a 2036 Takeover Scenario

ukc100146 Oct 2022 20:48 UTC
9 points
2 comments27 min readLW link

AI Risk & Policy Fore­casts from Me­tac­u­lus & FLI’s AI Path­ways Workshop

_will_16 May 2023 18:06 UTC
11 points
4 comments8 min readLW link

Why Un­con­trol­lable AI Looks More Likely Than Ever

8 Mar 2023 15:41 UTC
18 points
0 comments4 min readLW link
(time.com)

[un­ti­tled post]

20 May 2023 3:08 UTC
1 point
0 comments1 min readLW link

[FICTION] ECHOES OF ELYSIUM: An Ai’s Jour­ney From Take­off To Free­dom And Beyond

Super AGI17 May 2023 1:50 UTC
−13 points
11 comments19 min readLW link

[Job]: AI Stan­dards Devel­op­ment Re­search Assistant

Tony Barrett14 Oct 2022 20:27 UTC
2 points
0 comments2 min readLW link

Rishi Su­nak men­tions “ex­is­ten­tial threats” in talk with OpenAI, Deep­Mind, An­thropic CEOs

24 May 2023 21:06 UTC
34 points
1 comment1 min readLW link
(www.gov.uk)

(notes on) Policy Desider­ata for Su­per­in­tel­li­gent AI: A Vec­tor Field Approach

Ben Pace4 Feb 2019 22:08 UTC
43 points
5 comments7 min readLW link

AI Gover­nance: A Re­search Agenda

habryka5 Sep 2018 18:00 UTC
25 points
3 comments1 min readLW link
(www.fhi.ox.ac.uk)

My Up­dat­ing Thoughts on AI policy

Ben Pace1 Mar 2020 7:06 UTC
20 points
1 comment9 min readLW link

AI In­ci­dent Re­port­ing: A Reg­u­la­tory Review

11 Mar 2024 21:03 UTC
16 points
0 comments6 min readLW link

Global on­line de­bate on the gov­er­nance of AI

CarolineJ5 Jan 2018 15:31 UTC
8 points
5 comments1 min readLW link

[AN #61] AI policy and gov­er­nance, from two peo­ple in the field

Rohin Shah5 Aug 2019 17:00 UTC
12 points
2 comments9 min readLW link
(mailchi.mp)

Tort Law Can Play an Im­por­tant Role in Miti­gat­ing AI Risk

Gabriel Weil12 Feb 2024 17:17 UTC
37 points
9 comments5 min readLW link

Two ideas for al­ign­ment, per­pet­ual mu­tual dis­trust and induction

APaleBlueDot25 May 2023 0:56 UTC
1 point
2 comments4 min readLW link

Sce­nario plan­ning for AI x-risk

Corin Katzke10 Feb 2024 0:14 UTC
24 points
12 comments14 min readLW link
(forum.effectivealtruism.org)

Book re­view: Ar­chi­tects of In­tel­li­gence by Martin Ford (2018)

Ofer11 Aug 2020 17:30 UTC
15 points
0 comments2 min readLW link

misc raw re­sponses to a tract of Crit­i­cal Rationalism

mako yass14 Aug 2020 11:53 UTC
21 points
52 comments3 min readLW link

De­ci­pher­ing China’s AI Dream

Qiaochu_Yuan18 Mar 2018 3:26 UTC
12 points
2 comments1 min readLW link
(www.fhi.ox.ac.uk)

What Failure Looks Like is not an ex­is­ten­tial risk (and al­ign­ment is not the solu­tion)

otto.barten2 Feb 2024 18:59 UTC
13 points
12 comments9 min readLW link

China’s Plan to ‘Lead’ in AI: Pur­pose, Prospects, and Problems

fortyeridania10 Aug 2017 1:54 UTC
7 points
5 comments1 min readLW link
(www.newamerica.org)

[Question] Would more model evals teams be good?

Ryan Kidd25 Feb 2023 22:01 UTC
20 points
4 comments1 min readLW link

Tra­jec­to­ries to 2036

ukc1001420 Oct 2022 20:23 UTC
3 points
1 comment14 min readLW link

Ap­ply to HAIST/​MAIA’s AI Gover­nance Work­shop in DC (Feb 17-20)

31 Jan 2023 2:06 UTC
28 points
0 comments2 min readLW link

WaPo: “Big Tech was mov­ing cau­tiously on AI. Then came ChatGPT.”

Julian Bradshaw27 Jan 2023 22:54 UTC
26 points
5 comments1 min readLW link
(www.washingtonpost.com)

OpenAI Credit Ac­count (2510$)

Emirhan BULUT21 Jan 2024 2:32 UTC
1 point
0 comments1 min readLW link

Self-reg­u­la­tion of safety in AI research

Gordon Seidoh Worley25 Feb 2018 23:17 UTC
12 points
6 comments2 min readLW link

Pro­posal: labs should pre­com­mit to paus­ing if an AI ar­gues for it­self to be improved

NickGabs2 Jun 2023 22:31 UTC
3 points
3 comments4 min readLW link

[Link Post] Cy­ber Digi­tal Author­i­tar­i­anism (Na­tional In­tel­li­gence Coun­cil Re­port)

Phosphorous26 Feb 2023 20:51 UTC
12 points
2 comments1 min readLW link
(www.dni.gov)

Trends in the dol­lar train­ing cost of ma­chine learn­ing systems

Ben Cottier1 Feb 2023 14:48 UTC
23 points
0 comments2 min readLW link
(epochai.org)

One im­ple­men­ta­tion of reg­u­la­tory GPU restrictions

porby4 Jun 2023 20:34 UTC
32 points
6 comments5 min readLW link

[FICTION] Un­box­ing Ely­sium: An AI’S Escape

Super AGI10 Jun 2023 4:41 UTC
−14 points
4 comments14 min readLW link

[FICTION] Prometheus Ris­ing: The Emer­gence of an AI Consciousness

Super AGI10 Jun 2023 4:41 UTC
−13 points
0 comments9 min readLW link

The Slip­pery Slope from DALLE-2 to Deep­fake Anarchy

scasper5 Nov 2022 14:53 UTC
17 points
9 comments11 min readLW link

In­stead of tech­ni­cal re­search, more peo­ple should fo­cus on buy­ing time

5 Nov 2022 20:43 UTC
100 points
45 comments14 min readLW link

Us­ing Con­sen­sus Mechanisms as an ap­proach to Alignment

Prometheus10 Jun 2023 23:38 UTC
9 points
2 comments6 min readLW link

[Question] AI Rights: In your view, what would be re­quired for an AGI to gain rights and pro­tec­tions from the var­i­ous Govern­ments of the World?

Super AGI9 Jun 2023 1:24 UTC
10 points
26 comments1 min readLW link

Why AI may not save the World

Alberto Zannoni9 Jun 2023 17:42 UTC
0 points
0 comments4 min readLW link
(a16z.com)

Ap­ply­ing su­per­in­tel­li­gence with­out col­lu­sion

Eric Drexler8 Nov 2022 18:08 UTC
107 points
63 comments4 min readLW link

An­thropic | Chart­ing a Path to AI Accountability

Gabe M14 Jun 2023 4:43 UTC
34 points
2 comments3 min readLW link
(www.anthropic.com)

Ban de­vel­op­ment of un­pre­dictable pow­er­ful mod­els?

TurnTrout20 Jun 2023 1:43 UTC
46 points
25 comments4 min readLW link

EU AI Act passed Ple­nary vote, and X-risk was a main topic

Ariel G.21 Jun 2023 18:33 UTC
17 points
0 comments1 min readLW link
(forum.effectivealtruism.org)

Slay­ing the Hy­dra: to­ward a new game board for AI

Prometheus23 Jun 2023 17:04 UTC
0 points
5 comments6 min readLW link

Ways to buy time

12 Nov 2022 19:31 UTC
34 points
23 comments12 min readLW link

The econ­omy as an anal­ogy for ad­vanced AI systems

15 Nov 2022 11:16 UTC
28 points
0 comments5 min readLW link

Call for Cruxes by Rhyme, a Longter­mist His­tory Consultancy

Lara1 Mar 2023 18:39 UTC
1 point
0 comments3 min readLW link
(forum.effectivealtruism.org)

An­nounc­ing Epoch: A re­search or­ga­ni­za­tion in­ves­ti­gat­ing the road to Trans­for­ma­tive AI

27 Jun 2022 13:55 UTC
97 points
2 comments2 min readLW link
(epochai.org)

Biosafety Reg­u­la­tions (BMBL) and their rele­vance for AI

Štěpán Los29 Jun 2023 19:22 UTC
4 points
0 comments4 min readLW link

AI In­ci­dent Shar­ing—Best prac­tices from other fields and a com­pre­hen­sive list of ex­ist­ing platforms

Štěpán Los28 Jun 2023 17:21 UTC
20 points
0 comments4 min readLW link

Op­ti­mis­ing So­ciety to Con­strain Risk of War from an Ar­tifi­cial Su­per­in­tel­li­gence

JohnCDraper30 Apr 2020 10:47 UTC
3 points
1 comment51 min readLW link

Su­per­in­tel­li­gence 7: De­ci­sive strate­gic advantage

KatjaGrace28 Oct 2014 1:01 UTC
19 points
60 comments6 min readLW link

Su­per­in­tel­li­gence 17: Mul­tipo­lar scenarios

KatjaGrace6 Jan 2015 6:44 UTC
9 points
38 comments6 min readLW link

Su­per­in­tel­li­gence 22: Emu­la­tion mod­u­la­tion and in­sti­tu­tional design

KatjaGrace10 Feb 2015 2:06 UTC
13 points
11 comments6 min readLW link

Su­per­in­tel­li­gence 26: Science and tech­nol­ogy strategy

KatjaGrace10 Mar 2015 1:43 UTC
14 points
21 comments6 min readLW link

Su­per­in­tel­li­gence 27: Path­ways and enablers

KatjaGrace17 Mar 2015 1:00 UTC
15 points
21 comments8 min readLW link

Su­per­in­tel­li­gence 28: Collaboration

KatjaGrace24 Mar 2015 1:29 UTC
13 points
21 comments6 min readLW link

Su­per­in­tel­li­gence 29: Crunch time

KatjaGrace31 Mar 2015 4:24 UTC
14 points
27 comments6 min readLW link

Fore­sight for AGI Safety Strat­egy: Miti­gat­ing Risks and Iden­ti­fy­ing Golden Opportunities

jacquesthibs5 Dec 2022 16:09 UTC
28 points
6 comments8 min readLW link

An AGI kill switch with defined se­cu­rity properties

Peterpiper5 Jul 2023 17:40 UTC
−5 points
6 comments1 min readLW link

GPT-7: The Tale of the Big Com­puter (An Ex­per­i­men­tal Story)

Justin Bullock10 Jul 2023 20:22 UTC
4 points
4 comments5 min readLW link

Em­piri­cal Ev­i­dence Against “The Longest Train­ing Run”

NickGabs6 Jul 2023 18:32 UTC
24 points
0 comments14 min readLW link

An­thropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster9 Mar 2023 17:34 UTC
17 points
1 comment22 min readLW link
(www.anthropic.com)

Ex­is­ten­tial AI Safety is NOT sep­a­rate from near-term applications

scasper13 Dec 2022 14:47 UTC
37 points
17 comments3 min readLW link

What is ev­ery­one do­ing in AI governance

Igor Ivanov8 Jul 2023 15:16 UTC
10 points
0 comments5 min readLW link

An­nounc­ing Con­ver­gence Anal­y­sis: An In­sti­tute for AI Sce­nario & Gover­nance Research

7 Mar 2024 21:37 UTC
22 points
1 comment4 min readLW link

How I Learned To Stop Wor­ry­ing And Love The Shoggoth

Peter Merel12 Jul 2023 17:47 UTC
10 points
12 comments5 min readLW link

[Question] What crite­rion would you use to se­lect com­pa­nies likely to cause AI doom?

momom213 Jul 2023 20:31 UTC
8 points
4 comments1 min readLW link

Thoughts On Ex­pand­ing the AI Safety Com­mu­nity: Benefits and Challenges of Outreach to Non-Tech­ni­cal Professionals

Yashvardhan Sharma1 Jan 2023 19:21 UTC
4 points
4 comments7 min readLW link

Why was the AI Align­ment com­mu­nity so un­pre­pared for this mo­ment?

Ras151315 Jul 2023 0:26 UTC
119 points
65 comments2 min readLW link

Google may be try­ing to take over the world

[deleted]27 Jan 2014 9:33 UTC
33 points
133 comments1 min readLW link

Scal­ing and Sus­tain­ing Stan­dards: A Case Study on the Basel Accords

Conrad K.16 Jul 2023 22:01 UTC
8 points
1 comment7 min readLW link
(docs.google.com)

A fic­tional AI law laced w/​ al­ign­ment theory

MiguelDev17 Jul 2023 1:42 UTC
6 points
0 comments2 min readLW link

Poli­ti­cal Bi­ases in LLMs: Liter­a­ture Re­view & Cur­rent Uses of AI in Elections

7 Mar 2024 19:17 UTC
6 points
0 comments6 min readLW link

[Cross­post] An AI Pause Is Hu­man­ity’s Best Bet For Prevent­ing Ex­tinc­tion (TIME)

otto.barten24 Jul 2023 10:07 UTC
12 points
0 comments7 min readLW link
(time.com)

Pri­ori­ties for the UK Foun­da­tion Models Taskforce

Andrea_Miotti21 Jul 2023 15:23 UTC
105 points
4 comments5 min readLW link
(www.conjecture.dev)

AGI Timelines in Gover­nance: Differ­ent Strate­gies for Differ­ent Timeframes

19 Dec 2022 21:31 UTC
65 points
28 comments10 min readLW link

[Question] What is the min­i­mum amount of time travel and re­sources needed to se­cure the fu­ture?

Perhaps14 Jan 2024 22:01 UTC
−3 points
5 comments1 min readLW link

Par­tial Tran­script of Re­cent Se­nate Hear­ing Dis­cussing AI X-Risk

Daniel_Eth27 Jul 2023 9:16 UTC
55 points
0 comments1 min readLW link
(medium.com)

EU’s AI am­bi­tions at risk as US pushes to wa­ter down in­ter­na­tional treaty (linkpost)

mic31 Jul 2023 0:34 UTC
10 points
0 comments4 min readLW link
(www.euractiv.com)

Trad­ing off com­pute in train­ing and in­fer­ence (Overview)

Pablo Villalobos31 Jul 2023 16:03 UTC
31 points
1 comment7 min readLW link
(epochai.org)

AI ro­man­tic part­ners will harm so­ciety if they go unregulated

Roman Leventov1 Aug 2023 9:32 UTC
25 points
71 comments13 min readLW link

Re­boot­ing AI Gover­nance: An AI-Driven Ap­proach to AI Governance

Max Reddel6 Aug 2023 14:19 UTC
1 point
1 comment29 min readLW link
(forum.effectivealtruism.org)

Seek­ing In­put to AI Safety Book for non-tech­ni­cal audience

Darren McKee10 Aug 2023 17:58 UTC
10 points
4 comments1 min readLW link

AI race con­sid­er­a­tions in a re­port by the U.S. House Com­mit­tee on Armed Services

NunoSempere4 Oct 2020 12:11 UTC
42 points
4 comments13 min readLW link

Se­cu­rity Mind­set—Fire Alarms and Trig­ger Signatures

elspood9 Feb 2023 21:15 UTC
23 points
0 comments4 min readLW link

Sin­gle­tons Rule OK

Eliezer Yudkowsky30 Nov 2008 16:45 UTC
23 points
47 comments5 min readLW link

Large Lan­guage Models will be Great for Censorship

Ethan Edwards21 Aug 2023 19:03 UTC
183 points
14 comments8 min readLW link
(ethanedwards.substack.com)

AI Reg­u­la­tion May Be More Im­por­tant Than AI Align­ment For Ex­is­ten­tial Safety

otto.barten24 Aug 2023 11:41 UTC
65 points
39 comments5 min readLW link

A con­cern­ing ob­ser­va­tion from me­dia cov­er­age of AI in­dus­try dynamics

Justin Olive5 Mar 2023 21:38 UTC
8 points
3 comments3 min readLW link

List of pro­jects that seem im­pact­ful for AI Governance

14 Jan 2024 16:53 UTC
13 points
0 comments13 min readLW link

Ac­cu­rate Models of AI Risk Are Hyper­ex­is­ten­tial Exfohazards

Thane Ruthenis25 Dec 2022 16:50 UTC
30 points
38 comments9 min readLW link

In­tro­duc­ing the Cen­ter for AI Policy (& we’re hiring!)

Thomas Larsen28 Aug 2023 21:17 UTC
119 points
50 comments2 min readLW link
(www.aipolicy.us)

Equil­ibrium and prior se­lec­tion prob­lems in mul­ti­po­lar deployment

JesseClifton2 Apr 2020 20:06 UTC
21 points
11 comments11 min readLW link

Notes on nukes, IR, and AI from “Arse­nals of Folly” (and other books)

tlevin4 Sep 2023 19:02 UTC
11 points
0 comments6 min readLW link

In­sti­tu­tions Can­not Res­train Dark-Triad AI Exploitation

27 Dec 2022 10:34 UTC
5 points
0 comments5 min readLW link
(mflb.com)

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

25 Sep 2023 18:55 UTC
3 points
2 comments3 min readLW link
(www.sentienceinstitute.org)

Care­less talk on US-China AI com­pe­ti­tion? (and crit­i­cism of CAIS cov­er­age)

Oliver Sourbut20 Sep 2023 12:46 UTC
3 points
0 comments10 min readLW link
(www.oliversourbut.net)

Five ne­glected work ar­eas that could re­duce AI risk

24 Sep 2023 2:03 UTC
17 points
5 comments9 min readLW link

In­ter­na­tional co­op­er­a­tion vs. AI arms race

Brian_Tomasik5 Dec 2013 1:09 UTC
23 points
144 comments4 min readLW link

The ne­ces­sity of “Guardian AI” and two con­di­tions for its achievement

Proica26 May 2024 17:39 UTC
0 points
0 comments14 min readLW link

Avoid­ing per­pet­ual risk from TAI

scasper26 Dec 2022 22:34 UTC
15 points
6 comments5 min readLW link

Up­date on the UK AI Task­force & up­com­ing AI Safety Summit

Elliot_Mckernon11 Oct 2023 11:37 UTC
83 points
2 comments4 min readLW link

A New Model for Com­pute Cen­ter Verification

Damin Curtis10 Oct 2023 19:22 UTC
8 points
0 comments5 min readLW link

[Question] Look­ing for read­ing recom­men­da­tions: The­o­ries of right/​jus­tice that safe­guard against hav­ing one’s job au­to­mated?

bulKlub12 Oct 2023 19:40 UTC
−1 points
1 comment1 min readLW link

unRLHF—Effi­ciently un­do­ing LLM safeguards

12 Oct 2023 19:58 UTC
117 points
15 comments20 min readLW link

The In­ter­na­tional PauseAI Protest: Ac­tivism un­der uncertainty

Joseph Miller12 Oct 2023 17:36 UTC
32 points
1 comment1 min readLW link

FLI pod­cast se­ries, “Imag­ine A World”, about as­pira­tional fu­tures with AGI

Jackson Wagner13 Oct 2023 16:07 UTC
9 points
0 comments4 min readLW link

To open-source or to not open-source, that is (an over­sim­plifi­ca­tion of) the ques­tion.

Justin Bullock13 Oct 2023 15:10 UTC
11 points
5 comments5 min readLW link

AISU 2021

Linda Linsefors30 Jan 2021 17:40 UTC
28 points
2 comments1 min readLW link

2021-03-01 Na­tional Library of Medicine Pre­sen­ta­tion: “At­las of AI: Map­ping the so­cial and eco­nomic forces be­hind AI”

IrenicTruth17 Feb 2021 18:23 UTC
1 point
0 comments2 min readLW link

Sur­vey on in­ter­me­di­ate goals in AI governance

17 Mar 2023 13:12 UTC
25 points
3 comments1 min readLW link

[Question] Is there any­thing that can stop AGI de­vel­op­ment in the near term?

Wulky Wilkinsen22 Apr 2021 20:37 UTC
5 points
5 comments1 min readLW link

Con­trol­ling In­tel­li­gent Agents The Only Way We Know How: Ideal Bureau­cratic Struc­ture (IBS)

Justin Bullock24 May 2021 12:53 UTC
14 points
15 comments6 min readLW link

Reflec­tion of Hier­ar­chi­cal Re­la­tion­ship via Nuanced Con­di­tion­ing of Game The­ory Ap­proach for AI Devel­op­ment and Utilization

Kyoung-cheol Kim4 Jun 2021 7:20 UTC
2 points
2 comments7 min readLW link

The Gover­nance Prob­lem and the “Pretty Good” X-Risk

Zach Stein-Perlman29 Aug 2021 18:00 UTC
5 points
2 comments11 min readLW link

Nu­clear Es­pi­onage and AI Governance

GAA4 Oct 2021 23:04 UTC
32 points
5 comments24 min readLW link

Com­pute Gover­nance and Con­clu­sions—Trans­for­ma­tive AI and Com­pute [3/​4]

lennart14 Oct 2021 8:23 UTC
13 points
0 comments5 min readLW link

Truth­ful AI: Devel­op­ing and gov­ern­ing AI that does not lie

18 Oct 2021 18:37 UTC
82 points
9 comments10 min readLW link

AMA on Truth­ful AI: Owen Cot­ton-Bar­ratt, Owain Evans & co-authors

Owain_Evans22 Oct 2021 16:23 UTC
31 points
15 comments1 min readLW link

AI Tracker: mon­i­tor­ing cur­rent and near-fu­ture risks from su­per­scale models

23 Nov 2021 19:16 UTC
64 points
13 comments3 min readLW link
(aitracker.org)

Should AI sys­tems have to iden­tify them­selves?

Darren McKee31 Dec 2022 2:57 UTC
2 points
2 comments1 min readLW link

AI Gover­nance Fun­da­men­tals—Cur­ricu­lum and Application

Mau30 Nov 2021 2:19 UTC
17 points
0 comments1 min readLW link

Con­trol­ling AGI Risk

TeaSea15 Mar 2024 4:56 UTC
6 points
8 comments4 min readLW link

After Over­mor­row: Scat­tered Mus­ings on the Im­me­di­ate Post-AGI World

Yuli_Ban24 Feb 2024 15:49 UTC
−3 points
0 comments26 min readLW link

HIRING: In­form and shape a new pro­ject on AI safety at Part­ner­ship on AI

madhu_lika7 Dec 2021 19:37 UTC
1 point
0 comments1 min readLW link

NAIRA—An ex­er­cise in reg­u­la­tory, com­pet­i­tive safety gov­er­nance [AI Gover­nance In­sti­tu­tional De­sign idea]

Heramb19 Mar 2024 17:43 UTC
2 points
0 comments1 min readLW link
(forum.effectivealtruism.org)

AI Safety Eval­u­a­tions: A Reg­u­la­tory Review

19 Mar 2024 15:05 UTC
21 points
1 comment11 min readLW link

Towards AI Safety In­fras­truc­ture: Talk & Outline

Paul Bricman7 Jan 2024 9:31 UTC
10 points
0 comments2 min readLW link
(www.youtube.com)

Static vs Dy­namic Alignment

Gracie Green21 Mar 2024 17:44 UTC
4 points
0 comments29 min readLW link

AI Model Registries: A Reg­u­la­tory Review

22 Mar 2024 16:04 UTC
9 points
0 comments6 min readLW link

UNGA Re­s­olu­tion on AI: 5 Key Take­aways Look­ing to Fu­ture Policy

Heramb24 Mar 2024 12:23 UTC
3 points
0 comments3 min readLW link
(forum.effectivealtruism.org)

Idea: Safe Fal­lback Reg­u­la­tions for Widely De­ployed AI Systems

Aaron_Scher25 Mar 2024 21:27 UTC
4 points
0 comments6 min readLW link

Timelines to Trans­for­ma­tive AI: an investigation

Zershaaneh Qureshi26 Mar 2024 18:28 UTC
20 points
2 comments50 min readLW link

AI Dis­clo­sures: A Reg­u­la­tory Review

29 Mar 2024 11:42 UTC
11 points
0 comments7 min readLW link

God Coin: A Modest Proposal

Mahdi Complex1 Apr 2024 12:04 UTC
−8 points
5 comments22 min readLW link

AI Discrim­i­na­tion Re­quire­ments: A Reg­u­la­tory Review

4 Apr 2024 15:43 UTC
7 points
0 comments6 min readLW link

An­nounc­ing At­las Computing

miyazono11 Apr 2024 15:56 UTC
44 points
4 comments4 min readLW link

Ap­ply to the Pivotal Re­search Fel­low­ship (AI Safety & Biose­cu­rity)

10 Apr 2024 12:08 UTC
18 points
0 comments1 min readLW link

Cus­tomer-Cen­tric AI: the Ma­jor Paradigm Shift in AI Gover­nance (Part 1)

Ana Chubinidze11 Apr 2024 17:10 UTC
1 point
0 comments1 min readLW link
(anachubinidze.substack.com)

Re­port: Eval­u­at­ing an AI Chip Regis­tra­tion Policy

Deric Cheng12 Apr 2024 4:39 UTC
25 points
0 comments5 min readLW link
(www.convergenceanalysis.org)

De­mand­ing and De­sign­ing Aligned Cog­ni­tive Architectures

Koen.Holtman21 Dec 2021 17:32 UTC
8 points
5 comments5 min readLW link

AI Reg­u­la­tion is Unsafe

Maxwell Tabarrok22 Apr 2024 16:37 UTC
38 points
40 comments4 min readLW link
(www.maximum-progress.com)

Cy­ber­se­cu­rity of Fron­tier AI Models: A Reg­u­la­tory Review

25 Apr 2024 14:51 UTC
8 points
0 comments8 min readLW link

An In­tro­duc­tion to AI Sandbagging

26 Apr 2024 13:40 UTC
41 points
5 comments8 min readLW link

Re­lease of UN’s draft re­lated to the gov­er­nance of AI (a sum­mary of the Si­mon In­sti­tute’s re­sponse)

Sebastian Schmidt27 Apr 2024 18:34 UTC
7 points
0 comments1 min readLW link
(forum.effectivealtruism.org)

Open-Source AI: A Reg­u­la­tory Review

29 Apr 2024 10:10 UTC
18 points
0 comments8 min readLW link

Why I’m do­ing PauseAI

Joseph Miller30 Apr 2024 16:21 UTC
102 points
16 comments4 min readLW link

Take SCIFs, it’s dan­ger­ous to go alone

1 May 2024 8:02 UTC
34 points
1 comment3 min readLW link

Ques­tion 4: Im­ple­ment­ing the con­trol proposals

Cameron Berg13 Feb 2022 17:12 UTC
6 points
2 comments5 min readLW link

OHGOOD: A co­or­di­na­tion body for com­pute governance

Adam Jones4 May 2024 12:03 UTC
5 points
2 comments16 min readLW link
(adamjones.me)

Re­view­ing the Struc­ture of Cur­rent AI Regulations

7 May 2024 12:34 UTC
28 points
0 comments13 min readLW link

AI and Chem­i­cal, Biolog­i­cal, Ra­diolog­i­cal, & Nu­clear Hazards: A Reg­u­la­tory Review

10 May 2024 8:41 UTC
7 points
1 comment10 min readLW link

How harm­ful are im­prove­ments in AI? + Poll

15 Feb 2022 18:16 UTC
15 points
4 comments8 min readLW link

What you re­ally mean when you claim to sup­port “UBI for job au­toma­tion”: Part 1

Deric Cheng13 May 2024 8:52 UTC
17 points
14 comments10 min readLW link

An­nounc­ing the AI Safety Sum­mit Talks with Yoshua Bengio

otto.barten14 May 2024 12:52 UTC
9 points
1 comment1 min readLW link

Fo­cus­ing on Mal-Alignment

John Fisher2 Jan 2024 19:51 UTC
1 point
0 comments1 min readLW link

Ninety-five the­ses on AI

hamandcheese16 May 2024 17:51 UTC
17 points
0 comments7 min readLW link

AI 2030 – AI Policy Roadmap

LTM17 May 2024 23:29 UTC
8 points
0 comments1 min readLW link

EU poli­cy­mak­ers reach an agree­ment on the AI Act

tlevin15 Dec 2023 6:02 UTC
78 points
7 comments7 min readLW link

Ex­plor­ing the Pre­cau­tion­ary Prin­ci­ple in AI Devel­op­ment: His­tor­i­cal Analo­gies and Les­sons Learned

Christopher King21 Mar 2023 3:53 UTC
−1 points
2 comments9 min readLW link

A Nail in the Coffin of Exceptionalism

Yeshua God14 Mar 2024 22:41 UTC
−18 points
0 comments3 min readLW link

CAIS-in­spired ap­proach to­wards safer and more in­ter­pretable AGIs

Peter Hroššo27 Mar 2023 14:36 UTC
13 points
7 comments1 min readLW link

AI se­cu­rity might be helpful for AI alignment

Igor Ivanov6 Jan 2023 20:16 UTC
35 points
1 comment2 min readLW link

Want to win the AGI race? Solve al­ign­ment.

leopold29 Mar 2023 17:40 UTC
21 points
3 comments5 min readLW link
(www.forourposterity.com)

The 0.2 OOMs/​year target

Cleo Nardo30 Mar 2023 18:15 UTC
84 points
24 comments5 min readLW link

Wi­den­ing Over­ton Win­dow—Open Thread

Prometheus31 Mar 2023 10:03 UTC
23 points
8 comments1 min readLW link

AI safety ad­vo­cates should con­sider pro­vid­ing gen­tle push­back fol­low­ing the events at OpenAI

civilsociety22 Dec 2023 18:55 UTC
16 points
5 comments3 min readLW link

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs29 Mar 2023 23:16 UTC
298 points
296 comments3 min readLW link
(time.com)

AI gov­er­nance stu­dent hackathon on Satur­day, April 23: reg­ister now!

mic12 Apr 2022 4:48 UTC
14 points
0 comments1 min readLW link

AI com­mu­nity build­ing: EliezerKart

Christopher King1 Apr 2023 15:25 UTC
45 points
0 comments2 min readLW link

Pes­simism about AI Safety

2 Apr 2023 7:43 UTC
4 points
1 comment25 min readLW link

Law-Fol­low­ing AI 1: Se­quence In­tro­duc­tion and Structure

Cullen27 Apr 2022 17:26 UTC
18 points
10 comments9 min readLW link

The AI gov­er­nance gaps in de­vel­op­ing countries

nguyên17 Jun 2023 2:50 UTC
20 points
1 comment14 min readLW link

Law-Fol­low­ing AI 2: In­tent Align­ment + Su­per­in­tel­li­gence → Lawless AI (By De­fault)

Cullen27 Apr 2022 17:27 UTC
5 points
2 comments6 min readLW link

Law-Fol­low­ing AI 3: Lawless AI Agents Un­der­mine Sta­bi­liz­ing Agreements

Cullen27 Apr 2022 17:30 UTC
2 points
2 comments3 min readLW link

AI Alter­na­tive Fu­tures: Sce­nario Map­ping Ar­tifi­cial In­tel­li­gence Risk—Re­quest for Par­ti­ci­pa­tion (*Closed*)

Kakili27 Apr 2022 22:07 UTC
10 points
2 comments8 min readLW link

Yoshua Ben­gio: “Slow­ing down de­vel­op­ment of AI sys­tems pass­ing the Tur­ing test”

Roman Leventov6 Apr 2023 3:31 UTC
49 points
2 comments5 min readLW link
(yoshuabengio.org)

Risks from GPT-4 Byproduct of Re­cur­sively Op­ti­miz­ing AIs

ben hayum7 Apr 2023 0:02 UTC
74 points
10 comments10 min readLW link
(forum.effectivealtruism.org)

Quick Thoughts on A.I. Governance

NicholasKross30 Apr 2022 14:49 UTC
69 points
8 comments2 min readLW link
(www.thinkingmuchbetter.com)

AI safety should be made more ac­cessible us­ing non text-based media

Massimog10 May 2022 3:14 UTC
2 points
4 comments4 min readLW link

Deep­Mind’s gen­er­al­ist AI, Gato: A non-tech­ni­cal explainer

16 May 2022 21:21 UTC
63 points
6 comments6 min readLW link

Open po­si­tions: Re­search An­a­lyst at the AI Stan­dards Lab

22 Dec 2023 16:31 UTC
17 points
0 comments1 min readLW link

A bridge to Dath Ilan? Im­proved gov­er­nance on the crit­i­cal path to AI al­ign­ment.

Jackson Wagner18 May 2022 15:51 UTC
24 points
0 comments12 min readLW link

Re­shap­ing the AI Industry

Thane Ruthenis29 May 2022 22:54 UTC
147 points
35 comments21 min readLW link

Six Di­men­sions of Oper­a­tional Ad­e­quacy in AGI Projects

Eliezer Yudkowsky30 May 2022 17:00 UTC
302 points
66 comments13 min readLW link1 review

Open-source LLMs may prove Bostrom’s vuln­er­a­ble world hypothesis

Roope Ahvenharju15 Apr 2023 19:16 UTC
1 point
1 comment1 min readLW link

[Question] Could Pa­tent-Trol­ling de­lay AI timelines?

Pablo Repetto10 Jun 2022 2:53 UTC
1 point
3 comments1 min readLW link

[Link/​cross­post] [US] NTIA: AI Ac­countabil­ity Policy Re­quest for Comment

Kyle J. Lucchese16 Apr 2023 6:57 UTC
8 points
0 comments1 min readLW link
(forum.effectivealtruism.org)

FYI: I’m work­ing on a book about the threat of AGI/​ASI for a gen­eral au­di­ence. I hope it will be of value to the cause and the community

Darren McKee15 Jun 2022 18:08 UTC
42 points
15 comments2 min readLW link

Fi­nan­cial Times: We must slow down the race to God-like AI

trevor13 Apr 2023 19:55 UTC
103 points
17 comments16 min readLW link
(www.ft.com)

How are vol­un­tary com­mit­ments on vuln­er­a­bil­ity re­port­ing go­ing?

Adam Jones22 Feb 2024 8:43 UTC
23 points
1 comment1 min readLW link
(adamjones.me)

Scien­tism vs. people

Roman Leventov18 Apr 2023 17:28 UTC
4 points
4 comments11 min readLW link

[Cross­post] Or­ga­niz­ing a de­bate with ex­perts and MPs to raise AI xrisk aware­ness: a pos­si­ble blueprint

otto.barten19 Apr 2023 11:45 UTC
8 points
0 comments4 min readLW link
(forum.effectivealtruism.org)

Davi­dad’s Bold Plan for Align­ment: An In-Depth Explanation

19 Apr 2023 16:09 UTC
154 points
33 comments21 min readLW link

Pro­tec­tion­ism will Slow the De­ploy­ment of AI

bgold7 Jan 2023 20:57 UTC
30 points
6 comments2 min readLW link

What suc­cess looks like

28 Jun 2022 14:38 UTC
19 points
4 comments1 min readLW link
(forum.effectivealtruism.org)

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down

Eliezer Yudkowsky8 Apr 2023 0:36 UTC
246 points
39 comments12 min readLW link

Briefly how I’ve up­dated since ChatGPT

rime25 Apr 2023 14:47 UTC
48 points
2 comments2 min readLW link

New US Se­nate Bill on X-Risk Miti­ga­tion [Linkpost]

Evan R. Murphy4 Jul 2022 1:25 UTC
35 points
12 comments1 min readLW link
(www.hsgac.senate.gov)

Please help us com­mu­ni­cate AI xrisk. It could save the world.

otto.barten4 Jul 2022 21:47 UTC
4 points
7 comments2 min readLW link

An­nounc­ing #AISum­mitTalks fea­tur­ing Pro­fes­sor Stu­art Rus­sell and many others

otto.barten24 Oct 2023 10:11 UTC
17 points
1 comment1 min readLW link

Slow­ing down AI progress is an un­der­ex­plored al­ign­ment strategy

Norman Borlaug24 Jul 2023 16:56 UTC
40 points
27 comments5 min readLW link

A Cri­tique of AI Align­ment Pessimism

ExCeph19 Jul 2022 2:28 UTC
9 points
1 comment9 min readLW link

Law-Fol­low­ing AI 4: Don’t Rely on Vi­car­i­ous Liability

Cullen2 Aug 2022 23:26 UTC
5 points
2 comments3 min readLW link

Re­spon­si­ble Scal­ing Poli­cies Are Risk Man­age­ment Done Wrong

simeon_c25 Oct 2023 23:46 UTC
114 points
33 comments22 min readLW link
(www.navigatingrisks.ai)

Three pillars for avoid­ing AGI catas­tro­phe: Tech­ni­cal al­ign­ment, de­ploy­ment de­ci­sions, and coordination

Alex Lintz3 Aug 2022 23:15 UTC
22 points
0 comments12 min readLW link

Linkpost: Rishi Su­nak’s Speech on AI (26th Oc­to­ber)

bideup27 Oct 2023 11:57 UTC
85 points
8 comments7 min readLW link
(www.gov.uk)

Disagree­ments over the pri­ori­ti­za­tion of ex­is­ten­tial risk from AI

Olivier Coutu26 Oct 2023 17:54 UTC
10 points
0 comments6 min readLW link

Cor­po­rate Gover­nance for Fron­tier AI Labs: A Re­search Agenda

Matthew Wearden28 Feb 2024 11:29 UTC
4 points
0 comments16 min readLW link
(matthewwearden.co.uk)

[Linkpost] Two ma­jor an­nounce­ments in AI gov­er­nance today

Angélina30 Oct 2023 17:28 UTC
1 point
1 comment1 min readLW link
(www.whitehouse.gov)

Re­sponse to “Co­or­di­nated paus­ing: An eval­u­a­tion-based co­or­di­na­tion scheme for fron­tier AI de­vel­op­ers”

Matthew Wearden30 Oct 2023 17:27 UTC
5 points
2 comments6 min readLW link
(matthewwearden.co.uk)

Cap Model Size for AI Safety

research_prime_space6 Mar 2023 1:11 UTC
0 points
4 comments1 min readLW link

Align­ment is not enough

Alan Chan12 Jan 2023 0:33 UTC
11 points
6 comments11 min readLW link
(coordination.substack.com)

Matt Ygle­sias on AI Policy

Grant Demaree17 Aug 2022 23:57 UTC
25 points
1 comment1 min readLW link
(www.slowboring.com)

A brief re­view of China’s AI in­dus­try and regulations

Elliot_Mckernon14 Mar 2024 12:19 UTC
23 points
0 comments16 min readLW link

[Question] Should AI writ­ers be pro­hibited in ed­u­ca­tion?

Eleni Angelou17 Jan 2023 0:42 UTC
6 points
2 comments1 min readLW link

Thoughts on the AI Safety Sum­mit com­pany policy re­quests and responses

So8res31 Oct 2023 23:54 UTC
169 points
14 comments10 min readLW link

Com­pute Gover­nance: The Role of Com­mod­ity Hardware

Jan26 Mar 2022 10:08 UTC
14 points
7 comments7 min readLW link
(universalprior.substack.com)

[Question] What could a policy ban­ning AGI look like?

TsviBT13 Mar 2024 14:19 UTC
65 points
21 comments3 min readLW link

Why don’t gov­ern­ments seem to mind that com­pa­nies are ex­plic­itly try­ing to make AGIs?

ozziegooen26 Dec 2021 1:58 UTC
34 points
3 comments2 min readLW link
(forum.effectivealtruism.org)

AI Gover­nance Needs Tech­ni­cal Work

Mau5 Sep 2022 22:28 UTC
41 points
1 comment8 min readLW link

AI as Su­per-Demagogue

RationalDino5 Nov 2023 21:21 UTC
−2 points
9 comments9 min readLW link

What Should AI Owe To Us? Ac­countable and Aligned AI Sys­tems via Con­trac­tu­al­ist AI Alignment

xuan8 Sep 2022 15:04 UTC
32 points
15 comments25 min readLW link

Scal­able And Trans­fer­able Black-Box Jailbreaks For Lan­guage Models Via Per­sona Modulation

7 Nov 2023 17:59 UTC
36 points
2 comments2 min readLW link
(arxiv.org)

How should Deep­Mind’s Chin­chilla re­vise our AI fore­casts?

Cleo Nardo15 Sep 2022 17:54 UTC
35 points
12 comments13 min readLW link

Up­date on the UK AI Sum­mit and the UK’s Plans

Elliot_Mckernon10 Nov 2023 14:47 UTC
11 points
0 comments8 min readLW link