AI Risk Concrete Stories

Tag. Last edit: 12 Apr 2023 18:55 UTC by Jacob Pfau

See also Threat Models

Brainstorming: Slow Takeoff

David Piepgrass, 23 Jan 2024 6:58 UTC
2 points
0 comments, 51 min read, LW link

“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity

Thane Ruthenis, 16 Dec 2023 20:08 UTC
170 points
23 comments, 5 min read, LW link

Challenge proposal: smallest possible self-hardening backdoor for RLHF

Christopher King, 29 Jun 2023 16:56 UTC
7 points
0 comments, 2 min read, LW link

Catastrophic Risks from AI #6: Discussion and FAQ

27 Jun 2023 23:23 UTC
24 points
1 comment, 13 min read, LW link
(arxiv.org)

Catastrophic Risks from AI #5: Rogue AIs

27 Jun 2023 22:06 UTC
15 points
0 comments, 22 min read, LW link
(arxiv.org)

Catastrophic Risks from AI #4: Organizational Risks

26 Jun 2023 19:36 UTC
23 points
0 comments, 21 min read, LW link
(arxiv.org)

Catastrophic Risks from AI #3: AI Race

23 Jun 2023 19:21 UTC
18 points
9 comments, 29 min read, LW link
(arxiv.org)

Catastrophic Risks from AI #2: Malicious Use

22 Jun 2023 17:10 UTC
38 points
1 comment, 17 min read, LW link
(arxiv.org)

Catastrophic Risks from AI #1: Introduction

22 Jun 2023 17:09 UTC
40 points
1 comment, 5 min read, LW link
(arxiv.org)

Rishi to outline his vision for Britain to take the world lead in policing AI threats when he meets Joe Biden

Mati_Roy, 6 Jun 2023 4:47 UTC
25 points
1 comment, 1 min read, LW link
(www.dailymail.co.uk)

[FICTION] ECHOES OF ELYSIUM: An Ai’s Journey From Takeoff To Freedom And Beyond

Super AGI, 17 May 2023 1:50 UTC
−13 points
11 comments, 19 min read, LW link

The way AGI wins could look very stupid

Christopher King, 12 May 2023 16:34 UTC
42 points
22 comments, 1 min read, LW link

Will GPT-5 be able to self-improve?

Nathan Helm-Burger, 29 Apr 2023 17:34 UTC
18 points
22 comments, 3 min read, LW link

Green goo is plausible

anithite, 18 Apr 2023 0:04 UTC
57 points
29 comments, 4 min read, LW link

AI x-risk, approximately ordered by embarrassment

Alex Lawsen, 12 Apr 2023 23:01 UTC
140 points
7 comments, 19 min read, LW link

The Peril of the Great Leaks (written with ChatGPT)

bvbvbvbvbvbvbvbvbvbvbv, 31 Mar 2023 18:14 UTC
3 points
1 comment, 1 min read, LW link

Gradual takeoff, fast failure

Max H, 16 Mar 2023 22:02 UTC
15 points
4 comments, 5 min read, LW link

A better analogy and example for teaching AI takeover: the ML Inferno

Christopher King, 14 Mar 2023 19:14 UTC
18 points
0 comments, 5 min read, LW link

Human level AI can plausibly take over the world

anithite, 1 Mar 2023 23:27 UTC
26 points
12 comments, 2 min read, LW link

The next decades might be wild

Marius Hobbhahn, 15 Dec 2022 16:10 UTC
175 points
42 comments, 41 min read, LW link, 1 review

AI Safety Endgame Stories

Ivan Vendrov, 28 Sep 2022 16:58 UTC
31 points
11 comments, 11 min read, LW link

Responding to ‘Beyond Hyperanthropomorphism’

ukc10014, 14 Sep 2022 20:37 UTC
8 points
0 comments, 16 min read, LW link

What success looks like

28 Jun 2022 14:38 UTC
19 points
4 comments, 1 min read, LW link
(forum.effectivealtruism.org)

Slow motion videos as AI risk intuition pumps

Andrew_Critch, 14 Jun 2022 19:31 UTC
237 points
41 comments, 2 min read, LW link, 1 review

A Modest Pivotal Act

anonymousaisafety, 13 Jun 2022 19:24 UTC
−16 points
1 comment, 5 min read, LW link

Another plausible scenario of AI risk: AI builds military infrastructure while collaborating with humans, defects later.

avturchin, 10 Jun 2022 17:24 UTC
10 points
2 comments, 1 min read, LW link

A plausible story about AI risk.

DeLesley Hutchins, 10 Jun 2022 2:08 UTC
14 points
2 comments, 4 min read, LW link

A Story of AI Risk: InstructGPT-N

peterbarnett, 26 May 2022 23:22 UTC
24 points
0 comments, 8 min read, LW link

A bridge to Dath Ilan? Improved governance on the critical path to AI alignment.

Jackson Wagner, 18 May 2022 15:51 UTC
24 points
0 comments, 12 min read, LW link

It Looks Like You’re Trying To Take Over The World

gwern, 9 Mar 2022 16:35 UTC
402 points
120 comments, 1 min read, LW link, 1 review
(www.gwern.net)

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch, 31 Mar 2021 23:50 UTC
272 points
64 comments, 22 min read, LW link, 1 review

Clarifying “What failure looks like”

Sam Clarke, 20 Sep 2020 20:40 UTC
95 points
14 comments, 17 min read, LW link