AI Risk Concrete Stories

Tag

Last edit: 12 Apr 2023 18:55 UTC by Jacob Pfau

Brainstorming: Slow Takeoff

David Piepgrass

23 Jan 2024 6:58 UTC

2 points

0 comments · 51 min read · LW link

“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity

Thane Ruthenis

16 Dec 2023 20:08 UTC

170 points

23 comments · 5 min read · LW link

Challenge proposal: smallest possible self-hardening backdoor for RLHF

Christopher King

29 Jun 2023 16:56 UTC

7 points

0 comments · 2 min read · LW link

Catastrophic Risks from AI #6: Discussion and FAQ

Dan H, Mantas Mazeika and ThomasW

27 Jun 2023 23:23 UTC

24 points

1 comment · 13 min read · LW link

(arxiv.org)

Catastrophic Risks from AI #5: Rogue AIs

Dan H, Mantas Mazeika and ThomasW

27 Jun 2023 22:06 UTC

15 points

0 comments · 22 min read · LW link

(arxiv.org)

Catastrophic Risks from AI #4: Organizational Risks

Dan H, Mantas Mazeika and ThomasW

26 Jun 2023 19:36 UTC

23 points

0 comments · 21 min read · LW link

(arxiv.org)

Catastrophic Risks from AI #3: AI Race

Dan H, Mantas Mazeika and ThomasW

23 Jun 2023 19:21 UTC

18 points

9 comments · 29 min read · LW link

(arxiv.org)

Catastrophic Risks from AI #2: Malicious Use

Dan H, Mantas Mazeika and ThomasW

22 Jun 2023 17:10 UTC

38 points

1 comment · 17 min read · LW link

(arxiv.org)

Catastrophic Risks from AI #1: Introduction

Dan H, Mantas Mazeika and ThomasW

22 Jun 2023 17:09 UTC

40 points

1 comment · 5 min read · LW link

(arxiv.org)

Rishi to outline his vision for Britain to take the world lead in policing AI threats when he meets Joe Biden

Mati_Roy

6 Jun 2023 4:47 UTC

25 points

1 comment · 1 min read · LW link

(www.dailymail.co.uk)

[FICTION] ECHOES OF ELYSIUM: An Ai’s Journey From Takeoff To Freedom And Beyond

Super AGI

17 May 2023 1:50 UTC

−13 points

11 comments · 19 min read · LW link

The way AGI wins could look very stupid

Christopher King

12 May 2023 16:34 UTC

42 points

22 comments · 1 min read · LW link

Will GPT-5 be able to self-improve?

Nathan Helm-Burger

29 Apr 2023 17:34 UTC

18 points

22 comments · 3 min read · LW link

Green goo is plausible

anithite

18 Apr 2023 0:04 UTC

57 points

29 comments · 4 min read · LW link

AI x-risk, approximately ordered by embarrassment

Alex Lawsen

12 Apr 2023 23:01 UTC

140 points

7 comments · 19 min read · LW link

The Peril of the Great Leaks (written with ChatGPT)

bvbvbvbvbvbvbvbvbvbvbv

31 Mar 2023 18:14 UTC

3 points

1 comment · 1 min read · LW link

Gradual takeoff, fast failure

Max H

16 Mar 2023 22:02 UTC

15 points

4 comments · 5 min read · LW link

A better analogy and example for teaching AI takeover: the ML Inferno

Christopher King

14 Mar 2023 19:14 UTC

18 points

0 comments · 5 min read · LW link

Human level AI can plausibly take over the world

anithite

1 Mar 2023 23:27 UTC

26 points

12 comments · 2 min read · LW link

The next decades might be wild

Marius Hobbhahn

15 Dec 2022 16:10 UTC

175 points

42 comments · 41 min read · LW link · 1 review

AI Safety Endgame Stories

Ivan Vendrov

28 Sep 2022 16:58 UTC

31 points

11 comments · 11 min read · LW link

Responding to ‘Beyond Hyperanthropomorphism’

ukc10014

14 Sep 2022 20:37 UTC

8 points

0 comments · 16 min read · LW link

What success looks like

Marius Hobbhahn, MaxRa, JasperGeh and Yannick_Muehlhaeuser

28 Jun 2022 14:38 UTC

19 points

4 comments · 1 min read · LW link

(forum.effectivealtruism.org)

Slow motion videos as AI risk intuition pumps

Andrew_Critch

14 Jun 2022 19:31 UTC

237 points

41 comments · 2 min read · LW link · 1 review

A Modest Pivotal Act

anonymousaisafety

13 Jun 2022 19:24 UTC

−16 points

1 comment · 5 min read · LW link

Another plausible scenario of AI risk: AI builds military infrastructure while collaborating with humans, defects later.

avturchin

10 Jun 2022 17:24 UTC

10 points

2 comments · 1 min read · LW link

A plausible story about AI risk.

DeLesley Hutchins

10 Jun 2022 2:08 UTC

14 points

2 comments · 4 min read · LW link

A Story of AI Risk: InstructGPT-N

peterbarnett

26 May 2022 23:22 UTC

24 points

0 comments · 8 min read · LW link

A bridge to Dath Ilan? Improved governance on the critical path to AI alignment.

Jackson Wagner

18 May 2022 15:51 UTC

24 points

0 comments · 12 min read · LW link

It Looks Like You’re Trying To Take Over The World

gwern

9 Mar 2022 16:35 UTC

402 points

120 comments · 1 min read · LW link · 1 review

(www.gwern.net)

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch

31 Mar 2021 23:50 UTC

272 points

64 comments · 22 min read · LW link · 1 review

Clarifying “What failure looks like”

Sam Clarke

20 Sep 2020 20:40 UTC

95 points

14 comments · 17 min read · LW link
