Threat Models

Tag. Last edit: 20 Apr 2021 21:57 UTC by Quinn

A threat model is a story of how a particular risk (e.g. AI) plays out.

In the AI case, according to Rohin Shah, a threat model is ideally:

Combination of a development model that says how we get AGI and a risk model that says how AGI leads to existential catastrophe.

Another (outer) alignment failure story

paulfchristiano, 7 Apr 2021 20:12 UTC
189 points
37 comments, 12 min read, LW link

What failure looks like

paulfchristiano, 17 Mar 2019 20:18 UTC
260 points
48 comments, 8 min read, LW link, 2 nominations, 2 reviews

Distinguishing AI takeover scenarios

8 Sep 2021 16:19 UTC
60 points
9 comments, 14 min read, LW link

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch, 31 Mar 2021 23:50 UTC
150 points
59 comments, 22 min read, LW link

Vignettes Workshop (AI Impacts)

Daniel Kokotajlo, 15 Jun 2021 12:05 UTC
47 points
3 comments, 1 min read, LW link

Survey on AI existential risk scenarios

8 Jun 2021 17:12 UTC
55 points
10 comments, 7 min read, LW link

Investigating AI Takeover Scenarios

Sammy Martin, 17 Sep 2021 18:47 UTC
26 points
1 comment, 27 min read, LW link

Rogue AGI Embodies Valuable Intellectual Property

3 Jun 2021 20:37 UTC
69 points
9 comments, 3 min read, LW link

My AGI Threat Model: Misaligned Model-Based RL Agent

Steven Byrnes, 25 Mar 2021 13:45 UTC
62 points
40 comments, 16 min read, LW link

Less Realistic Tales of Doom

Mark Xu, 6 May 2021 23:01 UTC
99 points
13 comments, 4 min read, LW link

What Failure Looks Like: Distilling the Discussion

Ben Pace, 29 Jul 2020 21:49 UTC
74 points
12 comments, 7 min read, LW link