Threat Models

Tag. Last edit: 20 Apr 2021 21:57 UTC by Quinn

A threat model is a story of how a particular risk (e.g. risk from AI) plays out.

In the AI case, according to Rohin Shah, a threat model is ideally:

Combination of a development model that says how we get AGI and a risk model that says how AGI leads to existential catastrophe.

Another (outer) alignment failure story

paulfchristiano, 7 Apr 2021 20:12 UTC
158 points
32 comments, 12 min read

What failure looks like

paulfchristiano, 17 Mar 2019 20:18 UTC
241 points
48 comments, 8 min read, 2 nominations, 2 reviews

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch, 31 Mar 2021 23:50 UTC
139 points
53 comments, 22 min read

My AGI Threat Model: Misaligned Model-Based RL Agent

Steven Byrnes, 25 Mar 2021 13:45 UTC
62 points
37 comments, 16 min read

Less Realistic Tales of Doom

Mark Xu, 6 May 2021 23:01 UTC
92 points
9 comments, 4 min read
