AI Success Models

Tag. Last edit: 17 Nov 2021 23:17 UTC by plex

AI Success Models are proposed paths to an existential win via aligned AI. So far they are high-level overviews that leave out many details, but each presents at least a sketch of what a full solution might look like. They can be contrasted with threat models, which are stories about how AI might lead to major problems.

A positive case for how we might succeed at prosaic AI alignment

evhub, 16 Nov 2021 1:49 UTC
84 points
45 comments, 6 min read, LW link

An overview of 11 proposals for building safe advanced AI

evhub, 29 May 2020 20:38 UTC
181 points
34 comments, 38 min read, LW link, 2 reviews

Solving the whole AGI control problem, version 0.0001

Steven Byrnes, 8 Apr 2021 15:14 UTC
55 points
7 comments, 26 min read, LW link

Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios

Evan R. Murphy, 12 May 2022 20:01 UTC
38 points
0 comments, 59 min read, LW link

Various Alignment Strategies (and how likely they are to work)

Logan Zoellner, 3 May 2022 16:54 UTC
66 points
34 comments, 11 min read, LW link

AI Safety “Success Stories”

Wei_Dai, 7 Sep 2019 2:54 UTC
106 points
27 comments, 4 min read, LW link, 1 review

[Question] If AGI were coming in a year, what should we do?

MichaelStJules, 1 Apr 2022 0:41 UTC
20 points
16 comments, 1 min read, LW link

An AI-in-a-box success model

azsantosk, 11 Apr 2022 22:28 UTC
16 points
1 comment, 10 min read, LW link

How Might an Alignment Attractor Look like?

shminux, 28 Apr 2022 6:46 UTC
47 points
15 comments, 2 min read, LW link

Introduction to the sequence: Interpretability Research for the Most Important Century

Evan R. Murphy, 12 May 2022 19:59 UTC
16 points
0 comments, 8 min read, LW link