AI Success Models

Last edit: Nov 17, 2021, 11:17 PM by plex

AI Success Models are proposed paths to an existential win via aligned AI. They are (so far) high-level overviews that don't contain all the details, but they present at least a sketch of what a full solution might look like. They can be contrasted with threat models, which are stories about how AI might lead to major problems.

Solving the whole AGI control problem, version 0.0001

Steven Byrnes · Apr 8, 2021, 3:14 PM
63 points

7 comments · 26 min read · LW link

An overview of 11 proposals for building safe advanced AI

evhub · May 29, 2020, 8:38 PM
220 points

37 comments · 38 min read · LW link · 2 reviews

A positive case for how we might succeed at prosaic AI alignment

evhub · Nov 16, 2021, 1:49 AM
81 points

46 comments · 6 min read · LW link

Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios

Evan R. Murphy · May 12, 2022, 8:01 PM
58 points

0 comments · 59 min read · LW link

Conversation with Eliezer: What do you want the system to do?

Orpheus16 · Jun 25, 2022, 5:36 PM
114 points

38 comments · 2 min read · LW link

[Question] Any further work on AI Safety Success Stories?

Krieger · Oct 2, 2022, 9:53 AM
8 points

6 comments · 1 min read · LW link

Four visions of Transformative AI success

Steven Byrnes · Jan 17, 2024, 8:45 PM
112 points

22 comments · 15 min read · LW link

Gradient Descent on the Human Brain

Apr 1, 2024, 10:39 PM
59 points

5 comments · 2 min read · LW link

How Would an Utopia-Maximizer Look Like?

Thane Ruthenis · Dec 20, 2023, 8:01 PM
32 points

23 comments · 10 min read · LW link

Conditioning Generative Models for Alignment

Jozdien · Jul 18, 2022, 7:11 AM
60 points

8 comments · 20 min read · LW link

An Open Agency Architecture for Safe Transformative AI

davidad · Dec 20, 2022, 1:04 PM
80 points

22 comments · 4 min read · LW link

Worrisome misunderstanding of the core issues with AI transition

Roman Leventov · Jan 18, 2024, 10:05 AM
5 points

2 comments · 4 min read · LW link

Against blanket arguments against interpretability

Dmitry Vaintrob · Jan 22, 2025, 9:46 AM
52 points

4 comments · 7 min read · LW link

Success without dignity: a nearcasting story of avoiding catastrophe by luck

HoldenKarnofsky · Mar 14, 2023, 7:23 PM
85 points

17 comments · 15 min read · LW link

AI Safety “Success Stories”

Wei Dai · Sep 7, 2019, 2:54 AM
128 points

27 comments · 4 min read · LW link · 1 review

Various Alignment Strategies (and how likely they are to work)

Logan Zoellner · May 3, 2022, 4:54 PM
85 points

34 comments · 11 min read · LW link

Gaia Network: a practical, incremental pathway to Open Agency Architecture

Dec 20, 2023, 5:11 PM
22 points

8 comments · 16 min read · LW link

Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad · Jun 21, 2022, 12:36 PM
13 points

7 comments · 9 min read · LW link

Making it harder for an AGI to “trick” us, with STVs

Tor Økland Barstad · Jul 9, 2022, 2:42 PM
15 points

5 comments · 22 min read · LW link

AI Safety Endgame Stories

Ivan Vendrov · Sep 28, 2022, 4:58 PM
31 points

11 comments · 11 min read · LW link

An AI-in-a-box success model

azsantosk · Apr 11, 2022, 10:28 PM
16 points

1 comment · 10 min read · LW link

Gaia Network: An Illustrated Primer

Jan 18, 2024, 6:23 PM
3 points

2 comments · 15 min read · LW link

Acceptability Verification: A Research Agenda

Jul 12, 2022, 8:11 PM
50 points

0 comments · 1 min read · LW link
(docs.google.com)

AI Safety via Luck

Jozdien · Apr 1, 2023, 8:13 PM
82 points

7 comments · 11 min read · LW link

[Question] If AGI were coming in a year, what should we do?

MichaelStJules · Apr 1, 2022, 12:41 AM
20 points

16 comments · 1 min read · LW link

Alignment with argument-networks and assessment-predictions

Tor Økland Barstad · Dec 13, 2022, 2:17 AM
10 points

5 comments · 45 min read · LW link

Possible miracles

Oct 9, 2022, 6:17 PM
64 points

34 comments · 8 min read · LW link

How Might an Alignment Attractor Look like?

Shmi · Apr 28, 2022, 6:46 AM
47 points

15 comments · 2 min read · LW link

What success looks like

Jun 28, 2022, 2:38 PM
19 points

4 comments · 1 min read · LW link
(forum.effectivealtruism.org)

[Question] What Does AI Alignment Success Look Like?

Shmi · Oct 20, 2022, 12:32 AM
23 points

7 comments · 1 min read · LW link

Introduction to the sequence: Interpretability Research for the Most Important Century

Evan R. Murphy · May 12, 2022, 7:59 PM
16 points

0 comments · 8 min read · LW link

Towards Hodge-podge Alignment

Cleo Nardo · Dec 19, 2022, 8:12 PM
95 points

30 comments · 9 min read · LW link