
Instrumental Convergence

Instrumental convergence, or convergent instrumental values, is the theorized tendency for most sufficiently intelligent agents to pursue potentially unbounded instrumental goals such as self-preservation and resource acquisition [1].

Seeking Power is Often Provably Instrumentally Convergent in MDPs

5 Dec 2019 2:33 UTC
114 points
25 comments, 11 min read, LW link
(arxiv.org)

Corrigibility

paulfchristiano
27 Nov 2018 21:50 UTC
40 points
3 comments, 6 min read, LW link

AI prediction case study 5: Omohundro’s AI drives

Stuart_Armstrong
15 Mar 2013 9:09 UTC
5 points
5 comments, 8 min read, LW link

General purpose intelligence: arguing the Orthogonality thesis

Stuart_Armstrong
15 May 2012 10:23 UTC
24 points
156 comments, 18 min read, LW link

Toy model: convergent instrumental goals

Stuart_Armstrong
25 Feb 2016 14:03 UTC
8 points
2 comments, 4 min read, LW link

Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More

Ben Pace
4 Oct 2019 4:08 UTC
186 points
49 comments, 15 min read, LW link

Goal retention discussion with Eliezer

MaxTegmark
4 Sep 2014 22:23 UTC
61 points
26 comments, 6 min read, LW link

Generalizing the Power-Seeking Theorems

TurnTrout
27 Jul 2020 0:28 UTC
39 points
2 comments, 6 min read, LW link

The Catastrophic Convergence Conjecture

TurnTrout
14 Feb 2020 21:16 UTC
40 points
13 comments, 8 min read, LW link

Power as Easily Exploitable Opportunities

TurnTrout
1 Aug 2020 2:14 UTC
26 points
5 comments, 6 min read, LW link

Clarifying Power-Seeking and Instrumental Convergence

TurnTrout
20 Dec 2019 19:59 UTC
42 points
7 comments, 3 min read, LW link

A Gym Gridworld Environment for the Treacherous Turn

Michaël Trazzi
28 Jul 2018 21:27 UTC
69 points
9 comments, 3 min read, LW link
(github.com)

Rationality: Common Interest of Many Causes

Eliezer Yudkowsky
29 Mar 2009 10:49 UTC
48 points
52 comments, 4 min read, LW link

Plausibly, almost every powerful algorithm would be manipulative

Stuart_Armstrong
6 Feb 2020 11:50 UTC
41 points
25 comments, 3 min read, LW link

Asymptotically Unambitious AGI

michaelcohen
6 Mar 2019 1:15 UTC
40 points
216 comments, 2 min read, LW link

The Utility of Human Atoms for the Paperclip Maximizer

avturchin
2 Feb 2018 10:06 UTC
8 points
19 comments, 3 min read, LW link