
Power Seeking (AI)

Tag. Last edit: 24 Oct 2022 22:49 UTC by Raemon

Power seeking is a property that agents might have: they attempt to gain more general ability to control their environment. It is particularly relevant to AIs and is related to Instrumental Convergence.

Power-seeking for successive choices

adamShimi · 12 Aug 2021 20:37 UTC
11 points
9 comments · 4 min read · LW link

Power-Seeking AI and Existential Risk

Antonio Franca · 11 Oct 2022 22:50 UTC
5 points
0 comments · 9 min read · LW link

Generalizing the Power-Seeking Theorems

TurnTrout · 27 Jul 2020 0:28 UTC
41 points
6 comments · 4 min read · LW link

Reviews of “Is power-seeking AI an existential risk?”

Joe Carlsmith · 16 Dec 2021 20:48 UTC
77 points
20 comments · 1 min read · LW link

Eli’s review of “Is power-seeking AI an existential risk?”

elifland · 30 Sep 2022 12:21 UTC
67 points
0 comments · 3 min read · LW link
(docs.google.com)

[AN #170]: Analyzing the argument for risk from power-seeking AI

Rohin Shah · 8 Dec 2021 18:10 UTC
21 points
1 comment · 7 min read · LW link
(mailchi.mp)

POWERplay: An open-source toolchain to study AI power-seeking

Edouard Harris · 24 Oct 2022 20:03 UTC
22 points
0 comments · 1 min read · LW link
(github.com)

Parametrically retargetable decision-makers tend to seek power

TurnTrout · 18 Feb 2023 18:41 UTC
146 points
6 comments · 2 min read · LW link
(arxiv.org)

Power-seeking can be probable and predictive for trained agents

28 Feb 2023 21:10 UTC
33 points
1 comment · 8 min read · LW link

[Linkpost] Shorter version of report on existential risk from power-seeking AI

Joe Carlsmith · 22 Mar 2023 18:09 UTC
7 points
0 comments · 1 min read · LW link

Questions about Value Lock-in, Paternalism, and Empowerment

Sam · 16 Nov 2022 15:33 UTC
12 points
2 comments · 12 min read · LW link
(sambrown.eu)

Avoiding Psychopathic AI

Cameron Berg · 19 Dec 2022 17:01 UTC
28 points
3 comments · 20 min read · LW link

Simple Way to Prevent Power-Seeking AI

research_prime_space · 7 Dec 2022 0:26 UTC
7 points
1 comment · 1 min read · LW link

Power-Seeking = Minimising free energy

Jonas Hallgren · 22 Feb 2023 4:28 UTC
19 points
4 comments · 7 min read · LW link

The Waluigi Effect (mega-post)

Cleo Nardo · 3 Mar 2023 3:22 UTC
586 points
175 comments · 16 min read · LW link