Robust Agents

TagLast edit: Sep 14, 2020, 11:17 PM by Ruby

Robust Agents are decision-makers who can perform well in a variety of situations. Whereas some humans rely on folk wisdom or instinct, and some AIs might be designed to achieve a narrow set of goals, a Robust Agent has a coherent set of values and decision-procedures. This enables them to adapt to new circumstances (such as succeeding in a new environment, or responding to a new strategy by a competitor).

Being a Robust Agent

RaemonOct 18, 2018, 7:00 AM

152 points

32 comments7 min readLW link 2 reviews

Security Mindset and Ordinary Paranoia

Eliezer YudkowskyNov 25, 2017, 5:53 PM

132 points

25 comments29 min readLW link

Subagents, akrasia, and coherence in humans

Kaj_SotalaMar 25, 2019, 2:24 PM

141 points

31 comments16 min readLW link

On Being Robust

TurnTroutJan 10, 2020, 3:51 AM

45 points

7 comments2 min readLW link

Embedded Agency (full-text version)

Scott Garrabrant and abramdemski

Nov 15, 2018, 7:49 PM

210 points

17 comments54 min readLW link

Desiderata for an AI

Nathan Helm-BurgerJul 19, 2023, 4:18 PM

9 points

0 comments4 min readLW link

Gradations of Agency

Daniel KokotajloMay 23, 2022, 1:10 AM

41 points

6 comments5 min readLW link

Robust Agency for People and Organizations

RaemonJul 19, 2019, 1:18 AM

65 points

10 comments12 min readLW link

Upcoming stability of values

Stuart_ArmstrongMar 15, 2018, 11:36 AM

15 points

15 comments2 min readLW link

Humans are very reliable agents

alyssavanceJun 16, 2022, 10:02 PM

270 points

35 comments3 min readLW link

The Power of Agency

lukeprogMay 7, 2011, 1:38 AM

114 points

78 comments1 min readLW link

[Question] What if memes are common in highly capable minds?

Daniel KokotajloJul 30, 2020, 8:45 PM

40 points

13 comments2 min readLW link

Robust Delegation

abramdemski and Scott Garrabrant

Nov 4, 2018, 4:38 PM

116 points

10 comments1 min readLW link

A multi-disciplinary view on AI safety research

Roman LeventovFeb 8, 2023, 4:50 PM

46 points

4 comments26 min readLW link

[Aspiration-based designs] 2. Formal framework, basic algorithm

Jobst Heitzig, Simon Dima and Simon Fischer

Apr 28, 2024, 1:02 PM

18 points

2 comments16 min readLW link

We need a universal definition of ‘agency’ and related words

CstineSublimeJan 11, 2025, 3:22 AM

18 points

1 comment5 min readLW link

An angle of attack on Open Problem #1

BenyaAug 18, 2012, 12:08 PM

54 points

85 comments7 min readLW link

Metaphilosophical competence can’t be disentangled from alignment

zhukeepaApr 1, 2018, 12:38 AM

46 points

39 comments3 min readLW link

Thoughts on the 5-10 Problem

ToflyJul 18, 2019, 6:56 PM

18 points

11 comments1 min readLW link

Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning

Roman LeventovJan 12, 2023, 4:43 PM

17 points

2 comments2 min readLW link

(arxiv.org)

Robustness to Scale

Scott GarrabrantFeb 21, 2018, 10:55 PM

131 points

23 comments2 min readLW link 1 review

Sets of objectives for a multi-objective RL agent to optimize

Ben Smith and Roland Pihlakas

Nov 23, 2022, 6:49 AM

13 points

0 comments8 min readLW link

Reflection in Probabilistic Logic

Eliezer YudkowskyMar 24, 2013, 4:37 PM

112 points

168 comments3 min readLW link

Security Mindset and the Logistic Success Curve

Eliezer YudkowskyNov 26, 2017, 3:58 PM

106 points

49 comments20 min readLW link

Can we achieve AGI Alignment by balancing multiple human objectives?

Ben SmithJul 3, 2022, 2:51 AM

11 points

1 comment4 min readLW link

Temporally Layered Architecture for Adaptive, Distributed and Continuous Control

Roman LeventovFeb 2, 2023, 6:29 AM

6 points

4 comments1 min readLW link

(arxiv.org)

Automated monitoring systems

hiki_tNov 28, 2024, 6:54 PM

1 point

0 comments2 min readLW link

Even Superhuman Go AIs Have Surprising Failure Modes

AdamGleave, EuanMcLean, Tony Wang, Kellin Pelrine, Tom Tseng, Yawen Duan, Joseph Miller and MichaelDennis

Jul 20, 2023, 5:31 PM

130 points

22 comments10 min readLW link

(far.ai)

Tiling Agents for Self-Modifying AI (OPFAI #2)

Eliezer YudkowskyJun 6, 2013, 8:24 PM

88 points

259 comments3 min readLW link

Vingean Reflection: Reliable Reasoning for Self-Improving Agents

So8resJan 15, 2015, 10:47 PM

37 points

5 comments9 min readLW link

2-D Robustness

Vlad MikulikAug 30, 2019, 8:27 PM

85 points

8 comments2 min readLW link

AISC project: SatisfIA – AI that satisfies without overdoing it

Jobst HeitzigNov 11, 2023, 6:22 PM

12 points

0 comments1 min readLW link

(docs.google.com)

Beyond the Board: Exploring AI Robustness Through Go

AdamGleaveJun 19, 2024, 4:40 PM

41 points

2 comments1 min readLW link

(far.ai)

On agentic generalist models: we’re essentially using existing technology the weakest and worst way you can use it

Yuli_BanAug 28, 2024, 1:57 AM

10 points

2 comments9 min readLW link

Ruby Sep 14, 2020, 10:47 PM
2 points
0
I actually think this is definition is basically just what “agency” is.
- Ruby Sep 14, 2020, 11:17 PM
  2 points
  0
  Parent
  some AIs might be programmed with a narrow set of goals.
  I don’t think this precludes you from being an agent or should preclude you from being a “robust agent”.

Ro­bust Agents

See also

Robust Agents