
Goal-Directedness

Last edit: 30 Nov 2021 23:53 UTC by Jon Garcia

Goal-directedness is the property of a system that aims at some goal. The concept is still in need of formalization, but it may prove important in deciding which kind of AI to try to align.

A goal may be defined as a world-state that an agent tries to achieve. Goal-directed agents may generate internal representations of desired end states, compare them against their internal representation of the current state of the world, and formulate plans for navigating from the latter to the former.
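As a toy illustration of this explicit pattern, here is a minimal sketch (all names and the 1-D world are hypothetical, not drawn from any particular framework) of an agent that holds an internal goal representation, compares it against the current state, and formulates a plan to close the gap:

```python
# Minimal sketch of an explicitly goal-directed agent in a 1-D world.
# The world-state is an integer position; the goal is a desired position.

def plan(current, goal):
    """Compare the internal goal representation against the current
    world-state and emit a sequence of actions that closes the gap."""
    step = "right" if goal > current else "left"
    return [step] * abs(goal - current)

def act(state, action):
    """World dynamics: each action shifts the state by one."""
    return state + 1 if action == "right" else state - 1

state, goal = 2, 7
for action in plan(state, goal):
    state = act(state, action)

assert state == goal  # the plan navigates from the current state to the goal state
```

The essential feature is the explicit comparison between a represented desired end state and the represented current state; everything else in the sketch is incidental.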

The goal-generating function may be derived from a pre-programmed lookup table (for simple worlds), obtained by directly inverting the agent’s utility function (for simple utility functions), or learned from experience by mapping states to rewards and predicting which states will yield the largest rewards. The plan-generating algorithm could range from shortest-path algorithms such as A* or Dijkstra’s (for fully representable world graphs), to policy functions that learn through reinforcement learning which actions bring the current state closer to the goal state (for simple AI), to some combination or extrapolation of these (for more advanced AI).
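For the fully-representable-world-graph case, plan generation really can be ordinary shortest-path search. The sketch below (the graph and state names are illustrative) uses Dijkstra's algorithm; A* would add a heuristic term to the priority:

```python
# Sketch: when the world graph is small enough to represent explicitly,
# the plan-generating algorithm can be plain shortest-path search.
import heapq

def dijkstra(graph, start, goal):
    """Return (cost, path) for the cheapest route from start to goal."""
    frontier = [(0, start, [start])]   # priority queue ordered by path cost
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(frontier, (cost + edge_cost, neighbor, path + [neighbor]))
    return float("inf"), []            # goal unreachable

# A toy world graph: each edge is (neighbor, action cost).
world = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
}
cost, path = dijkstra(world, "A", "D")
# cost == 3, path == ["A", "B", "C", "D"]
```

Here the "plan" is just the returned path: a sequence of states the agent intends to move through on its way to the goal state.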

Implicit goal-directedness may come about in agents that do not have explicit internal representations of goals but that nevertheless learn or enact policies that cause the environment to converge on a certain state or set of states. Such implicit goal-directedness may arise, for instance, in simple reinforcement learning agents, which learn a policy function that maps states directly to actions.
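A tabular Q-learner makes the implicit case concrete: the sketch below (the chain environment and all parameters are illustrative) never represents a goal, only a state-to-action mapping, yet its learned policy reliably drives the environment toward the rewarded state:

```python
# Sketch: implicit goal-directedness in a simple RL agent. The learner
# has no explicit goal representation, only Q-values over (state, action),
# yet its greedy policy steers a 5-state chain toward the rewarded state.
import random

random.seed(0)
N_STATES, GOAL = 5, 4                       # reward only for entering state 4
q = {(s, a): 0.0 for s in range(N_STATES) for a in (-1, 1)}

for _ in range(500):                        # learn from random exploration
    s = 0
    for _ in range(20):
        a = random.choice((-1, 1))          # purely random behavior policy
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        target = r if s2 == GOAL else r + 0.9 * max(q[(s2, -1)], q[(s2, 1)])
        q[(s, a)] += 0.5 * (target - q[(s, a)])
        s = s2
        if s == GOAL:                       # treat the rewarded state as terminal
            break

# The greedy policy now moves right from every non-goal state, "aiming at"
# state 4 without ever having represented it as a goal.
policy = {s: max((-1, 1), key=lambda a: q[(s, a)]) for s in range(N_STATES)}
assert all(policy[s] == 1 for s in range(GOAL))
```

An observer watching this policy would naturally describe the agent as pursuing state 4, even though nothing in its internals compares a desired end state to the current one.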

Literature Review on Goal-Directedness

18 Jan 2021 11:15 UTC
61 points
21 comments · 31 min read · LW link

Coherence arguments do not entail goal-directed behavior

Rohin Shah · 3 Dec 2018 3:26 UTC
94 points
69 comments · 7 min read · LW link · 3 reviews

Behavioral Sufficient Statistics for Goal-Directedness

adamShimi · 11 Mar 2021 15:01 UTC
21 points
12 comments · 9 min read · LW link

Intuitions about goal-directed behavior

Rohin Shah · 1 Dec 2018 4:25 UTC
50 points
15 comments · 6 min read · LW link

Will humans build goal-directed agents?

Rohin Shah · 5 Jan 2019 1:33 UTC
51 points
43 comments · 5 min read · LW link

AI safety without goal-directed behavior

Rohin Shah · 7 Jan 2019 7:48 UTC
63 points
15 comments · 4 min read · LW link

Goal-directed = Model-based RL?

adamShimi · 20 Feb 2020 19:13 UTC
21 points
10 comments · 3 min read · LW link

Focus: you are allowed to be bad at accomplishing your goals

adamShimi · 3 Jun 2020 21:04 UTC
19 points
17 comments · 3 min read · LW link

Goal-directedness is behavioral, not structural

adamShimi · 8 Jun 2020 23:05 UTC
6 points
12 comments · 3 min read · LW link

Locality of goals

adamShimi · 22 Jun 2020 21:56 UTC
16 points
8 comments · 6 min read · LW link

Goals and short descriptions

Michele Campolo · 2 Jul 2020 17:41 UTC
14 points
8 comments · 5 min read · LW link

Goal-Directedness: What Success Looks Like

adamShimi · 16 Aug 2020 18:33 UTC
9 points
0 comments · 2 min read · LW link

Goal-Directedness and Behavior, Redux

adamShimi · 9 Aug 2021 14:26 UTC
12 points
4 comments · 2 min read · LW link

P₂B: Plan to P₂B Better

24 Oct 2021 15:21 UTC
26 points
14 comments · 6 min read · LW link

Against the Backward Approach to Goal-Directedness

adamShimi · 19 Jan 2021 18:46 UTC
19 points
6 comments · 4 min read · LW link

Towards a Mechanistic Understanding of Goal-Directedness

Mark Xu · 9 Mar 2021 20:17 UTC
45 points
1 comment · 5 min read · LW link

Value loading in the human brain: a worked example

Steven Byrnes · 4 Aug 2021 17:20 UTC
43 points
2 comments · 8 min read · LW link

When Most VNM-Coherent Preference Orderings Have Convergent Instrumental Incentives

TurnTrout · 9 Aug 2021 17:22 UTC
52 points
4 comments · 5 min read · LW link

Applications for Deconfusing Goal-Directedness

adamShimi · 8 Aug 2021 13:05 UTC
36 points
0 comments · 5 min read · LW link

A review of “Agents and Devices”

adamShimi · 13 Aug 2021 8:42 UTC
9 points
0 comments · 4 min read · LW link

Optimization Concepts in the Game of Life

16 Oct 2021 20:51 UTC
66 points
14 comments · 10 min read · LW link

Goal-directedness: my baseline beliefs

Morgan_Rogers · 8 Jan 2022 13:09 UTC
20 points
3 comments · 3 min read · LW link

Goal-directedness: exploring explanations

Morgan_Rogers · 14 Feb 2022 16:20 UTC
12 points
3 comments · 18 min read · LW link

Goal-directedness: imperfect reasoning, limited knowledge and inaccurate beliefs

Morgan_Rogers · 19 Mar 2022 17:28 UTC
4 points
1 comment · 21 min read · LW link

[Question] why assume AGIs will optimize for fixed goals?

nostalgebraist · 10 Jun 2022 1:28 UTC
91 points
52 comments · 4 min read · LW link

wrapper-minds are the enemy

nostalgebraist · 17 Jun 2022 1:58 UTC
70 points
26 comments · 8 min read · LW link

Superintelligence 15: Oracles, genies and sovereigns

KatjaGrace · 23 Dec 2014 2:01 UTC
12 points
30 comments · 7 min read · LW link

Discussion: Objective Robustness and Inner Alignment Terminology

23 Jun 2021 23:25 UTC
67 points
6 comments · 9 min read · LW link

Empirical Observations of Objective Robustness Failures

23 Jun 2021 23:23 UTC
63 points
5 comments · 9 min read · LW link

Grokking the Intentional Stance

jbkjr · 31 Aug 2021 15:49 UTC
40 points
20 comments · 20 min read · LW link

[Question] Does Agent-like Behavior Imply Agent-like Architecture?

Scott Garrabrant · 23 Aug 2019 2:01 UTC
49 points
7 comments · 1 min read · LW link

Framing approaches to alignment and the hard problem of AI cognition

ryan_greenblatt · 15 Dec 2021 19:06 UTC
6 points
15 comments · 27 min read · LW link

Breaking Down Goal-Directed Behaviour

Oliver Sourbut · 16 Jun 2022 18:45 UTC
3 points
1 comment · 2 min read · LW link

Deliberation, Reactions, and Control: Tentative Definitions and a Restatement of Instrumental Convergence

Oliver Sourbut · 27 Jun 2022 17:25 UTC
6 points
0 comments · 11 min read · LW link

Deliberation Everywhere: Simple Examples

Oliver Sourbut · 27 Jun 2022 17:26 UTC
12 points
0 comments · 15 min read · LW link