Toying With Goal-Directedness

We (Adam Shimi, Joe Collman, Michele Campolo and Sabrina Tang) are studying both how to formalize the intuitions behind goal-directedness and how it is relevant to AI Safety. This sequence collects the less polished posts, which represent the ideas of their individual authors more than a consensus within the group.

Goal-directed = Model-based RL?

Focus: you are allowed to be bad at accomplishing your goals

Goal-directedness is behavioral, not structural

Locality of goals

Goals and short descriptions

Goal-Directedness: What Success Looks Like