The Gears of Impact

Schedul­ing: The re­main­der of the se­quence will be re­leased af­ter some de­lay.

Ex­er­cise: Why does in­stru­men­tal con­ver­gence hap­pen? Would it be co­her­ent to imag­ine a re­al­ity with­out it?


  • Here, our de­scrip­tive the­ory re­lies on our abil­ity to have rea­son­able be­liefs about what we’ll do, and how things in the world will af­fect our later de­ci­sion-mak­ing pro­cess. No one knows how to for­mal­ize that kind of rea­son­ing, so I’m leav­ing it a black box: we some­how have these rea­son­able be­liefs which are ap­par­ently used to calcu­late AU.

  • In tech­ni­cal terms, AU calcu­lated with the “could” crite­rion would be closer to an op­ti­mal value func­tion, while ac­tual AU seems to be an on-policy pre­dic­tion, what­ever that means in the em­bed­ded con­text. Felt im­pact cor­re­sponds to TD er­ror.

    • This is one ma­jor rea­son I’m dis­am­biguat­ing be­tween AU and EU; in the non-em­bed­ded con­text. In re­in­force­ment learn­ing, AU is a very par­tic­u­lar kind of EU: , the ex­pected re­turn un­der the op­ti­mal policy.

  • Framed as a kind of EU, we plau­si­bly use AU to make de­ci­sions.

  • I’m not claiming nor­ma­tively that “em­bed­ded agen­tic” EU should be AU; I’m sim­ply us­ing “em­bed­ded agen­tic” as an ad­jec­tive.