The quote specifically mentioned the type signature of A’s goal component. The idea I took from it was that B is doing something like
```
utility = compare(world_state, goal)
```
and A is doing something like
```
utility = compare(world_state, goal(world_state))
```
where the second form can be unstable in ways that the first can't, because the goal itself shifts whenever the world state does.
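As a toy sketch of the difference (everything here is hypothetical and not from the quote: the quadratic `compare`, the `goal_A` function, and the naive `hill_climb` optimizer are all my own illustrative choices): B's fixed goal gives local search a usable gradient, while a state-dependent goal can chase the state and leave the optimizer with no signal at all.

```python
def compare(world_state, goal):
    # Higher utility when the world state is closer to the goal.
    return -(world_state - goal) ** 2

FIXED_GOAL = 3.0

def utility_B(world_state):
    # B's form: the goal is a constant.
    return compare(world_state, FIXED_GOAL)

def goal_A(world_state):
    # Hypothetical state-dependent goal: always wants "a bit more"
    # than wherever the world currently is.
    return world_state + 1.0

def utility_A(world_state):
    # A's form: the goal is recomputed from the world state.
    return compare(world_state, goal_A(world_state))

def hill_climb(utility, state, step=0.1, iters=100, eps=1e-3):
    # Naive local search: nudge the state in whichever direction
    # raises utility; stop moving when neither direction helps.
    for _ in range(iters):
        if utility(state + eps) > utility(state):
            state += step
        elif utility(state - eps) > utility(state):
            state -= step
    return state

# B converges to its fixed goal.
print(hill_climb(utility_B, 0.0))  # ~3.0

# A's goal moves with the state, so utility_A is constant (-1 everywhere):
# the "goal" is never reached and local search gets no gradient.
print(utility_A(0.0) == utility_A(10.0))  # True
```

This is only one failure mode (a moving target with no fixed point); other choices of `goal(world_state)` could instead give oscillation or sensitivity to small perturbations, which is the broader instability the distinction points at.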
...But maybe talking about the type signature of components is cheating? We probably don’t want to assume that all relevant agents will have something we can consider a “goal component”, though conceivably a selection theorem might let us do that in some situations.