I didn’t mean to say “internal” memory. E.g. you can start a TM head, which is a finite automaton, on an empty tape. So how is that different from starting a finite automaton in a world where it can interact with something that might as well be a tape, and getting a TM analogue from the combination of agent + world?
So it has only observations from the environment, with no action lever to pull? Or are the actions internal, without being relayed to the environment?
(I did not read your whole post sorry)
Two quick clarifications that I should state here more explicitly:
- An FSRA bounds the agent’s internal memory: it has a finite state set Q and fixed maps δ, λ. The agent emits an action symbol for each observation symbol; actions are part of the model (alphabet A).
- What I did not assume (unless explicitly stated) is any particular model of the environment’s memory or persistence. If the environment can record the agent’s actions and later reveal them back as observations (i.e. the environment is effectively a persistent writable tape), then an FSRA interacting with that environment can indeed implement behavior that a standalone finite automaton cannot: the joint agent+environment system can be arbitrarily powerful.
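To make the second point concrete, here is a minimal Python sketch (an illustration, not code from the post). It assumes an action alphabet of (symbol to write, head move) pairs and an environment that shows the agent the symbol under the head; the names `RULES`, `TapeWorld` and `run` are hypothetical. The agent has five states and fixed maps δ and λ, yet together with the tape it decides {aⁿbⁿ}, which no standalone finite automaton can.

```python
BLANK = "_"

# Joint table: (state, observation) -> (next state, (symbol to write, head move)).
# Assumption for this sketch: an action is a (write, move) pair.
RULES = {
    ("q0", "a"): ("q1", ("X", +1)),      # mark one 'a', go look for a 'b'
    ("q0", "Y"): ("q3", ("Y", +1)),      # no unmarked 'a' left, check the tail
    ("q0", BLANK): ("acc", (BLANK, 0)),  # empty word
    ("q1", "a"): ("q1", ("a", +1)),
    ("q1", "Y"): ("q1", ("Y", +1)),
    ("q1", "b"): ("q2", ("Y", -1)),      # mark the matching 'b', head back left
    ("q2", "a"): ("q2", ("a", -1)),
    ("q2", "Y"): ("q2", ("Y", -1)),
    ("q2", "X"): ("q0", ("X", +1)),
    ("q3", "Y"): ("q3", ("Y", +1)),
    ("q3", BLANK): ("acc", (BLANK, 0)),  # only marked symbols remain
}
delta = {key: nxt for key, (nxt, _) in RULES.items()}  # delta: Q x O -> Q
lam = {key: act for key, (_, act) in RULES.items()}    # lambda: Q x O -> A


class TapeWorld:
    """Environment that persists the agent's writes and shows them back."""

    def __init__(self, word):
        self.cells = dict(enumerate(word))
        self.head = 0

    def observe(self):
        return self.cells.get(self.head, BLANK)

    def apply(self, action):
        symbol, move = action
        self.cells[self.head] = symbol
        self.head += move


def run(word, max_steps=10_000):
    """Return True iff the closed loop reaches the accepting state."""
    world, state = TapeWorld(word), "q0"
    for _ in range(max_steps):
        if state == "acc":
            return True
        obs = world.observe()
        if (state, obs) not in delta:  # undefined transition: reject
            return False
        action = lam[(state, obs)]
        state = delta[(state, obs)]
        world.apply(action)
    raise RuntimeError("step budget exhausted")


print(run("aabb"), run("aab"))
```

All the unbounded memory lives in `TapeWorld.cells`; the agent itself never holds more than one of its five states.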
The point of the FSRA model in this post is to isolate finite internal memory as a resource. To decide whether the closed-loop system is finite-state, you must also model the environment. If observations do not depend on past agent actions (or the environment does not persist agent writes), then the agent’s only memory is Q, and Theorem 3.13 applies.