Thanks for your question. I may be misunderstanding what you’re pointing at, so apologies in advance if I’m replying slightly off target.
In my setup, interaction with the environment is entirely captured by the observation sequence. The agent does not get access to any external writable memory or oracle beyond the current observation symbol and its own finite internal state.
Crucially, the only memory it has is its finite state set Q. The environment may be arbitrarily complex, but all that complexity reaches the agent only through the finite observation alphabet O, symbol by symbol. There is no tape, no unbounded working memory, and no way to encode arbitrary computation internally.
That’s the key difference from a Turing machine. A TM interacting with an environment can still have access to an unbounded tape, so its computational power comes from its memory model, not from whether it is “interactive.” In contrast, an FSRA has bounded internal memory by construction (finite Q), so it can only implement finite-memory transductions (equivalently, functions with finite Myhill–Nerode index, as in Theorem 3.13).
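If it helps to make the model concrete, here is a minimal sketch of the FSRA loop (the Python interface, names, and the parity example are my own illustration, not from the report): the only state that persists across steps is one element of a finite Q.

```python
# Minimal FSRA sketch (interface and example are my own illustration):
# finite state set Q, fixed transition map delta: Q x O -> Q and
# output map lam: Q x O -> A. All environment complexity reaches the
# agent one observation symbol at a time; the only persistent memory
# is the current element of Q.

class FSRA:
    def __init__(self, Q, delta, lam, q0):
        self.Q = Q                       # finite state set
        self.delta, self.lam = delta, lam
        self.q = q0                      # the ONLY memory: one state in Q

    def step(self, obs):
        action = self.lam(self.q, obs)   # one action symbol per observation
        self.q = self.delta(self.q, obs)
        return action

# Example: a 2-state agent reporting the parity of 'b's observed so far.
agent = FSRA(Q={0, 1},
             delta=lambda q, o: q ^ (o == "b"),
             lam=lambda q, o: "odd" if q ^ (o == "b") else "even",
             q0=0)
print([agent.step(o) for o in "abba"])   # ['even', 'odd', 'even', 'even']
```

However complex the observation stream, the agent's future behavior depends on the history only through which of the |Q| states it currently occupies, which is exactly what Theorem 3.13 exploits.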
Vardhan
Hi, I’m Vardhan, one of the Dovetail fellows this winter. Thanks Alex & Alfred for running this!
Background: I study mathematics and computer science (probability, algorithms, game theory) and I’m interested in formal models of agents and multi-agent interaction.
For the fellowship, I looked at the question: Which agents can be faithfully described by finite automata / finite transducers, and which structural properties make that more or less likely? In other words, when can an agent’s externally observable behavior be captured by a finite (possibly stochastic) automaton, and what observable signatures indicate that a finite-state model is impossible or misleading?
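To give one concrete observable signature of the second kind (a toy example of mine, not from the report): if an agent's behavior admits an unbounded family of histories that are pairwise distinguishable by some continuation, the Myhill–Nerode index is infinite and no finite-state model can be faithful.

```python
# Toy example (mine): the predicate "equal numbers of a's and b's" has
# infinite Myhill-Nerode index, so no finite automaton computes it.
# The prefixes a, aa, aaa, ... are pairwise distinguishable.

def f(w):
    return w.count("a") == w.count("b")

def distinguishes(u, v, suffix):
    return f(u + suffix) != f(v + suffix)

# a^i and a^j (i != j) are separated by the suffix b^i:
for i in range(1, 6):
    for j in range(1, 6):
        if i != j:
            assert distinguishes("a" * i, "a" * j, "b" * i)
```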
I’ve written a brief report summarizing definitions, toy examples, and some light lemmas. I’m planning a longer post with formal definitions, more examples, and proofs. I’d really appreciate recommendations on literature I may have missed (especially anything linking automata/dynamical-systems perspectives to algorithmic information theory, ergodic theory, or learning theory). Comments, questions, and pointers very welcome!
Two quick clarifications that I should state more explicitly here:
- An FSRA bounds the agent’s internal memory: it has a finite state set Q and fixed transition and output maps δ, λ. The agent emits one action symbol per observation symbol; actions are part of the model (finite alphabet A).
- What I did not assume (unless explicitly stated) is any particular model of the environment’s memory or persistence. If the environment can record the agent’s actions and later reveal them back as observations (i.e. the environment is effectively a persistent writable tape), then an FSRA interacting with that environment can indeed implement behavior that a standalone finite automaton cannot; the joint agent+environment system can be (arbitrarily) powerful (see the sketch after this list).
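Here is a minimal sketch of that second point (the counter environment and all names are my own toy illustration, not from the report): a three-state finite control coupled to an environment that persists its writes, a single unbounded counter, decides {aⁿbⁿ}, which no standalone finite automaton can.

```python
# Sketch (my own toy, not from the report): a finite-control agent plus a
# persistent environment (one unbounded counter) decides { a^n b^n },
# which no standalone finite automaton can. The agent never sees the
# count itself -- its observation alphabet stays finite: the input
# letter plus the counter's sign.

class CounterEnv:
    """Environment that persists agent writes: a single unbounded counter."""
    def __init__(self):
        self.count = 0

    def step(self, action):
        if action == "inc":
            self.count += 1
        elif action == "dec":
            self.count -= 1
        # Finite observation: only the sign of the counter is revealed.
        return "zero" if self.count == 0 else "pos" if self.count > 0 else "neg"

def accepts(word):
    """Three-state finite control: A (reading a's), B (reading b's), DEAD."""
    env, state, obs = CounterEnv(), "A", "zero"
    for ch in word:
        if state == "A" and ch == "a":
            obs = env.step("inc")
        elif state in ("A", "B") and ch == "b":
            state, obs = "B", env.step("dec")
        else:
            state = "DEAD"
        if state == "DEAD" or obs == "neg":   # 'a' after 'b', or excess b's
            return False
    return obs == "zero"                      # balanced iff counter is zero

assert accepts("") and accepts("ab") and accepts("aaabbb")
assert not (accepts("aab") or accepts("ba") or accepts("abab"))
```

All the unbounded memory here lives in the environment; the agent itself is still a bona fide FSRA.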
The point of the FSRA model in this post is to isolate finite internal memory as a resource. To decide whether the closed-loop system is finite-state, you must also model the environment. If observations do not depend on past agent actions (i.e. the environment does not persist agent writes), then the agent’s only memory is Q, and Theorem 3.13 applies.
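And a quick sketch of the benign case, under the extra assumption (mine, stronger than the theorem needs) that the environment is itself finite-state as well as action-oblivious: the closed loop then lives on the finite product Q × Q_env, so nothing unbounded can hide in the interaction.

```python
# Sketch (assumption mine: environment is finite-state AND ignores actions).
# The closed loop is then an autonomous finite system on Q x Q_env.

def reachable_joint(q0, e0, delta, env_next, env_obs):
    """Enumerate reachable joint states of agent + oblivious environment."""
    seen, frontier = {(q0, e0)}, [(q0, e0)]
    while frontier:
        q, e = frontier.pop()
        o = env_obs(e)                        # observation ignores actions
        nxt = (delta(q, o), env_next(e))
        if nxt not in seen:
            seen.add(nxt)
            frontier.append(nxt)
    return seen

# Toy instance: parity-tracking agent, period-3 environment 0,1,1,0,1,1,...
delta    = lambda q, o: q ^ o                 # agent tracks parity of 1s
env_next = lambda e: (e + 1) % 3              # env transition, no action input
env_obs  = lambda e: [0, 1, 1][e]

joint = reachable_joint(0, 0, delta, env_next, env_obs)
assert len(joint) <= 2 * 3                    # closed loop is finite-state
```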