Thanks so much for writing this. I think it's a much-needed (perhaps even slightly overdue) contribution connecting static views of GPT-based LLMs to dynamical systems and predictive processing. I do research on empirical agency, and it still surprises me how little the AI-safety community engages with this central part of agency: that you can't have agents without a closed perception-action loop.
I've been speculating a bit (mostly to myself) about the possibility that "simulators" are already a type of organism, given that they appear to do active inference, which is the main driving force behind nervous-system evolution. Simulators seem to live in an in-between regime: (i) during training they behave like (sensory) agents, because they learn to predict outcomes and "experience" the effect of their predictions; but (ii) during inference/prediction they generally receive no feedback. As you point out, all of this speculation may be moot, since many people are moving quite fast toward embedding simulators, giving them memory, and so on.
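To make the asymmetry I have in mind concrete, here is a minimal sketch (assuming a standard autoregressive setup; the model interface and function names are illustrative, not anything from the post):

```python
import torch
import torch.nn.functional as F

def training_step(model, tokens, optimizer):
    """Training: every prediction is immediately met with feedback (the loss),
    so the predict-then-experience loop is closed at each step."""
    logits = model(tokens[:-1])                  # next-token predictions, shape (T-1, vocab)
    loss = F.cross_entropy(logits, tokens[1:])   # feedback on those predictions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def generate(model, prompt_tokens, n_steps):
    """Inference: the model keeps predicting, but nothing pushes back;
    no error signal ever closes the loop."""
    tokens = list(prompt_tokens)
    for _ in range(n_steps):
        logits = model(torch.tensor(tokens))
        tokens.append(int(logits[-1].argmax()))  # commit to a token, receive no feedback
    return tokens
```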
What is your opinion on this idea of "loosening up" our definition of agents? I spoke to Max Tegmark a few weeks ago, and my position is that we may be thinking about organisms from a time-chauvinist position, where we require the loop to be closed quickly (e.g., on the order of one second for most biological organisms).
Thanks for the comment. I agree broadly, of course, but the paper says more specific things. For example, agency needs to be prioritized, probably taken outside of standard optimization; otherwise decimating pressure is applied to other concepts, including truth and other "human values". The other part is an empirical one, also related to your concern: human values are quite flexible, and biology doesn't create hard bounds or limits on their depletion. If you couple that with ML/AI technologies that will predict what we do next, then approaches that depend (broadly) on human intent and values are no longer as safe.