Steering as Dual to Learning
I’ve been a bit confused about “steering” as a concept. It seems kinda dual to learning, but why? It seems like things which are good at learning are very close to things which are good at steering, but they don’t always end up steering. It also seems like steering requires learning. What’s up here?
I think steering is basically learning, backwards, and maybe flipped sideways. In learning, you build up mutual information between yourself and the world; in steering, you spend that mutual information. You can have learning without steering—but not the other way around—because of the way time works.
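A toy sketch of this (the coin-bit setup and all names here are my own illustration, not from the post): learning copies a world variable into memory, which builds mutual information; steering then uses that stored bit to pick the action that drives the world to a target, which is exactly where the information gets spent.

```python
# Toy sketch: learning builds mutual information between agent and world;
# steering spends it to force the world into a target state.
import random
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Empirical mutual information (in bits) between two discrete variables."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * log2((c / n) / (px[x] / n * py[y] / n))
               for (x, y), c in pxy.items())

random.seed(0)
# learning: the agent copies each hidden world bit into its belief
trials = [(w, w) for w in (random.randint(0, 1) for _ in range(10_000))]
mi = mutual_information(trials)
print(f"MI after learning: {mi:.3f} bits")  # ~1 bit built up

# steering: use the belief to choose the action (flip or not) that
# sends the world to the target, whatever the world started as
target = 0
steered = [world ^ (belief ^ target) for world, belief in trials]
assert all(w == target for w in steered)
```

Without the belief, no fixed action works for every starting state; the per-instance action is precisely where the stored bit gets cashed out.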
This also lets us map certain things to one another: the effectiveness of methods like Monte Carlo tree search (i.e. calling your world model repeatedly to form a plan) can be seen as dual to the effectiveness of things like randomized controlled trials (i.e. querying the external world repeatedly to form a good model).
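To make the mapping concrete, here is a toy two-armed-bandit sketch (the setup and numbers are my own, not from the post): the learning loop repeatedly queries the world to build a model, and the planning loop is structurally identical but repeatedly queries the model to build a plan.

```python
# Toy sketch: RCT-flavored learning vs. Monte-Carlo-flavored planning.
import random

random.seed(1)
TRUE_P = {"a": 0.3, "b": 0.7}   # hidden world: payout probability per arm

def pull(arm):
    """One real-world experiment."""
    return 1 if random.random() < TRUE_P[arm] else 0

# RCT-style learning: repeated queries of the external world -> a model
model = {arm: sum(pull(arm) for _ in range(2000)) / 2000 for arm in TRUE_P}

def simulate(arm):
    """One imagined experiment, run against the model instead of the world."""
    return 1 if random.random() < model[arm] else 0

# Monte-Carlo-style planning (one-step rollouts): repeated queries
# of the model -> a plan
scores = {arm: sum(simulate(arm) for _ in range(2000)) / 2000 for arm in model}
plan = max(scores, key=scores.get)
print(plan)
```

The two loops are line-for-line the same sampling procedure; the only difference is whether the thing being sampled is the world or the model of it.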
See also this paper about plasticity as dual to empowerment: https://arxiv.org/pdf/2505.10361v2
I’m just going from pure word vibes here, but I’ve read about Todorov’s duality between prediction and control: https://roboti.us/lab/papers/TodorovCDC08.pdf
Alternatively: for learning, your brain can start out in any given configuration, and it will end up in the same (small set of) final configurations, ones that reflect the world; for steering, the world can start out in any given configuration, and it will end up in the same small set of target configurations.
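This many-to-one picture can be sketched as the same contraction run twice (the update rule and constants below are my own toy choices): once on the belief for learning, once on the world state for steering.

```python
# Toy sketch: one contraction, two directions.
world_value = 4.2   # what the world is like
target = 4.2        # what we want the world to be like

# learning: any starting brain configuration ends up at the same model
for init in (-10.0, 0.0, 7.0):
    belief = init
    for _ in range(500):
        belief += 0.1 * (world_value - belief)   # belief chases the world
    assert abs(belief - world_value) < 1e-6

# steering: any starting world configuration ends up at the same target
for init in (-10.0, 0.0, 7.0):
    state = init
    for _ in range(500):
        state += 0.1 * (target - state)          # world chases the goal
    assert abs(state - target) < 1e-6

print("both loops contract to the same point")
```

The update is literally identical; only which side of the agent/world boundary gets moved differs, which is the sense in which the two are duals.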
It seems like some amount of steering without learning is possible (open-loop control): you can reduce entropy in a subsystem while increasing entropy elsewhere, so that information is conserved overall.
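A minimal sketch of that entropy bookkeeping (the erasure framing is my own illustration): an open-loop "reset" applies the same fixed operation to every bit without ever sensing it, driving the subsystem to a known state while the dumped entropy shows up in the environment.

```python
# Toy sketch: open-loop reset conserves total entropy.
import random
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

random.seed(2)
bits = [random.randint(0, 1) for _ in range(1000)]   # subsystem: unknown bits
env = [0] * len(bits)                                # environment: known state

# open-loop reset: the same fixed swap for every bit, no measurement involved;
# the subsystem ends up in a known state, the environment absorbs the entropy
bits, env = env, bits

print(f"subsystem: {entropy(bits):.3f} bits, environment: {entropy(env):.3f} bits")
```

The subsystem's entropy drops to zero and the environment's rises by the same amount; no mutual information with the subsystem was ever needed, which is what makes this steering-without-learning.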