One can always reparameterize any given input / output mapping as a search for the minima of some internal energy function, without changing the mapping at all.
The main criteria to think about is whether an agent will use creative, original strategies to maximize inner objectives, strategies which are more easily predicted by assuming the agent is “deliberately” looking for extremes of the inner objectives, as opposed to basing such predictions on the agent’s past actions, e.g., “gather more computational resources so I can find a high maximum”.
One can always reparameterize any given input / output mapping as a search for the minima of some internal energy function, without changing the mapping at all.
The main criteria to think about is whether an agent will use creative, original strategies to maximize inner objectives, strategies which are more easily predicted by assuming the agent is “deliberately” looking for extremes of the inner objectives, as opposed to basing such predictions on the agent’s past actions, e.g., “gather more computational resources so I can find a high maximum”.