Hm, if we apply the maximum entropy principle universally, aren’t we also obliged to apply it reflectively, i.e., to model oneself as a maximum-entropy (active inference) agent?
If you precisely define what it means to apply it “universally” such that it gets you the desired behavior, sure. And to be clear, I’m not saying that’s a hard/impossible problem or anything like that, it’s just not directly implied by all things which match the description “follows the principle of maximum entropy.”
It looks more like a suitable inductive bias is needed, rather than causal surgery.
If you were actually trying to implement this, yes, I wouldn’t recommend routing through weird counterfactuals. (I just bring those up as a way of describing the target behavior.)
In fact, because even the version I outlined in the added footnote can still suffer from collapse in the case of convergent acausal strategies across possible predictors, I would indeed strongly recommend pushing for some additional bias that gives you more control over how the distribution looks. I think that’s pretty tractable, too.
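To make the "additional bias" idea concrete: one standard way to get control over how the distribution looks is to replace plain entropy maximization with minimizing KL-divergence to a reference prior (so the prior is the bias). This is a minimal sketch under that assumption, not anything from the discussion above; the function name and the bisection-on-the-multiplier approach are just illustrative choices:

```python
import math

def maxent(values, target_mean, prior=None):
    """Distribution over `values` with E[x] = target_mean.

    With prior=None this is the ordinary max-entropy solution,
    p_i proportional to exp(-lam * x_i). Passing a prior q instead
    minimizes KL(p || q) subject to the same mean constraint,
    giving p_i proportional to q_i * exp(-lam * x_i) -- the prior
    acts as the "bias" shaping the resulting distribution.
    """
    q = prior or [1.0] * len(values)

    def dist(lam):
        w = [qi * math.exp(-lam * x) for qi, x in zip(q, values)]
        z = sum(w)
        return [wi / z for wi in w]

    # Bisect on the Lagrange multiplier: larger lam lowers the mean.
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = (lo + hi) / 2
        p = dist(mid)
        m = sum(pi * x for pi, x in zip(p, values))
        if m > target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)
```

With a uniform prior over {0, 1, 2, 3} and target mean 1.5, this recovers the uniform distribution; swapping in a skewed prior keeps the same mean constraint satisfied while reshaping the distribution, which is the kind of extra handle the comment is pointing at.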