there is “how true they are”, i.e. the amount of noise you have to fudge the model with to get the lowest loss (maybe KL?) in expectation.
This depends on the data distribution though, which could vary greatly (and in fact the data you collect will vary based on your actions, which in turn are based on your models).
So I think a lot of the action is in defining which loss we care about.
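
To make the dependence on the data distribution concrete, here is a rough sketch. It assumes "fudging with noise" means mixing the model's predicted distribution with a uniform one at weight `eps`, and that the loss is KL from the data distribution to the fudged model. Both choices, and all the names (`mix_with_uniform`, `best_noise`, the toy distributions), are illustrative assumptions rather than anything fixed by the argument above.

```python
import math

def mix_with_uniform(probs, eps):
    """Fudge a categorical prediction: weight (1 - eps) on the model, eps on uniform."""
    k = len(probs)
    return [(1 - eps) * p + eps / k for p in probs]

def kl(p, q):
    """KL(p || q) for categorical distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def best_noise(data_dist, model_probs, grid=101):
    """Grid-search the noise level in [0, 1] that minimizes KL(data || fudged model)."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return min(candidates, key=lambda e: kl(data_dist, mix_with_uniform(model_probs, e)))

# One overconfident model, scored against two hypothetical data distributions.
model = [0.98, 0.01, 0.01]
data_a = [0.98, 0.01, 0.01]  # the world the model happens to match
data_b = [0.60, 0.20, 0.20]  # a world where the model is overconfident

eps_a = best_noise(data_a, model)
eps_b = best_noise(data_b, model)
print("noise needed under data_a:", eps_a)
print("noise needed under data_b:", eps_b)
```

The same model needs no fudging under one distribution and a lot under the other, so the "how true" number is only defined relative to the distribution (and loss) you pick.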