tentative claim: there are models of the world, which make predictions, and there is “how true they are”, which is the amount of noise you fudge the model with to get lowest loss (maybe KL?) in expectation.
E.g. “the grocery store is 500m away” corresponds to “my distribution over the distance to the grocery store is centered at 500m, but has some amount of noise”.
related to the claim that “all models are meta-models”, in that they are objects capable of e.g. evaluating how applicable they are for making a given prediction. E.g. “Newtonian mechanics” also carries along with it information about how, if things are moving too fast, you need to add more noise to its predictions, i.e. it’s less true/applicable/etc.
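A minimal sketch of the grocery-store example, under assumptions that are mine rather than the post's: the model is a Gaussian centered on its point prediction, the loss is average negative log-likelihood (which differs from KL to the data distribution only by a term that doesn't depend on the model), and the observations are made up. Under those assumptions the loss-minimizing noise level is just the root-mean-square error of the point prediction. The names `avg_nll` and `best_sigma` are hypothetical helpers.

```python
import math
import random

def avg_nll(observations, center, sigma):
    """Average negative log-likelihood of the data under N(center, sigma^2)."""
    var = sigma ** 2
    return sum(
        0.5 * math.log(2 * math.pi * var) + (x - center) ** 2 / (2 * var)
        for x in observations
    ) / len(observations)

def best_sigma(observations, center):
    """Noise level minimizing avg_nll for a fixed center: the RMS error."""
    mse = sum((x - center) ** 2 for x in observations) / len(observations)
    return math.sqrt(mse)

random.seed(0)
# Made-up data (an assumption): the store is really ~520m away,
# and each measurement has ~30m of jitter.
observations = [random.gauss(520.0, 30.0) for _ in range(10_000)]

# How much noise does the claim "the store is 500m away" need?
sigma_star = best_sigma(observations, center=500.0)
print(f"best noise level for the 500m model: {sigma_star:.1f}m")
print(f"loss at that noise level: {avg_nll(observations, 500.0, sigma_star):.3f}")
```

With this setup the answer should land near sqrt(20² + 30²) ≈ 36m: the fudge has to cover both the model's 20m bias and the measurement jitter. A model centered at 520m would need less noise, which is the sense in which it would be “more true”.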
there is “how true they are”, which is the amount of noise you fudge the model with to get lowest loss (maybe KL?) in expectation
This depends on the data distribution though, which could vary greatly (and in fact the data you collect will vary based on your actions which in turn are based on your models).
So I think a lot of the action is in defining which loss we care about.
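To illustrate the dependence on the data distribution, here is a rough sketch using the Newtonian example. The assumptions are mine: relativistic momentum stands in for ground truth, the noise is Gaussian on the relative error (so the loss-minimizing noise scale is the RMS relative error), and the two speed distributions are invented. `best_relative_noise` is a hypothetical helper.

```python
import math
import random

C = 299_792_458.0  # speed of light, m/s

def newtonian_momentum(mass, v):
    return mass * v

def relativistic_momentum(mass, v):
    # Treated as ground truth for this sketch.
    return mass * v / math.sqrt(1.0 - (v / C) ** 2)

def best_relative_noise(speeds, mass=1.0):
    """RMS relative error of the Newtonian prediction over the given speeds."""
    squared_errors = []
    for v in speeds:
        truth = relativistic_momentum(mass, v)
        squared_errors.append(((newtonian_momentum(mass, v) - truth) / truth) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

random.seed(0)
# Two invented data distributions (assumptions): everyday speeds vs. near-light speeds.
everyday = [random.uniform(1.0, 1_000.0) for _ in range(1_000)]
fast = [random.uniform(0.1 * C, 0.9 * C) for _ in range(1_000)]

noise_everyday = best_relative_noise(everyday)
noise_fast = best_relative_noise(fast)
print(f"noise needed on everyday data: {noise_everyday:.2e}")
print(f"noise needed on fast data:     {noise_fast:.2e}")
```

Same model, same loss, but the “how true” number differs by many orders of magnitude depending on which speeds show up in the data. If your actions determine which speeds you ever measure, they also determine how true the model looks to you.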
So perhaps degree of truth ≈ inverse of the noise, i.e. inverse variance (precision)?
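One way to make the noise/variance/truth relationship concrete, assuming (my assumption, not the thread's) a Gaussian model $q_\sigma = \mathcal{N}(\mu, \sigma^2)$ with fixed point prediction $\mu$ and data drawn from some distribution $p$ with a density:

$$
\begin{aligned}
\mathbb{E}_{x \sim p}\left[-\log q_\sigma(x)\right] &= \tfrac{1}{2}\log\left(2\pi\sigma^2\right) + \frac{\mathbb{E}_{p}\left[(x-\mu)^2\right]}{2\sigma^2} \\
\frac{\partial}{\partial \sigma^2}\,\mathbb{E}_{p}\left[-\log q_\sigma(x)\right] = 0 \;&\Longrightarrow\; \sigma_*^2 = \mathbb{E}_{p}\left[(x-\mu)^2\right] \\
\mathbb{E}_{p}\left[-\log q_\sigma(x)\right] &= H(p) + \mathrm{KL}\left(p \,\|\, q_\sigma\right)
\end{aligned}
$$

Since $H(p)$ doesn't depend on $\sigma$, the same $\sigma_*$ minimizes the KL. On this reading the optimal noise is the variance around the prediction, and “degree of truth” would be something like the precision $1/\sigma_*^2$, which still depends on $p$.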