For the record, the VNM theorem is a statement about *maximizing expected utility*, and all three words matter, not just the utility function part. The main constraint the theorem imposes is this: assuming there is a “true” probability distribution over outcomes (or the agent has a well-calibrated belief over outcomes that captures all the information it has about the environment), the agent must choose actions in a way consistent with maximizing the expectation of some real-valued function of the outcome. That does in fact rule out some possibilities.

It’s only when you don’t have a probability distribution that the VNM theorem becomes contentless. So one check for whether it’s “reasonable” to apply the VNM theorem is to see what happens in a deterministic environment that the agent can model perfectly: the VNM theorem shouldn’t add any force to the argument in that setting.
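To make the contrast concrete, here is a minimal sketch of both cases. The three outcomes, the four lotteries (the standard Allais paradox setup), and every function name are illustrative assumptions, not something from the discussion above. In the deterministic case, any strict ordering over outcomes is reproduced by a utility function that just assigns ranks, so "has a utility function" rules nothing out. With lotteries, the preference pattern "A over B and D over C" cannot be produced by any utility function under expected-utility maximization.

```python
from fractions import Fraction as F
from itertools import permutations
import random

# Hypothetical outcomes and lotteries (the classic Allais setup).
OUTCOMES = ["$0", "$1M", "$5M"]
A = {"$1M": F(1)}
B = {"$0": F(1, 100), "$1M": F(89, 100), "$5M": F(10, 100)}
C = {"$0": F(89, 100), "$1M": F(11, 100)}
D = {"$0": F(90, 100), "$5M": F(10, 100)}


def expected_utility(lottery, u):
    """Expectation of utility u under a lottery (outcome -> probability)."""
    return sum(p * u[o] for o, p in lottery.items())


def fits_allais(u):
    """True if u ranks A above B and D above C by expected utility."""
    return (expected_utility(A, u) > expected_utility(B, u)
            and expected_utility(D, u) > expected_utility(C, u))


def rank_utility(ordering):
    """Utility for a deterministic ordering, listed worst to best."""
    return {outcome: rank for rank, outcome in enumerate(ordering)}


# Deterministic case: every possible ordering is representable.
deterministic_ok = all(
    sorted(OUTCOMES, key=rank_utility(order).get) == list(order)
    for order in permutations(OUTCOMES)
)

# Probabilistic case: search random utilities for one that fits Allais.
random.seed(0)
trials = [{o: F(random.randint(-1000, 1000)) for o in OUTCOMES}
          for _ in range(10_000)]
allais_ok = any(fits_allais(u) for u in trials)

print(deterministic_ok)  # True
print(allais_ok)         # False
```

The random search is only a sanity check. The underlying reason is algebraic: for any utility function, EU(A) - EU(B) and EU(D) - EU(C) sum to exactly zero, so they cannot both be positive.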

In theory, never (either hyperbolic time discounting is a bias, and never “should” be done, or it’s a value, but one that longtermists explicitly don’t share).

In practice, hyperbolic time discounting might be a useful heuristic. For example, perhaps because we are bad at thinking of all the ways that our plans can go wrong, we tend to overestimate the amount of stuff we’ll have in the future, and hyperbolic time discounting corrects for that.
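For context on the "bias" reading, the usual objection to hyperbolic discounting is that it is time-inconsistent: the same pair of rewards can be ranked differently depending on how far away they are. The sketch below illustrates this with assumed numbers (rewards of 100 and 110 one period apart, a hyperbolic parameter `k = 1.0`, an exponential factor `delta = 0.8`); the function names and the `1 / (1 + k * delay)` form are the common textbook choices, not something specified above.

```python
def hyperbolic(amount, delay, k=1.0):
    """Hyperbolic discounting: value falls off as 1 / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def exponential(amount, delay, delta=0.8):
    """Exponential discounting: value falls off as delta ** delay."""
    return amount * delta ** delay


def prefers_later(discount, start):
    """True if 110 at time start + 1 is valued above 100 at time start."""
    return discount(110, start + 1) > discount(100, start)


for start in (0, 10):
    print(start, prefers_later(hyperbolic, start), prefers_later(exponential, start))
```

Under these assumptions the hyperbolic discounter takes the smaller reward when it is immediate (`start = 0`) but prefers the larger, later reward when both are ten periods away, while the exponential discounter makes the same choice at both horizons.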