It seems to me that the optimal schedule for using up your slack / resources depends on risk. When planning for the future, there’s always the possibility that some unknown unknown interferes. To maximize the total Intrinsically Good Stuff you get to do, you have to take into account timelines where all the ants’ planning is for nought and the grasshopper actually has the right idea. It doesn’t seem right to ever put zero credence in this: that would mean being totally certain that the project of saving up resources for cosmic winter will go perfectly smoothly, and we can’t be certain of something that will literally take trillions of years. It is therefore actually optimal to always put some of your resources into living for right now, in proportion to your uncertainty about the project’s success.

The math doesn’t necessarily work out that way. If you value the good stuff linearly, the optimal course of action will either be to spend all your resources right away (because the risk makes the effective discount rate too high) or to save everything for later (because you can get such a high return on investment that spending any now would be wasteful). Even in a more realistic case where utility is logarithmic in, for example, computation, anticipation of much higher efficiency in the far future could lead to the optimal choice being to use essentially the bare minimum right now.
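To make the linear case concrete, here is a minimal sketch under a toy two-period model that is assumed purely for illustration: you spend a fraction `c` of your resources now, and the rest grows by a factor `growth` but only reaches you with probability `survival_prob`. The function name and both parameters are hypothetical. It covers only the linear-utility claim, not the logarithmic one.

```python
def best_spend_now(survival_prob, growth, grid=101):
    """Fraction of resources to spend now that maximizes expected linear utility.

    Toy two-period model (an assumption, not from the discussion above):
    expected utility = c + survival_prob * growth * (1 - c).
    """
    best_c, best_u = 0.0, float("-inf")
    for i in range(grid):
        c = i / (grid - 1)  # candidate fraction spent now, from 0.0 to 1.0
        u = c + survival_prob * growth * (1.0 - c)
        if u > best_u:
            best_c, best_u = c, u
    return best_c


# Risky future, modest growth: survival_prob * growth < 1, so spend it all now.
print(best_spend_now(survival_prob=0.5, growth=1.5))
# Safe future, strong growth: survival_prob * growth > 1, so save everything.
print(best_spend_now(survival_prob=0.9, growth=2.0))
```

Because expected utility is a straight line in `c`, the maximum always sits at one end of the interval; there is no interior "some now, some later" optimum unless `survival_prob * growth` is exactly 1.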

I think there are reasonable arguments for putting some resources toward a good life in the present, but they mostly involve not being able to realistically pull off total self-deprivation for an extended period of time. So finding the right balance is difficult, because our thinking is naturally biased toward enjoying ourselves right now. How do you “cancel out” this bias while still accounting for the limits of your ability to maintain motivation? Seems like a tall order to achieve just by introspection.

Ooh! I don’t know much about the theory of reinforcement learning, could you explain that more / point me to references? (Also, this feels like it relates to the real reason for the time-value of money: money you supposedly will get in the future always has a less than 100% chance of actually reaching you, and is thus less valuable than money you have now.)

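Exactly this. In RL, this is the relationship between the discount factor and the probability of transitioning into an absorbing state (death): a discount factor of γ behaves like a per-step survival probability of γ, i.e. a per-step chance of 1 − γ of dying.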
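A quick numerical check of that equivalence, as a sketch under assumed toy conditions: the agent collects a constant reward each step, and after each step it dies with a fixed probability. The function names, the constant reward, and the truncation horizon are all hypothetical choices for illustration.

```python
def discounted_return(reward, gamma, horizon=10_000):
    """Sum of gamma**t * reward for t = 0..horizon-1 (pure discounting, no death)."""
    return sum(reward * gamma**t for t in range(horizon))


def expected_return_with_death(reward, death_prob, horizon=10_000):
    """Expected undiscounted total reward when the agent collects `reward`
    each step and then dies with probability `death_prob`."""
    total = 0.0
    alive = 1.0  # probability of having survived all previous steps
    for n in range(1, horizon + 1):
        p_lifetime_n = alive * death_prob  # dies right after collecting reward n
        total += p_lifetime_n * n * reward  # a lifetime of n steps earns n * reward
        alive *= 1.0 - death_prob
    return total


death_prob = 0.05
a = discounted_return(reward=1.0, gamma=1.0 - death_prob)
b = expected_return_with_death(reward=1.0, death_prob=death_prob)
print(a, b)  # both should be close to reward / death_prob
```

The two numbers agree because both sums converge to `reward / death_prob`: discounting by γ each step is the same, in expectation, as not discounting at all but facing a 1 − γ chance per step of falling into the absorbing state.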

