I’m pretty sure that decision theories are not designed on that basis.
You are wrong. In fact, this is a totally standard thing to consider, and “avoid back-chaining defection in games of fixed length” is a known problem, with various known strategies.
“Willpower is not exhaustible” is not necessarily the same claim as “willpower is infallible”. If, for example, you have a flat 75% chance of turning down sweets each time they are offered, then avoiding sweets still makes you less likely to eat them. You’re not spending willpower; it’s just inherently unreliable.
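To make the arithmetic concrete, here is a rough sketch. It assumes each temptation is an independent trial with the same 75% success rate; the independence is my assumption, and the function name `p_resist_all` is made up for illustration.

```python
def p_resist_all(p_resist: float, exposures: int) -> float:
    """Chance of turning down sweets on every one of `exposures` occasions.

    Assumes each occasion is independent with the same success probability
    (an illustrative assumption, not a claim about real willpower).
    """
    return p_resist ** exposures


# The per-occasion rate never changes (nothing is being "spent"),
# yet fewer exposures still means a better chance of never slipping.
for n in (0, 1, 3, 10):
    print(n, round(p_resist_all(0.75, n), 4))
```

Under those assumptions the chance of never slipping falls with every extra exposure even though the 75% figure stays fixed, which is the whole point: avoidance helps without any depletion.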