Parfit’s Hitchhiker is a problem for testing decision theories. It asks an agent to sacrifice utility at a later time in order to obtain utility at an earlier time: the sacrifice comes only after the benefit has already been received.
Suppose you’re out in the desert, running out of water, and soon to die—when someone in a motor vehicle drives up next to you. Furthermore, the driver of the motor vehicle is a perfectly selfish ideal game-theoretic agent, and even further, so are you; and what’s more, the driver is Paul Ekman, who’s really, really good at reading facial microexpressions. The driver says, “Well, I’ll convey you to town if it’s in my interest to do so—so will you give me $100 from an ATM when we reach town?”
Now of course you wish you could answer “Yes”, but as an ideal game theorist yourself, you realize that, once you actually reach town, you’ll have no further motive to pay off the driver. “Yes,” you say anyway. “You’re lying,” says the driver, and drives off, leaving you to die.
If only you weren’t so rational!
This is the dilemma of Parfit’s Hitchhiker, and the above is the standard resolution according to mainstream philosophy’s causal decision theory, which also two-boxes on Newcomb’s problem and defects in the [Twin] Prisoner’s Dilemma.
MIRI’s newest decision theory, Functional Decision Theory (FDT), notes that, while you’re talking to the driver, you already know whether or not you’re going to pay once you’re in town (the problem specifies this). More formally, you have a model of your future decision procedure, which you can run now to predict what you will do once you’re in town.

Therefore, if you follow FDT, you pay up once you’re in town. If you do so, your past model of your decision procedure also “pays”, which means “past you” predicts you’ll pay and can truthfully say “Yes” to Paul Ekman; Paul reads your microexpressions, believes you, and conveys you to town. If you wouldn’t pay up once you’re in town, “past you” predicts this and can’t truthfully say “Yes” to the driver, resulting in a lonely death in the desert.
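The prediction mechanism can be sketched in code. Here the driver’s (Ekman-grade) prediction is modeled, simplistically, as literally running the hitchhiker’s future decision procedure before deciding whether to offer the lift. All function names and utility numbers below are illustrative assumptions, not part of the original problem statement.

```python
# Toy model of Parfit's Hitchhiker (illustrative numbers and names).
LIFT_VALUE = 1_000_000  # assumed utility of being rescued
PAYMENT = 100           # cost of paying the driver in town

def cdt_decide_in_town():
    """Causal decision theory: once in town, paying causes only a loss,
    so the CDT agent refuses."""
    return "refuse"

def fdt_decide_in_town():
    """Functional decision theory: this same procedure is what the driver
    runs as a prediction, so its output also determines whether you are
    ever driven to town. Hence: pay."""
    return "pay"

def outcome(decision_procedure):
    """The driver's prediction is modeled as running the agent's own
    future decision procedure."""
    prediction = decision_procedure()
    if prediction == "pay":
        # Driver believes you, drives you to town; you then pay as predicted.
        return LIFT_VALUE - PAYMENT
    # Driver drives off; you are left in the desert.
    return 0

print(outcome(cdt_decide_in_town))  # 0: left to die
print(outcome(fdt_decide_in_town))  # 999900: rescued, minus the $100
```

The key design point is that `outcome` takes the decision procedure itself as an argument: the same function is evaluated both as the driver’s prediction and (implicitly) as the in-town choice, which is exactly the dependence FDT exploits.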