This seems to be a case of trying to find easy solutions to hard abstract problems at the cost of failing to be correct on easy and ordinary ones. It’s also fairly trivial to come up with abstract scenarios where this fails catastrophically, so it’s not like this wins on the abstract scenarios front either. It just fails on a new and different set of problems—ones that aren’t talked about because no-one’s ever found a way to fail on them before.
Also, all of the problems you list it solving are problems which I would consider to be satisfactorily solved already. Pascal’s mugging fails if the believability of the claim is impacted by the magnitude of the numbers in it, since the mugger can keep naming bigger numbers and simply suffer lower credibility as a result. The St Petersburg paradox is intellectually interesting but impossible to actually construct in practice given a finite universe (versions using infinite time are defeated by bounded utility within a time period and geometric future discounting). The Cauchy distribution is just one of many distributions with no mean; all that tells me is that it’s the wrong distribution to model the world with if you know the world should have a mean. And the repugnant conclusion, well, I can’t comment usefully about this because, “repugnant” or not, I’ve never viewed it as incorrect in the first place, so to me this potentially justifying smaller but happier populations is an error if anything.
I just think it’s worth making the point that the existing, complex solutions to these problems are a good thing. Complexity-influenced priors, careful handling of infinite numbers, bounded utility within a time period, geometric future discounting, integrable functions, and correct utility summation and zero-points are all things we want to be doing anyway, even when they’re not resolving a paradox! The paradoxes are good; they teach us things which circumventing the paradoxes in this way would not.
PS: People should feel free to correct my incomplete resolutions of those paradoxes, but be mindful of whether any errors or differences of opinion I might have actually undermine my point here or not.
Median utility does fail trivially. But it opens the door to other systems which might not. He just posted a refinement on this idea, Mean of Quantiles.
IMO this system is much more robust than expected utility. EU is required to trade away utility from the majority of possible outcomes to really rare outliers, like the mugger. Median utility will get you better outcomes at least 50% of the time. And tradeoffs like the one above will get you outcomes that are good in the majority of possible outcomes, ignoring rare outliers. I’m not satisfied it’s the best possible system, so the subject is still worth thinking about and debating.
I don’t think any of your paradoxes are solved. You can’t get around Pascal’s mugging by modifying your probability distribution. The probability distribution has nothing to do with your utility function or decision theory. Besides being totally inelegant and hacky, there might be practical consequences. For example, you can’t believe in the singularity now. The singularity could lead to vastly high utility futures, or really negative ones. Therefore, under that approach, its probability must be extremely small.
The St Petersburg casino is silly of course, but there’s no reason a real thing couldn’t produce a similar distribution: for example, a sequence of events, each dependent on the last and each with probability 1/2, that give increasing utility.
I do acknowledge that my comment was overly negative; certainly the ideas behind it might lead to something useful.
I think you misunderstand my resolution of the mugging (which is fair enough since it wasn’t spelled out). I’m not modifying a probability, I’m assigning different probabilities to different statements. If the mugger says he’ll generate 3 units of utility difference that’s a more plausible statement than if the mugger says he’ll generate 3^^^3, etc. In fact, why would you not assign a different probability to those statements? So long as the implausibility grows at least as fast as the value (and why wouldn’t it?) there’s no paradox.
Re St Petersburg, sure you can have real scenarios that are “similar”, it’s just that they’re finite in practice. That’s a fairly important difference. If they’re finite then the game has a finite value, you can calculate it, and there’s no paradox. In which case median utility can only give the same answer or an exploitably wrong answer.
The whole point of the Pascal’s Mugging scenario is that the probability doesn’t decrease faster than the reward. If, for example, you halve the probability for each additional bit it takes to describe the number, 3^^^3 still only takes a few bits to write down.
Do you believe it’s literally impossible that there is a matrix? Or that it can’t be 3^^^3 large? Because when you assign these things such low probability, you are basically saying they are impossible. No amount of evidence could convince you otherwise.
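To make the description-length point concrete, here is a rough sketch. The `up_arrow` helper and the choice of 8-bit ASCII as the "description language" are my own assumptions for illustration; 3^^^3 itself (a power tower of 3s that is 3^^3 levels high) is far too large to compute, so the sketch only goes as far as 3^^4.

```python
import math

def up_arrow(a: int, n: int, b: int) -> int:
    """Knuth's up-arrow a (n arrows) b. Only feasible for very small inputs."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

tower = up_arrow(3, 2, 3)            # 3^^3 = 3**(3**3) = 7625597484987

# Cost of *describing* 3^^4 in plain ASCII (one arbitrary encoding).
description_bits = 8 * len("3^^4")   # 32 bits

# Cost of *writing out* 3^^4 = 3**tower in binary, estimated without computing it.
binary_bits = math.floor(tower * math.log2(3)) + 1

# A prior that halves per description bit only penalises the claim by 2**-32.
prior_penalty = 2.0 ** -description_bits

print(f"3^^3 = {tower}")
print(f"'3^^4' takes {description_bits} bits to describe")
print(f"3^^4 takes roughly {binary_bits:.3e} bits to write out")
print(f"prior penalty from description length: {prior_penalty:.3e}")
```

Adding one more arrow costs one more character of description, while the number itself grows beyond anything a halve-per-bit penalty can offset.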
I think EY had the best counter argument. He had a fictional scenario where a physicist proposed a new theory that was simple and fit the data perfectly. But the theory also implies a new law of physics that could be exploited for computing power, and would allow unfathomably large amounts of computing power. And that computing power could be used to create simulated humans.
Therefore anyone alive today has a small probability of affecting large numbers of simulated people. By your reasoning, since that is impossible, the theory must be wrong. It doesn’t matter if it’s simple or if it fits the data perfectly.
If they’re finite then the game has a finite value, you can calculate it, and there’s no paradox. In which case median utility can only give the same answer or an exploitably wrong answer.
Even in the finite case, I believe it can grow quite large as the number of iterations increases. It’s one expected dollar each step, with each step having half the probability of the previous step and twice the reward.
Imagine the game goes for n finite steps. An expected utility maximizer would still spend $n to play the game. A median maximizer would say “You are never going to win in the lifetime of the universe and then some, so no thanks.” The median maximizer seems correct to me.
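For anyone who wants numbers, here is a small sketch of the capped game. I'm assuming the payout is 2^k dollars if the first heads lands on flip k, and nothing if no heads appears in n flips; the function names are just illustrative.

```python
from fractions import Fraction

def finite_st_petersburg(n: int):
    """(payoff, probability) pairs for a St Petersburg game capped at n flips."""
    outcomes = [(2 ** k, Fraction(1, 2 ** k)) for k in range(1, n + 1)]
    outcomes.append((0, Fraction(1, 2 ** n)))   # no heads in n flips
    return outcomes

def expected_value(outcomes):
    return sum(x * p for x, p in outcomes)

def median(outcomes):
    """Smallest payoff whose cumulative probability reaches 1/2."""
    total = Fraction(0)
    for x, p in sorted(outcomes):
        total += p
        if total >= Fraction(1, 2):
            return x

def prob_at_least(outcomes, threshold):
    return sum(p for x, p in outcomes if x >= threshold)

game = finite_st_petersburg(40)
print("expected payoff:", expected_value(game))      # 40
print("median payoff:", median(game))                # 2
print("chance of recovering a $40 stake:", float(prob_at_least(game, 40)))
```

With n = 40 the expected payoff is exactly $40, the median payoff is $2, and the chance of getting a $40 stake back is a little under 1 in 32. Which of those numbers should drive the decision is the disagreement here.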
Re St Petersburg, I will reiterate that there is no paradox in any finite setting. The game has a value. Whether you’d want to take a bet at close to the value of the game in a large but finite setting is a different question entirely.
And one that’s also been solved, certainly to my satisfaction. Logarithmic utility and/or the Kelly Criterion will both tell you not to bet if the payout is in money, and for the right reasons rather than arbitrary, value-ignoring reasons (in that they’ll tell you exactly what you should pay for the bet). If the payout is directly in utility, well, I think you’d want to see what mindbogglingly large utility looked like before you dismiss it. It’s pretty hard if not impossible to generate that much utility with logarithmic utility of wealth and geometric discounting. But even given that, a one in a trillion chance at a trillion worthwhile extra days of life may well be worth a dollar (assuming I believed it of course). I’d probably just lose the dollar, but I wouldn’t want to completely dismiss it without even looking at the numbers.
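As a sketch of what "exactly what you should pay" means under logarithmic utility of wealth: the function below finds, by bisection, the ticket price at which expected log wealth after playing equals log wealth from not playing. The capped game (2^k dollars if the first heads is on flip k, nothing after n tails) and the name `max_price_log_utility` are assumptions for illustration, not a standard API.

```python
import math

def max_price_log_utility(wealth: float, n: int, tol: float = 1e-6) -> float:
    """Largest ticket price a log-utility player with this wealth would pay."""
    # Assumed game: 2**k dollars if the first heads is on flip k, else nothing.
    outcomes = [(2.0 ** k, 0.5 ** k) for k in range(1, n + 1)]
    outcomes.append((0.0, 0.5 ** n))

    def gain(price: float) -> float:
        # Expected log wealth after playing, minus log wealth from declining.
        played = sum(p * math.log(wealth - price + x) for x, p in outcomes)
        return played - math.log(wealth)

    lo, hi = 0.0, wealth * (1.0 - 1e-9)   # price stays strictly below wealth
    if gain(hi) >= 0:
        return hi                         # would stake essentially everything
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gain(mid) > 0:
            lo = mid                      # still worth playing at this price
        else:
            hi = mid
    return lo

for wealth in (1e3, 1e6, 1e9):
    price = max_price_log_utility(wealth, n=40)
    print(f"wealth ${wealth:,.0f}: pay at most about ${price:.2f}")
```

The price rises only slowly with wealth and stays well below the $40 expected payoff, which is the sense in which log utility declines the bet while still assigning it a definite value.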
Re the mugging, well I can at least accept that there are people who might find this convincing. But it’s funny that people can be willing to accept that they should pay but still don’t want to, and then come up with a rationalisation like median maximising, which might not even pay a dollar for the mugger not to shoot their mother if they couldn’t see the gun. If you really do think it’s sufficiently plausible, you should actually pay the guy. If you don’t want to pay I’d suggest it’s because you know intuitively that there’s something wrong with the rationale and refuse to pay a tax on your inability to sort it out. Which is the role the median utility is trying to play here, but to me it’s a case of trying to let two wrongs make a right.
Personally though I don’t have this problem. If you want to define “impossible” as “so unlikely that I will correctly never account for it in any decision I ever make” then yes, I do believe it’s impossible and so should anyone. Certainly there’s evidence that could convince me, even rather quickly, it’s just that I don’t expect to ever see such evidence. I certainly think there might be new laws of physics, but new laws of physics that lead to that much computing power that quickly is something else entirely. But that’s just what I think, and what you want to call impossible is entirely a non-argument, irrelevant issue anyway.
The trap I think is that when one imagines something like the matrix, one has no basis on which to put an upper bound on the scale of it, so any size seems plausible. But there is actually a tool for that exact situation: the ignorance prior of a scale value, 1/n. Which happens to decay at exactly the same rate as the number grows. Not everyone is on board with ignorance priors but I will mention that the biggest problem with the 1/n ignorance prior is actually that it doesn’t decay fast enough! Which serves to highlight the fact that if you’re willing to have the plausibility decay even slower than 1/n, your probability distribution is ill-formed, since it can’t integrate to 1.
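Spelling out the normalisation point as a quick derivation (treating the scale n as continuous on [1, ∞), which is my simplification):

```latex
\int_{1}^{N} \frac{dn}{n} = \ln N \;\longrightarrow\; \infty \quad (N \to \infty),
\qquad
\int_{1}^{\infty} \frac{dn}{n^{1+\varepsilon}} = \frac{1}{\varepsilon} < \infty \quad (\varepsilon > 0).
```

So 1/n sits right on the boundary: it already fails to normalise, and anything decaying more slowly fails by more. If the plausibility of a claim of size n is bounded by p(n) ≤ c/n, then the expected stake p(n)·n ≤ c stays bounded however large a number the mugger names.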
Now to steel-man your argument, I’m aware of the way to cheat that. It’s by redistributing the values by, for instance, complexity, such that a family of arbitrarily large numbers can have sufficiently high probability assigned while the overall integral remains unity. What I think though (and this is the part I can accept people might disagree with) is that it’s a category error to use this distribution for the plausibility of a particular matrix-like unknown meta-universe. Complexity-based probability distributions are a very good tool to describe, for instance, the plausibility of somebody making up such a story, since they have limited time to tell it and are more likely to pick a number they can describe easily. But being able to write a computer program to generate a number and having the actual physical resources to simulate that number of people are two entirely different sorts of things. I see no reason to believe that a meta-universe with 3^^^3 resources is any more likely than a meta-universe with similarly large but impossible to describe resources.
So I’ll stick with my likelihood proportional to 1/n for meta-universe scales, and continue to get the answer to the mugging that everyone else seems to think is right anyway. I certainly like it a lot better than median utility. But I concede that I shouldn’t have been quite so discouraging of someone trying to come up with an alternative, since not everyone might be convinced.
Re St Petersburg, I will reiterate that there is no paradox in any finite setting. The game has a value. Whether you’d want to take a bet at close to the value of the game in a large but finite setting is a different question entirely.
Well, there are two separate points to the St Petersburg paradox. One is the existence of relatively simple distributions that have no mean: the expected value doesn’t converge on any finite value. Another example of such a distribution, which actually occurs in physics, is the Cauchy distribution.
Another, which the original Pascal’s Mugger post was intended to address, is Solomonoff induction, the idealized prediction algorithm used in AIXI. EY demonstrated that if you use it to predict an unbounded value like utility, it doesn’t converge or have a mean.
The second point is just that paying more than a few bucks to play the game is silly, even in a relatively small finite version of it. The probability of losing is very high, even though it has a positive expected utility. And this holds even if you adjust the payout tables to account for utility != dollars.
You can bite the bullet and say that if the utility is really so high, you really should take that bet. And that’s fine. But I’m not really comfortable betting away everything on such tiny probabilities. You are basically guaranteed to lose and end up worse than not betting.
not even pay a dollar for the mugger not to shoot their mother if they couldn’t see the gun.
You can do a tradeoff between median maximizing and expected utility with mean of quantiles. This basically gives you the best average outcome ignoring incredibly unlikely outcomes. Even median maximizing by itself, which seems terrible, will give you the best possible outcome >50% of the time. The median is fairly robust.
Whereas expected utility could give you a shitty outcome 99% of the time or 99.999% of the time, etc. As long as the outliers are large enough.
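A toy comparison of the two decision rules on the kinds of gamble being argued about in this thread. The lotteries and their numbers are made up for illustration; each lottery is a list of (utility, probability) pairs.

```python
def expected(lottery):
    return sum(u * p for u, p in lottery)

def median(lottery):
    """Smallest utility whose cumulative probability reaches 1/2."""
    total = 0.0
    ordered = sorted(lottery)
    for u, p in ordered:
        total += p
        if total >= 0.5:
            return u
    return ordered[-1][0]   # guard against floating-point shortfall

def choose(rule, options):
    """Name of the option that scores highest under the given rule."""
    return max(options, key=lambda name: rule(options[name]))

# Long shot: pay 1 for a 1-in-100,000 chance at a billion.
long_shot = {
    "pay": [(-1, 0.99999), (10 ** 9 - 1, 0.00001)],
    "decline": [(0, 1.0)],
}

# Insurance: pay 1 to remove a 10% chance of losing a million.
insurance = {
    "pay": [(-1, 1.0)],
    "decline": [(-10 ** 6, 0.1), (0, 0.9)],
}

for name, options in (("long shot", long_shot), ("insurance", insurance)):
    print(f"{name}: EU picks {choose(expected, options)!r}, "
          f"median picks {choose(median, options)!r}")
```

Expected utility pays in both cases, so on the long shot it ends up worse off 99.999% of the time. The median rule declines in both cases, so it also refuses cheap insurance against a 10% catastrophe, which is the failure mode raised earlier about not paying the mugger a dollar.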
Certainly there’s evidence that could convince me, even rather quickly, it’s just that I don’t expect to ever see such evidence.
If you are assigning 1/3^^^3 probability to something, then no amount of evidence will ever convince you.
I’m not saying that unbounded computing power is likely. I’m saying you shouldn’t assign infinitely small probability to it. The universe we live in runs on seemingly infinite computing power. We can’t even simulate the very smallest particles because of how large the number of computations grows.
Maybe someday someone will figure out how to use that computing power. Or even figure out that we could interact with the parent universe that runs us, etc. You shouldn’t use a model that assigns these things 0 probability.