Has the following reply to Pascal’s Mugging been discussed on LessWrong?
Almost any ordinary good thing you could do has some positive expected downstream effects.
These positive expected downstream effects include lots of things like, “Humanity has slightly higher probability of doing awesome thing X in the far future.” Possible values of X include: create 3^^^^3 great lives or create infinite value through some presently unknown method, and stuff like, in a scenario where the future would have been really awesome, it’s one part in 10^30 better.
Given all the possible values of X whose probability is raised by doing ordinary good things, the expected value of doing any ordinary good thing is higher than the expected value of paying the mugger.
Therefore, almost any ordinary good thing you could do is better than paying the mugger. [I take it this is the conclusion we want.]
The most obvious complaint I can think of for this response is that it doesn’t solve selfish versions of Pascal’s Mugging very well, and may need to be combined with other tools in that case. But I don’t remember people talking about this and I don’t currently see what’s wrong with this as a response to the altruistic version of Pascal’s Mugging. (I don’t mean to suggest I would be very surprised if someone quickly and convincingly shoots this down.)
The obvious problem with this is that your expected utility is not defined if you are willing to accept muggings, so you can’t use the framework of expected utility maximization at all. The point of the mugger is just to illustrate this; I don’t think anyone thinks you should actually pay them (after all, you might encounter a more generous mugger tomorrow, or any number of more realistic opportunities to do astronomical amounts of good).
Part of the issue is that I am coming at this problem from a different perspective than maybe you or Eliezer is. I believe that paying the mugger is basically worthless in the sense that doing almost any old good thing is better than paying the mugger. I would like to have a satisfying explanation of this. In contrast, Eliezer is interested in reconciling a view about complexity priors with a view about utility functions, and the mugger is an illustration of the conflict.
I do not have a proposed reconciliation of complexity priors and unbounded utility functions. Instead, the above comment is recommended as an explanation of why paying the mugger is basically worthless in comparison with ordinary things you could do. So this hypothesis would say that if you set up your priors and your utility function in a reasonable way, the expected utility of downstream effects of ordinary good actions would greatly exceed the expected utility of paying the mugger.
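One way to pin the hypothesis down, as a sketch in notation that is assumed here rather than taken from the thread: let a be an ordinary good action, m be paying the mugger, X range over far-future outcomes with utility U(X), and \Delta P_a(X) be how much doing a raises the probability of X relative to doing nothing.

```latex
\[
\Delta\mathbb{E}[U \mid a] = \sum_{X} \Delta P_a(X)\, U(X),
\qquad
\Delta\mathbb{E}[U \mid m] = \sum_{X} \Delta P_m(X)\, U(X)
\]
\[
\text{Hypothesis:}\quad \Delta\mathbb{E}[U \mid a] \;\gg\; \Delta\mathbb{E}[U \mid m]
\]
```

The comparison is only meaningful if both sums converge, which is exactly what the objection about undefined expected utility disputes when U is unbounded.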
Even if you decided that the expected utility framework somehow breaks down in cases like this, I think various related claims would still be plausible. E.g., rather than saying that doing ordinary good things has higher expected utility, it would be plausible that doing ordinary good things is “better relative to your uncertainty” than paying the mugger.
On a different note, another thing I find unsatisfying about the downstream effects reply is that it doesn’t seem to match up with why ordinary people think it is dumb to pay the mugger. The ultimate reason I think it is dumb to pay the mugger is strongly related to why ordinary people think it is dumb to pay the mugger, and I would like to be able to thoroughly understand the most plausible common-sense explanation of why paying the mugger is dumb. The proposed relationship between ordinary actions and their distant effects seems too far off from why common sense would say that paying the mugger is dumb. I guess this is ultimately pretty close to one of Nick Bostrom’s complaints about empirical stabilizing assumptions.
I think we are all in agreement with this (modulo the fact that all of the expected values end up being infinite and so we can’t compare in the normal way; if you e.g. proposed a cap of 3^^^^^^^3 on utilities, then you certainly wouldn’t pay the mugger).
It seems very likely to me that ordinary people are best modeled as having bounded utility functions, which would explain the puzzle.
So it seems like there are two issues:
1. You would never pay the mugger in any case, because other actions are better.
2. If you object to the fact that the only thing you care about is a very small probability of an incredibly good outcome, then that’s basically the definition of having a bounded utility function.
And then there is the third issue Eliezer is dealing with, where he wants to be able to have an unbounded utility function even if that doesn’t describe anyone’s preferences (since it seems like boundedness is an unfortunate restriction to randomly impose on your preferences for technical reasons), and formally it’s not clear how to do that. At the end of the post he seems to suggest giving up on that though.
Obviously, to really put the idea of people having bounded utility functions to the test, you have to forget about it solving problems of small probabilities and incredibly good outcomes and focus on its most unintuitive consequences. For one, having a bounded utility function means caring arbitrarily little about differences between the goodness of different sufficiently good outcomes. And all the outcomes could be certain too. You could come up with all kinds of thought experiments involving purchasing huge numbers of years of happy life or some other good for a few cents. You know all of this, so I wonder why you don’t talk about it.
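To see how quickly "caring arbitrarily little" kicks in, here is a minimal Python sketch. The utility form years / (years + half_point), the half_point value, and the function names are all assumptions made up for illustration; nothing in the thread commits to a particular bounded function.

```python
def bounded_utility(years: float, half_point: float = 100.0) -> float:
    """Utility of `years` happy years of life, always strictly below 1.0.

    Hypothetical form chosen for illustration: years / (years + half_point).
    """
    return years / (years + half_point)


def utility_gain(extra_years: float, already_have: float) -> float:
    """Utility gained by adding `extra_years` on top of `already_have`."""
    return bounded_utility(already_have + extra_years) - bounded_utility(already_have)


if __name__ == "__main__":
    # The same million extra years is worth less and less
    # the more years are already guaranteed.
    for have in (0.0, 1e3, 1e6, 1e9):
        gain = utility_gain(1e6, have)
        print(f"already have {have:>12.0f} years: a million more is worth {gain:.3e}")
```

Under this assumed function, an agent already guaranteed a billion years values a further million at roughly one part in ten billion of the whole scale, so it could refuse the few-cents purchase even with every outcome certain.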
Also I believe that Eliezer thinks that an unbounded utility function describes at least his preferences. I remember he made a comment about caring about new happy years of life no matter how many he’d already been granted.
(I haven’t read most of the discussion in this thread or might just be missing something so this might be irrelevant.)
As far as I know the strongest version of this argument is Benja’s, here (which incidentally seems to deserve many more upvotes than it got).
Benja’s scenario isn’t a problem for normal people though, who are not reflectively consistent and whose preferences manifestly change over time.
Beyond that, it seems like people’s preferences regarding the lifespan dilemma are somewhat confusing and probably inconsistent, much like their preferences regarding the repugnant conclusion. But that seems mostly orthogonal to Pascal’s Mugging and to the basic point, which is that having unbounded utility by definition means you are willing to accept negligible chances of sufficiently good outcomes against probability nearly 1 of any fixed bad outcome, so if you object to the latter you are just objecting to unbounded utility.
I agree I was being uncharitable towards Eliezer. But it is true that at the end of this post he was suggesting giving up on unbounded utility, and that everyone in this crowd seems to ultimately take that route.
Sorry, I didn’t mean to suggest otherwise. The “different perspective” part was supposed to be about the “in contrast” part.
I agree with yli that this has other unfortunate consequences. And, like Holden, I find it unfortunate to have to say that saving N lives with probability 1/N is worse than saving 1 life with probability 1. I also recognize that the things I would like to say about this collection of cases are inconsistent with each other. It’s a puzzle. I have written about this puzzle at reasonable length in my dissertation. I tend to think that bounded utility functions are the best consistent solution I know of, but that continuing to operate with inconsistent preferences (in a tasteful way) may be better in practice.
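The 1/N comparison falls out of boundedness in a couple of lines. This is a sketch under assumed notation: u(n) is the utility of saving n lives, with u(0) = 0, u(1) > 0, and u(n) \le B for all n.

```latex
\[
\mathbb{E}\bigl[u \mid \text{save } N \text{ with probability } 1/N\bigr]
  = \frac{1}{N}\,u(N) + \Bigl(1 - \frac{1}{N}\Bigr)\,u(0)
  = \frac{u(N)}{N}
  \;\le\; \frac{B}{N}
\]
\[
\frac{B}{N} < u(1) \quad\text{whenever}\quad N > \frac{B}{u(1)}
\]
```

So for any bound B there is some N past which the gamble is strictly worse than the sure thing, however the utility function behaves below the bound.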
It’s in Nick Bostrom’s Infinite Ethics paper, which has been discussed repeatedly here, and has been floating around in various versions since 2003. He uses the term “empirical stabilizing assumption.”
I bring this up routinely in such discussions because of the misleading intuitions you elicit by using an example like a mugging that sets off many “no-go heuristics” that track chances of payoffs, large or small. But just because ordinary things may have a higher chance of producing huge payoffs than paying off a Pascal’s Mugger (who doesn’t do demonstrations), doesn’t mean your activities will be completely unchanged by taking huge payoffs into account.
Maybe the answer to this reply is that if there is a downstream multiplier for ordinary good accomplished, there is also a downstream multiplier for good accomplished by the mugger in the scenario where he is telling the truth. And multiplying each by a constant doesn’t change the bottom line.
Why on earth would you expect the downstream utilities to exactly cancel the mugging utility?
The hypothesis is not that they exactly cancel the mugging utility, but that the downstream utilities exceed the mugging utility. I was actually thinking that these downstream effects would be much greater than paying the mugger.
That’s probably true in many cases, but the “mugger” scenario is really designed to test our limits. If 3^^^3 doesn’t work, then probably 3^^^^3 will. To be logically coherent, there has to be some crossover point, where the mugger provides exactly enough evidence to decide that yes, it’s worth paying the $5, despite our astoundingly low priors.
The proposed priors have one of two problems:
1. You can get mugged too easily: the mugger only has to be sophisticated enough to pick a number high enough to overwhelm your prior.
2. You have a prior that is highly resistant to mugging but, unfortunately, also resistant to being convinced by evidence. If there is any positive probability that we really could encounter a Matrix Lord who is able to do what they claim and who would offer some kind of Pascal’s-Mugging-like deal, there should be some amount of evidence that would convince us to take the deal. We would like the necessary amount of evidence to be within the bounds of what our brains can receive and update on in a lifetime, but that is not necessarily the case with the priors that we know can avoid specious muggings.
I’m not actually certain that a prior has to exist which doesn’t have one of these two problems.
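The two failure modes can be made concrete with a rough Python sketch that works in log space, since the numbers involved are far too large to store directly. Everything named here is a hypothetical stand-in: `complexity_prior` penalizes only the length of the claim, `leverage_prior` penalizes in proportion to the claimed payoff, and `LIFETIME_EVIDENCE_BITS` is an arbitrary assumed ceiling, not a measured quantity.

```python
import math

# Assumed ceiling on how many bits of evidence a brain can take in over a lifetime.
LIFETIME_EVIDENCE_BITS = 1e12


def complexity_prior(description_bits: float) -> float:
    """log10 of a prior that shrinks with the length of the claim, not its size."""
    return -description_bits * math.log10(2)


def leverage_prior(log10_payoff: float) -> float:
    """log10 of a prior that shrinks in proportion to the claimed payoff."""
    return -log10_payoff


def log10_expected_payoff(log10_payoff: float, log10_prior: float) -> float:
    """log10 of (prior probability * payoff), computed in log space."""
    return log10_payoff + log10_prior


def bits_to_reach_even_odds(log10_prior: float) -> float:
    """Approximate bits of evidence needed to lift a tiny prior to about 50%."""
    return -log10_prior / math.log10(2)


if __name__ == "__main__":
    claim_bits = 200.0    # the mugger's threat fits in a couple of sentences
    log10_payoff = 1e15   # claimed payoff of 10**(10**15) utility units

    for name, prior in (
        ("complexity", complexity_prior(claim_bits)),
        ("leverage", leverage_prior(log10_payoff)),
    ):
        ev = log10_expected_payoff(log10_payoff, prior)
        bits = bits_to_reach_even_odds(prior)
        print(f"{name:>10} prior: log10(expected payoff) = {ev:.3g}, "
              f"bits needed = {bits:.3g}, "
              f"reachable in a lifetime: {bits <= LIFETIME_EVIDENCE_BITS}")
```

With these assumed numbers, the length-based prior leaves an astronomically large expected payoff (problem 1), while the payoff-proportional prior caps the expected payoff but demands more bits of evidence than the assumed lifetime ceiling allows (problem 2).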
I also agree with Eliezer’s general principle that when we see convincing evidence of things that we previously considered effectively impossible (prior of 10^-googol or such), then we need to update the whole map on which that prior was based, not just the specific point. When you watch a person turn into a small cat, either your own sense data or pretty much your whole map of how things work must come into question. You can’t just say “Oh, people can turn into cats” and move on as if that doesn’t affect almost everything you previously thought you knew about how the world worked.
It’s much more likely, based on what I know right now, that I am having an unusually convincing dream or hallucination than that people can turn into cats. And if I manage to collect enough evidence to actually make my probability of “people can turn into cats” higher than “my sensory data is not reliable”, then the whole framework of physics, chemistry, biology, and basic experience which caused me to assign such a low probability to “people can turn into cats” in the first place has to be reconsidered.
The probability that humans will eventually be capable of creating x utility, given that the mugger is capable of creating x utility, probably converges to some constant as x goes to infinity. (Of course, this still isn’t a solution, as expected utility still doesn’t converge.)
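Spelled out, with symbols that are assumed here: write M_x for "the mugger can create x utility" and H_x for "humans will eventually be able to create x utility", and suppose P(H_x \mid M_x) tends to some constant c > 0.

```latex
\[
P(H_x) \;\ge\; P(H_x \mid M_x)\,P(M_x) \;\approx\; c\,P(M_x)
\quad\Longrightarrow\quad
x\,P(H_x) \;\gtrsim\; c \cdot x\,P(M_x)
\]
```

If x P(M_x) grows without bound as x increases, then so does x P(H_x), so the ordinary-action side diverges too instead of winning a well-defined comparison.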
That assumes that the number is independent of the prior. I wouldn’t make that assumption.