there are large objects computed by short programs with short input or even no input, so your overall argument is still incorrect.
I have to say, this caused me a fair bit of thought.
Firstly, I just want to confirm that you agree a universe as we know it has complexity of the order of its size. I agree that an equivalently “large” universe with low complexity could be imagined, but its laws would have to be quite different to ours. Such a universe, while large, would be locked in symmetry to preserve its low complexity.
Just an aside on randomness, you might consider a relatively small program generating even this universe, by simply simulating the laws of physics, which include a lot of random events quite possibly including even the Big Bang itself. However I would argue that the definition of complexity does not allow for random calculations. To make such calculations, a pseudo random input is required, the length of which is added to the complexity. AIXI would certainly not be able to function otherwise.
The mugger requires more than just a sufficiently large universe. They require a universe which can simulate 3^^^^3 people. A low complexity universe might be able to be large by some measures, but because it is locked in a low complexity symmetry, it cannot be used to simulate 3^^^^3 unique people. For example, the memory required (remember I mean the memory within the mugger’s universe itself, not the memory used by the hypothetical program used to evaluate that universe’s complexity) would need to be of the order of 3^^^^3. However, while the universe may have 3^^^^3 particles, if those particles are locked in a low-complexity symmetry then they cannot possibly hold 3^^^^3 bits of data.
In short, a machine of complexity of 3^^^^3 is fundamentally required to simulate 3^^^^3 different people. My error was to argue about the complexity of the mugger’s universe, when what matters is the complexity of the mugger’s computing resources.
I already explained why this is incorrect, and you responded by defending your separate point about action guidance while appearing to believe that you had made a rebuttal.
No, all of your arguments relate to random sensory inputs, which are alternative theories ‘C’ not the ‘A’ or ‘B’ that I referred to. To formalise:
I claim there exist theories A and B, along with evidence E, such that:
p(B) > 3^^^^3 * p(A)
p(A|E) > p(B|E)
complexity(E) << 3^^^^3 (or, more to the point, E is within our sensory bandwidth).
You have only demonstrated that there exists theory C (random input) such that C != B for any B satisfying the above, which I also tentatively agree with.
So the reason I switch to a separate point is because I don’t consider my original statement disproven, but I accept that theories like C may limit the relevance of it. Thus I argue about the relevance of it, with this business about whether it affects your action or not. To be clear, I do agree (and I have said this) that C-like theories can influence action (as you argue). I am trying to argue, though, that in many cases they do not. It’s hard to resolve since we don’t actually have a specific case we’re considering here; this whole issue is off on a tangent from the mugging itself.
I admit that the text of mine you quoted implies I meant it for any two theories A and B, which would be wrong. What I really meant was that there exist such (pairs of) theories. The cases where it can be true need to be very limited anyway because most theories do not admit evidence E as described, since it requires this extremely inefficiently encoded input.
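To make the existence claim about A, B and E concrete, here is a small sketch using Bayes' rule in odds form: p(A|E) / p(B|E) = [p(A) / p(B)] * [p(E|A) / p(E|B)]. For A to overtake B after starting more than a factor of 3^^^^3 behind, the likelihood ratio p(E|A) / p(E|B) has to exceed 3^^^^3. Everything in the code is an assumption made for illustration: `K` is a small stand-in for 3^^^^3 (the real number cannot be represented), and the priors and likelihoods are invented values chosen only to satisfy the conditions.

```python
from fractions import Fraction


def posterior_odds(prior_a, prior_b, likelihood_a, likelihood_b):
    """Return p(A|E) / p(B|E) using Bayes' rule in odds form."""
    return (prior_a * likelihood_a) / (prior_b * likelihood_b)


# Stand-in for 3^^^^3; purely illustrative, the real value is unrepresentable.
K = 10**30

# Invented numbers that satisfy p(B) > K * p(A).
prior_b = Fraction(1, 2)
prior_a = Fraction(1, 4 * K)

# E is certain under A but astronomically unlikely under B.
likelihood_a = Fraction(1)
likelihood_b = Fraction(1, 8 * K)

odds = posterior_odds(prior_a, prior_b, likelihood_a, likelihood_b)
```

With these made-up values the posterior odds come out in favour of A even though B started more than a factor of K ahead. Whether any realistic E can carry a likelihood ratio that large is exactly the point under dispute.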
If you’re saying that the extent to which an individual cares about the desires of an unbounded number of agents is unbounded, then you are contradicting yourself. If you aren’t saying that, then I don’t see why you wouldn’t accept boundedness of your utility function as a solution to Pascal’s mugging.
I’m not saying the first thing. I do accept bounded utility as a solution to the mugging for me (or any other agent) as an individual, as I said in the original post. If I was mugged I would not pay for this reason.
However, I am motivated (by a bounded amount) to make moral decisions correctly, especially when they don’t otherwise impact me directly. Thus if you modify the mugging to be an entirely moral question (i.e. someone else is paying), I am motivated to answer it correctly. To answer it correctly, I need to consider moral calculations, which I still believe to be unbounded. So for me there is still a problem to be solved here.
Firstly, I just want to confirm that you agree a universe as we know it has complexity of the order of its size. I agree that an equivalently “large” universe with low complexity could be imagined, but its laws would have to be quite different to ours. Such a universe, while large, would be locked in symmetry to preserve its low complexity.
No. Low complexity is not the same thing as symmetry. For example, you can write a short program to compute the first 3^^^^3 digits of pi. But it is widely believed that the first 3^^^^3 digits of pi have almost no symmetry.
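For a sense of how short such a program can be, here is a rough Python sketch of Gibbons' unbounded spigot algorithm, which emits the decimal digits of pi one at a time. It is only an illustration: the names `pi_digits` and `first_pi_digits` are made up for this example, and actually producing 3^^^^3 digits is far beyond any physical running time or memory. The point is just that the program text stays the same size however many digits are requested.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) indefinitely.

    Uses Gibbons' unbounded spigot algorithm with exact integer arithmetic.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is now certain: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: absorb one more term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


def first_pi_digits(count):
    """Return the first `count` decimal digits of pi as a list of ints."""
    return list(islice(pi_digits(), count))
```

Asking for more digits changes only the argument to `first_pi_digits`, which takes roughly log(count) bits to write down, so the description length of the output grows far more slowly than the output itself.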
I would argue that the definition of complexity does not allow for random calculations. To make such calculations, a pseudo random input is required, the length of which is added to the complexity.
Mostly correct. However, given a low-complexity program that uses a large random input, you can make a low-complexity program that simulates it by iterating through all possible inputs, and running the program on all of them. It is only when you try to run it on one particular high-complexity input without also running it on the others that it requires high complexity. Thus the lack of ability for a low-complexity program to use randomness does not prevent it from producing objects in its output that look like they were generated using randomness.
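A toy version of the "run it on every input" construction might look like the following. This is a sketch under stated assumptions: `toy_program` is a hypothetical stand-in for the short program that consumes random bits, and the input length is kept tiny so the enumeration actually finishes.

```python
from itertools import product


def toy_program(bits):
    """Hypothetical short program: maps its 'random' input bits to an output string."""
    return "".join("ab"[b] for b in bits)


def run_on_every_input(n_bits):
    """Run toy_program on all 2**n_bits inputs; no particular input is singled out."""
    return [toy_program(bits) for bits in product((0, 1), repeat=n_bits)]


outputs = run_on_every_input(4)
```

Every string the toy program could have produced from a "random" 4-bit input, such as `"abba"`, appears somewhere in `outputs`, yet the enumerating program never had to contain any of those inputs in its own text.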
No, all of your arguments relate to random sensory inputs, which are alternative theories ‘C’ not the ‘A’ or ‘B’ that I referred to. To formalise: I claim there exists theories A and B along with evidence E, such that: p(B) > 3^^^^3p(A); p(A|E) > p(B|E); complexity(E) << 3^^^^3 (or more to the point it’s within our sensory bandwidth).
Oh, I see. This claim is correct. However, it does not seem that important to me, since p(A|E) will still be negligible.
To be clear, I do agree (and I have said this) that C-like theories can influence action (as you argue). I am trying to argue though that in many cases they do not.
It would be quite surprising if none of the “C-like” theories could influence action, given that there are so many of them (the only requirement to be “C-like” is that it is impossible in practice to convince you that C is less likely than A, which is not a strong condition, since the prior for A is < 1/3^^^^3).
However, I am motivated (by a bounded amount) to make moral decisions correctly, especially when they don’t otherwise impact me directly. Thus if you modify the mugging to be an entirely moral question (i.e. someone else is paying), I am motivated to answer it correctly. To answer it correctly, I need to consider moral calculations, which I still believe to be unbounded. So for me there is still a problem to be solved here.
Ah, I think you’re actually right that utility function boundedness is not a solution here (I actually still think that the utility function should be bounded, but that this is not relevant under certain conditions that you may be pointing at). Here’s my attempt at an analysis:
Assume for simplicity that there exist 3^^^^3 people (this seems okay because the ability of the mugger to affect them is much more implausible than their existence). The probability that there exists any agent which can affect on the order of 3^^^^3 people, and uses this ability to do bizarre Pascal’s mugging-like threats, is small (let’s say 10^-20). The probability that a random person pretends to be Pascal’s mugger is also small, but not as small (let’s say 10^-6). Thus if people pay Pascal’s mugger each time, this results in an expected 3^^^^3/(10^6) people losing $5 each, and if people do not pay, the expected number of people affected is 3^^^^3/(10^20), and there probably isn’t anything you can do to someone that matters 10^14 times as much as the marginal value of $5 (10^14 is actually pretty big). Thus it is a better policy to not pay. This did not directly use any negligible probabilities (nothing like 1/3^^^^3, I mean). However, this is arguably suspect because it implicitly assigns a probability of less than 1/3^^^^3 to the hypothesis that there exists a unique Pascal’s mugger who actually does have the powers he claims and that I am the person he approaches with the dilemma. I’ll have to think about this more.
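The arithmetic behind that comparison can be sketched in a few lines. The population size 3^^^^3 multiplies both sides and cancels, so only the two guessed probabilities matter. The names below are made up for the example, and harm is measured in units of "the marginal value of $5", which is an assumption about how to put the two outcomes on one scale.

```python
from fractions import Fraction

# The guessed probabilities from the analysis above.
P_FAKE_MUGGER = Fraction(1, 10**6)   # a random person pretends to be the mugger
P_REAL_MUGGER = Fraction(1, 10**20)  # an agent with the claimed powers exists


def break_even_harm():
    """Harm per victim (in units of one lost $5) at which both policies tie."""
    return P_FAKE_MUGGER / P_REAL_MUGGER


def refusing_is_better(harm_per_victim):
    """True if 'never pay' has lower expected loss per person than 'always pay'."""
    return P_REAL_MUGGER * harm_per_victim < P_FAKE_MUGGER
```

Here `break_even_harm()` comes out to 10^14, matching the figure in the text: refusing is the better policy unless the threatened harm per person is worth more than 10^14 times the marginal value of $5.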
(meta) Well, I’m quite relieved because I think we’re actually converging rather than diverging finally.
No. Low complexity is not the same thing as symmetry.
Yes sorry symmetry was just how I pictured it in my head, but it’s not the right word. My point was that the particles aren’t acting independently, they’re constrained.
Mostly correct. However, given a low-complexity program that uses a large random input, you can make a low-complexity program that simulates it by iterating through all possible inputs, and running the program on all of them.
By the same token you can write a low complexity program to iteratively generate every number. That doesn’t mean all numbers have low complexity. It needs to be the unique output of the program. If you tried to generate every combination then pick one out as the unique output, the picking-one-out step would require high complexity.
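The cost of the picking-one-out step can be illustrated with a toy enumeration. This is a sketch with made-up helper names; the claim it illustrates is about arbitrary (typical) targets, since a few special indices do have short descriptions.

```python
from itertools import product


def enumerate_strings(n_bits):
    """All 2**n_bits bit strings, in lexicographic order."""
    return ["".join(map(str, bits)) for bits in product((0, 1), repeat=n_bits)]


def index_of(target, n_bits):
    """Position of one particular string within the full enumeration."""
    return enumerate_strings(n_bits).index(target)


def bits_to_name_index(n_bits):
    """Bits needed to write down an arbitrary index into the 2**n_bits outputs."""
    return (2**n_bits - 1).bit_length()
```

The enumerator is short, but `bits_to_name_index(n_bits)` equals `n_bits`: naming an arbitrary element of the list costs as many bits as writing the element out directly, so nothing is saved by enumerating first.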
I think as a result of this whole discussion I can simplify my entire “finite resources” section to this one statement, which I might even edit in to the original post (though at this stage I don’t think many more people are ever likely to read it):
“It is not possible to simulate n humans without resources of complexity at least n.”
Everything else can be seen as simply serving to illustrate the difference between a complexity of n, and a complexity of complexity(n).
It would be quite surprising if none of the “C-like” theories could influence action, given that there are so many of them
It’s easy to give a theory a posterior probability of less than 1/3^^^^3, by giving it zero. Any theory that’s actually inconsistent with the evidence is simply disproven. What’s left are theories which accept the observed event, i.e. those which have priors < 1/3^^^^3 (e.g. that the number chosen was 7 in my example), and theories which somehow reject either the observation itself or the logic tying the whole thing together.
It’s my view that theories which reject either observation or logic don’t motivate action because they give you nothing to go on. There are many of them, but that’s part of the problem since they include “the world is like X and you’ve failed to observe it correctly” for every X, making it difficult to break the symmetry.
I’m not completely convinced there can’t be alternative theories which don’t fall into the two categories above (either disproven or unhelpful), but they’re specific to the examples so it’s hard to argue about them in general terms. In some ways it doesn’t matter if you’re right: even if there were always compelling arguments not to act on a belief which had a prior of less than 1/3^^^^3, Pascal’s Muggle could give those arguments instead of looking foolish by refusing to shift his beliefs in the face of strong evidence. All I was originally trying to say was that it isn’t wrong to assign priors that low to something in the first place. Unless you disagree with that, we’re ultimately arguing over nothing here.
Here’s my attempt at an analysis
This solution seems to work as stated, but I think the dilemma itself can dodge this solution by constructing itself in a way that forces the population of people-to-be-tortured to be separate from the population of people-to-be-mugged. In that case there aren’t on the order of 3^^^^3 people paying the $5.
(meta again) I have to admit it’s ironic that this whole original post stemmed from an argument with someone else (in a post about a median utility based decision theory), which was triggered by me claiming Pascal’s Mugging wasn’t a problem that needed solving (at least certainly not by said median utility based decision theory). By the end of that I became convinced that the problem wasn’t considered solved and my ideas on it would be considered valuable. I’ve then spent most of my time here arguing with someone who doesn’t consider it unsolved! Maybe I could have saved myself a lot of karma by just introducing the two of you instead.
“It is not possible to simulate n humans without resources of complexity at least n.”
Still disagree. As I pointed out, it is possible for a short program to generate outputs with a very large number of complex components.
It’s my view that theories which reject either observation or logic don’t motivate action because they give you nothing to go on. There are many of them, but that’s part of the problem since they include “the world is like X and you’ve failed to observe it correctly” for every X, making it difficult to break the symmetry.
Given only partial failure of observation or logic (where most of your observations and deductions are still correct), you still have something to go on, so you shouldn’t have symmetry there. For everything to cancel so that your 1/3^^^^3-probability hypothesis dominates your decision-making, it would require a remarkably precise symmetry in everything else.
Maybe I could have saved myself a lot of karma by just introducing the two of you instead.
I have also argued against the median utility maximization proposal already, actually.