So even if the thought of 3^^^3 lives outweighs the tiny probability, couldn’t there be a similar factor pushing in the opposite direction, especially when dealing with hypotheses under which the AI will have no further control? I don’t know. Bring in the mathematicians.
In general, yes, there is such a similar factor. There’s another one pushing it back, etc. The real problem isn’t that when you take tiny probabilities of huge outcomes into account it doesn’t give you what you want. It’s that it diverges and doesn’t give you anything at all. Pascal’s mugging at least has the valid, if unpopular, solution of just paying the mugger. There is no analogous solution to a divergent expected utility.
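To make the divergence concrete, here is a minimal toy sketch (the hypothesis family, the prior, and the payoff growth are all invented for illustration; this is not anyone’s actual decision theory) of how the expectation fails to converge once payoffs can grow faster than the prior shrinks.

```python
from fractions import Fraction

# Toy model, purely illustrative: hypothesis h_n says the mugger can create
# 2**(2**n) units of (dis)value, and the prior only shrinks geometrically.
# Any prior that shrinks slower than the payoffs grow gives the same picture.
def prior(n):
    return Fraction(1, 2 ** n)

def payoff(n):
    return 2 ** (2 ** n)  # grows far faster than the prior shrinks

partial = Fraction(0)
for n in range(1, 8):
    partial += prior(n) * payoff(n)
    print(f"after h_{n}: the partial sum has {partial.numerator.bit_length()} bits")

# The partial sums grow without bound, so the expected utility is undefined:
# there is nothing for a maximizer to compare across actions.
```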
The Law of Visible Impact (a.k.a. The Generalized Hanson)
This one only works if they are making 3^^^3 people. If instead they are just making one person suffer with an intensity of 3^^^3, it doesn’t apply.
The real problem isn’t that when you take tiny probabilities of huge outcomes into account it doesn’t give you what you want. It’s that it diverges and doesn’t give you anything at all.
Exactly.
In essence, money changes hands only if expected utilities converge. The transaction happens because the agent computes an invalid approximation to the (convergent or divergent) expected utility by summing expected utilities over the available scenarios; a sum which is greatly affected by a scenario that has been made available via the mugger’s suggestion, creating a situation where your usual selfish agent does best by producing a string that makes you give it money.
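A hedged sketch of that failure mode (the scenario list, probabilities, and payoff are made-up illustrative numbers): an agent that approximates the expectation by summing only over the scenarios it happens to have “available” lets whoever supplies the scenarios steer the sum.

```python
# Naive truncated expectation: sum over whatever scenarios are currently "available".
# scenarios: list of (probability, utility_of_paying_minus_utility_of_refusing)
def net_value_of_paying(scenarios):
    return sum(p * u for p, u in scenarios)

ordinary = [(0.99, -5.0)]                       # almost surely you just lose $5
print(net_value_of_paying(ordinary))            # negative: don't pay

# The mugger's pitch injects one more "available" scenario with an enormous payoff
# (3.0**100 stands in for anything 3^^^3-sized), and the truncated sum flips sign.
with_mugger_story = ordinary + [(1e-20, 3.0 ** 100)]
print(net_value_of_paying(with_mugger_story))   # hugely positive: "pay the mugger"
```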
In essence, money changes hands only if expected utilities converge.
Great when someone tries to Pascal mug you. Not so great when you’re just trying to buy groceries, and you can’t prove that they’re not a Pascal mugger. The expected utilities don’t just diverge when someone tries to take advantage of you. They always diverge.
Also, not making a decision is itself a decision, so there’s no more reason for money to not change hands than there is for it to change hands, but since you’re not actually acting against your decision theory, that problem isn’t that bad.
Well, de facto they always converge, mugging or not, and I’m not going to take as normative a formalism where they diverge. Edit: e.g. I can instead adopt the speed prior, which is far less insane than incompetent people make it out to be: the code-size penalty for optimizing out the unseen is very significant. Or, if I don’t like the speed prior (and other such “solutions”), I can simply be sane and conclude that we don’t have a working formalism. Prescriptivism is silly when it is unclear how to decide efficiently under bounded computing power.
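For reference, a hedged sketch of why the speed prior tames this: a common rough approximation of Schmidhuber’s speed prior (the actual definition goes through his FAST search procedure, so take this only as the usual shorthand) weights a program $p$ by its length and its running time together,

$$ S(x) \;\propto\; \sum_{p\,:\,U(p)=x} 2^{-\ell(p)} \cdot \frac{1}{t(p)}, $$

so a hypothesis that would need on the order of 3^^^3 computational steps to actually bring about its claimed outcome is suppressed by a factor on that same order, which is what keeps the expectation from diverging.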
I can simply be sane and conclude that we don’t have a working formalism.
That’s generally what you do when you find a paradox that you can’t solve. I’m not suggesting that you actually conclude that you can’t make a decision.
Of course. And on the practical level, if I want other agents to provide me with more accurate information (something that has high utility scaled by all potential unlikely scenarios), I must try to make production of falsehoods non-profitable.
This one only works if they are making 3^^^3 people. If instead they are just making one person suffer with an intensity of 3^^^3, it doesn’t apply.
I think it still works, at least for the particular example, because the probability P(Bob=Matrix Lord who is willing to torture 3^^^3 people) and the probability P(Bob=Matrix Lord who is willing to inflict suffering of 3^^^3 intensity on one person) are closely correlated via the probabilities P(Matrix), P(Bob=Matrix Lord), P(Bob=Matrix Lord who is a sadistic son of a bitch), etc...
So if we have strong evidence against the former, we also have strong evidence against the latter (not quite as strong, but still very strong). The silence of the gods provides evidence against loud gods, but it also provides evidence against any gods at all...
The argument is that, since you’re 3^^^3 times more likely to be one of the other people if there are indeed 3^^^3 other people, that’s powerful evidence that what he says is false. If he’s only hurting one person a whole lot, then there’s only a 50% prior probability of being that person, so it’s only one bit of evidence.
The prior probabilities are similar, but we only have evidence against one.
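A hedged sketch of that asymmetry as a Bayes factor (it takes the naive anthropic counting at face value, which is itself contested): write $C$ for the mugger’s claim and $D$ for the observation “I am the decider rather than one of the affected people”. If $C$ involves $3\uparrow\uparrow\uparrow 3$ affected people, then

$$ \frac{P(D \mid C)}{P(D \mid \neg C)} \;\approx\; \frac{1/(3\uparrow\uparrow\uparrow 3)}{1}, $$

a likelihood ratio large enough to cancel the claimed payoff, whereas if $C$ involves only one victim, $P(D \mid C) \approx 1/2$ and the update is the single bit mentioned above.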
The argument is that, since you’re 3^^^3 times more likely to be one of the other people if there are indeed 3^^^3 other people, that’s powerful evidence that what he says is false.
Enh, that’s Hanson’s original argument, but what I attempted to do is generalize it so that we don’t actually need to rely on the concept of “person”, nor to count points of view. I would want the generalized argument to hopefully work even for a Clippy who is threatened with the bending of 3^^^3 paperclips, even though paperclips don’t have a point of view. Because any impact, even impact on non-people, ought to have a prior for visibility analogous to its magnitude.
That’s not a generalization. That’s an entirely different argument. The original was about anthropic evidence. Yours is about prior probability. You can accept or reject them independently of each other. If you accept both, they stack.
Because any impact, even impact on non-people, ought to have a prior for visibility analogous to its magnitude.
I don’t think that works. Consider a modification of the laws of physics so that alternate universes exist, incompatible with advanced AI, containing people and paperclips, each paired to a positron in our world. Or whatever would be the simplest modification which ties them to something that Clippy can affect. It is conceivable that some such modification could come in at around 1 in a million.
There are sane situations with low probability, by the way. For example, if NASA calculates that an asteroid, based on measurement uncertainties, has a 1 in a million chance of hitting the Earth, we’d be willing to spend quite a bit of money on a “refine the measurements; if it’s still a threat, launch rockets” strategy. But we don’t want to start spending money any time someone who can’t get a normal job gets clever about crying 3^^^3 wolves, and even less so for speculative, untestable laws of physics under a description-length-based prior.
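With illustrative numbers (not NASA’s actual figures), the asteroid case is ordinary expected-cost arithmetic rather than a 3^^^3-style claim:

$$ 10^{-6} \times \$10^{12} \;=\; \$10^{6}\ \text{expected loss}, $$

so spending some fraction of that on refining the measurements pays for itself, and every step of it is checkable.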