Our current methods might turn out to be biased in new and unexpected ways. Pascal’s mugging, the Lifespan Dilemma, blackmail and the wrath of Löb’s theorem are just a few examples of how an agent built according to our current understanding of rationality could fail.
I don’t really get it. For example, building a machine that is sceptical of Pascal’s wager doesn’t seem harder than building a machine that is sceptical of other verbal offers unsupported by evidence. I don’t see what’s wrong with the idea that “extraordinary claims require extraordinary evidence”.
For example, building a machine that is sceptical of Pascal’s wager doesn’t seem harder than building a machine that is sceptical of other verbal offers unsupported by evidence.
The verbal offer isn’t actually relevant to the problem, it’s just there to dramatize the situation.
I don’t see what’s wrong with the idea that “extraordinary claims require extraordinary evidence”.
Please formulate that maxim precisely enough to program into an AI in a way that solves the problem. Because the best way we currently have of formulating it, i.e., Bayesianism with quasi-Solomonoff priors, doesn’t solve it.
The idea of devoting more resources to investigating claims when they involve potential costs involves decision theory rather than mere prediction. However, vanilla reinforcement learning should handle this OK. Agents that don’t investigate extraordinary claims will be exploited and suffer—and a conventional reinforcement learning agent can be expected to pick up on this just fine. Of course I can’t supply source code—or else we would be done—but that’s the general idea.
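As a toy illustration of that idea (the three-action setup, payoffs, and base rate below are all invented for this sketch, not anyone’s proposal):

```python
import random

# A bandit-style learner facing repeated costly claims:
# "pay 5 utilons or lose 100". It must learn whether paying,
# refusing, or investigating first is the better policy.
P_TRUE = 0.05            # assumed base rate of honest threats
COST_PAY = -5.0          # cost of paying up
COST_THREAT = -100.0     # cost of the threatened harm, if real
COST_INVESTIGATE = -1.0  # cost of checking the claim first

ACTIONS = ("pay", "refuse", "investigate")

def reward(action, claim_true):
    if action == "pay":
        return COST_PAY
    if action == "refuse":
        return COST_THREAT if claim_true else 0.0
    # investigate: pay the checking cost, then pay off only real threats
    return COST_INVESTIGATE + (COST_PAY if claim_true else 0.0)

random.seed(0)
q = {a: 0.0 for a in ACTIONS}  # action-value estimates
n = {a: 0 for a in ACTIONS}
for _ in range(200_000):
    claim_true = random.random() < P_TRUE
    # epsilon-greedy: mostly exploit, sometimes explore
    a = random.choice(ACTIONS) if random.random() < 0.1 else max(q, key=q.get)
    r = reward(a, claim_true)
    n[a] += 1
    q[a] += (r - q[a]) / n[a]  # incremental sample-average update

print(q)  # investigate (~ -1.25) beats pay (-5) and refuse (~ -5)
```

Push P_TRUE toward zero and blanket refusal wins instead; Pascal’s mugging asks what happens when the probability is made tiny but the threatened loss is made astronomically large.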
The idea of devoting more resources to investigating claims when they involve potential costs involves decision theory rather than mere prediction.
All claims involve decision theory in the sense that you’re presumably going to act on them at some point.
However, vanilla reinforcement learning should handle this OK. Agents that don’t investigate extraordinary claims will be exploited and suffer—and a conventional reinforcement learning agent can be expected to pick up on this just fine.
Would these agents also learn to pick up pennies in front of steam rollers? In fact, falling for Pascal’s mugging is just the extreme case of refusing to pick up pennies in front of a steam roller; the question is where you draw the line dividing the two.
However, vanilla reinforcement learning should handle this OK. Agents that don’t investigate extraordinary claims will be exploited and suffer—and a conventional reinforcement learning agent can be expected to pick up on this just fine.
Would these agents also learn to pick up pennies in front of steam rollers?
That depends on its utility function.
In fact, falling for Pascal’s mugging is just the extreme case of refusing to pick up pennies in front of a steam roller; the question is where you draw the line dividing the two.
The line (if any) is drawn as a consequence of specifying a utility function.
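For concreteness, this is the calculation that a utility function settles; every number here is made up for illustration:

```python
# Expected utility of grabbing a penny in front of a steam roller.
penny = 0.01            # utility of the penny
p_crush = 1e-6          # assumed chance the steam roller gets you
u_crush = -10_000_000   # utility of being crushed, in the same units

eu_grab = penny + p_crush * u_crush  # 0.01 - 10.0 = -9.99
eu_pass = 0.0
print("grab" if eu_grab > eu_pass else "pass")  # pass
```

The line sits exactly where p_crush * |u_crush| crosses the penny’s value (here, at p_crush = 1e-9): change the utility function and the line moves.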
Verbal offers are evidence. Sometimes even compelling evidence. For example, I don’t currently believe my friend Sam is roasting a turkey—in fact, my prior probability of that is < 1%. But if Sam calls me up and says “Wanna come over for dinner? I’m roasting a turkey” my posterior probability becomes > 90%.
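A quick Bayes check of those numbers; the two likelihoods are invented for the example:

```python
prior = 0.01              # P(turkey) before the call: < 1%
p_call_if_turkey = 0.5    # assumed chance Sam says this, given a turkey
p_call_if_no = 0.0005     # assumed chance Sam says this, given no turkey

posterior = p_call_if_turkey * prior / (
    p_call_if_turkey * prior + p_call_if_no * (1 - prior))
print(round(posterior, 2))  # 0.91: one sentence carries ~1000:1 evidence
```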
Designing a system well-calibrated enough that its probability estimates cause it to make optimal choices across a narrow band of likelihoods is a simpler problem than designing a system that works across a much wider band.
Verbal offers are evidence. Sometimes even compelling evidence.
True, but dangerous. Nobody really knows anything about general intelligence. Yet a combination of arguments that sound convincing when formulated in English, and the reputation of a few people and their utterances, is considered evidence in favor of risks from AI. No doubt all those arguments constitute evidence. But people update on that evidence, repeat those arguments, and add to the overall chorus of people who take risks from AI seriously, which in turn causes other people to update towards the possibility. In the end, much of the conviction rests on little evidence when put into perspective against the actual actions that such a conviction demands.
I don’t want to argue against risks from AI here. As I have written many times, I support SI. But I believe that it takes more hard evidence to accept some of the implications and to follow through on drastic actions beyond basic research.

What drastic actions do you see other people following through on that you consider unjustified?

I am mainly worried about future actions. The perception of imminent risks from AI could give an enormous incentive to commit incredibly stupid acts.

Consider the following comment by Eliezer:

I’ll readily concede that my exact species extinction numbers were made up. But does it really matter? Two hundred million years from now, the children’s children’s children of humanity in their galaxy-civilizations are unlikely to look back and say, “You know, in retrospect, it really would have been worth not colonizing the Hercules supercluster if only we could have saved 80% of species instead of 20%”. I don’t think they’ll spend much time fretting about it at all, really. It is really incredibly hard to make the consequentialist utilitarian case here, as opposed to the warm-fuzzies case.
I believe that this argument is unwise and that the line of reasoning is outright dangerous, because it justifies too much in the minds of certain people. Making decisions on the basis of the expected utility associated with colonizing the Hercules supercluster is a prime example of what I am skeptical of.
Mostly, the actions I see people taking (and exhorting others to take) on LW are “do research” and “fund others doing research,” to the negligible extent that any AI-related action is taken here at all. And you seem to support those actions.
But, sure… I guess I can see how taking a far goal seriously might in principle lead to future actions other than research, and how those actions might be negative, and I can sort of see responding to that by campaigning against taking the goal seriously rather than by campaigning against specific negative actions.

Thanks for clarifying.
I don’t see what’s wrong with the idea that “extraordinary claims require extraordinary evidence”.
Me neither, but quite a few people on Less Wrong don’t seem to share that opinion, or are in possession of vast amounts of evidence that I lack. For example, some people seem to take seriously scenarios such as “interference from an alternative Everett branch in which a singularity went badly” or “unfriendly AI that might achieve complete control over our branch by means of acausal trade”. Fascinating topics for sure, but in my opinion far too detached from reality to be taken at all seriously. Those ideas are merely logical implications of theories that we deem reasonable. Another theory that is by itself reasonable is then used to argue that logical implications do not have to pay rent in future anticipations. And in the end, due to a combination of reasonable theories, one ends up with completely absurd ideas. I don’t see how this could happen if one followed the rule that “extraordinary claims require extraordinary evidence”.
I don’t understand in what way the linked comment says anything about interference from alternative Everett branches. Did you mean to link to something else?
I’m not sure what the majority view is on Less Wrong, but none of the people I have met in real life advocate making decisions based on (very) small probabilities of (very) large utility fluctuations. I think AI has probability at least 1% of destroying most human value under the status quo. I think 1% is a large enough number that it’s reasonable to care a lot, although it’s also small enough that it’s reasonable not to care. However, I also think that the probability is at least 20%, and that is large enough that I think it is unreasonable not to care (assuming that preservation of humanity is one of your principal terminal values, which it may or may not be).
Does this mean that I’m going to drop out of college to work at SingInst? No, because that closes a lot of doors. Does it mean that I’m seriously reconsidering my career path? Yes, and I am reasonably likely to act on those considerations.
I think AI has probability at least 1% of destroying most human value under the status quo. I think 1% is a large enough number that it’s reasonable to care a lot, although it’s also small enough that it’s reasonable not to care. However, I also think that the probability is at least 20%
Without machine intelligence, every single human alive today dies.
One wonders how that value carnage would be quantified—using the same scale.
However, I also think that the probability is at least 20%, and that is large enough that I think it is unreasonable not to care (assuming that preservation of humanity is one of your principal terminal values, which it may or may not be).
I agree.
I’m not sure what the majority view is on Less Wrong, but none of the people I have met in real life advocate making decisions based on (very) small probabilities of (very) large utility fluctuations.
No, I think some people here use the 20%+ estimate on risks from AI and act according to some implications of its logical implications. See here, which is the post that the comment I linked to was discussing. I have chosen that post because it resembled ideas put forth in another post on Less Wrong that was banned because of the perceived risks and because people got nightmares due to it.
I don’t see what’s wrong with the idea that “extraordinary claims require extraordinary evidence”.
Me neither, but quite a few people on Less Wrong don’t seem to share that opinion, or are in possession of vast amounts of evidence that I lack. For example, some people seem to take seriously scenarios such as “interference from an alternative Everett branch in which a singularity went badly” or “unfriendly AI that might achieve complete control over our branch by means of acausal trade”. Fascinating topics for sure, but in my opinion far too detached from reality to be taken at all seriously.
I think you only get significant interference from “adjacent” worlds—but sure, this sounds a little strange, the way you put it.
If we go back to the Pascal’s wager post though—Eliezer Yudkowsky just seems to be saying that he doesn’t know how to build a resource-limited version of Solomonoff induction that doesn’t make the mistake he mentions. That’s fair enough—nobody knows how to build high-quality approximations of Solomonoff induction—or we would be done by now. The point is that this isn’t a problem with Solomonoff induction, or with the idea of approximating it. It’s just a limitation in Eliezer Yudkowsky’s current knowledge (and probably everyone else’s). I fully expect that we will solve the problem, though. Quite possibly, to do so we will have to approximate Solomonoff induction in the context of some kind of reward system or utility function—so that we know which mis-predictions are costly (e.g., by resulting in getting mugged), which will guide us to the best points to apply our limited resources.
If we go back to the Pascal’s wager post though—Eliezer Yudkowsky just seems to be saying that he doesn’t know how to build a resource-limited version of Solomonoff induction that doesn’t make the mistake he mentions.
It has nothing to do with resource limitations; the problem is that Solomonoff induction itself can’t handle Pascal’s mugging. If anything, the resource-limited version of Solomonoff induction is less likely to fall for Pascal’s mugging, since it might round the small probability down to 0.
It has nothing to do with resource limitations; the problem is that Solomonoff induction itself can’t handle Pascal’s mugging.
In what way? You think that Solomonoff induction would predict enormous torture with a non-negligible probability if it observed the mugger not being paid? Why do you think that? That conclusion seems extremely unlikely to me—assuming that the Solomonoff inductor had had a reasonable amount of previous exposure to the world. It would, like any sensible agent, assume that the mugger was lying.
That’s why the original Pascal’s mugging post directed its criticism at “some bounded analogue of Solomonoff induction”.
In what way? You think that Solomonoff induction would predict enormous torture with a non-negligible probability if it observed the mugger not being paid?
Because Solomonoff induction bases its priors on minimum message length, and it’s possible to encode enormous numbers like 3^^^3 in a message of length much less than 3^^^3.
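A sketch of why: the few lines below are a complete description of 3^^^3, a number that could never be written out in full (up_arrow is just the standard Knuth up-arrow definition):

```python
def up_arrow(a, n, b):
    """Knuth's up-arrow: a followed by n arrows, then b."""
    if n == 1:
        return a ** b
    result = a
    for _ in range(b - 1):
        result = up_arrow(a, n - 1, result)
    return result

print(up_arrow(3, 2, 3))  # 3^^3 = 3^(3^3) = 7625597484987, already vast
# up_arrow(3, 3, 3) is 3^^^3: hopeless to evaluate, yet this whole
# description is a few hundred bits, so a minimum-message-length prior
# penalizes the mugger's hypothesis by only a few hundred bits.
```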
Why do you think that?
Because I understand mathematics. ;)
That’s why the original Pascal’s mugging post directed its criticism at “some bounded analogue of Solomonoff induction”.
What Eliezer was referring to is the fact that an unbounded agent would attempt to incorporate all possible versions of Pascal’s wager and Pascal’s mugging simultaneously and promptly end up with an ∞ − ∞ error.
You think that Solomonoff induction would predict enormous torture with a non-negligible probability if it observed the mugger not being paid?
Because Solomonoff induction bases its priors on minimum message length, and it’s possible to encode enormous numbers like 3^^^3 in a message of length much less than 3^^^3.
Sure—but the claim that there are large numbers of people waiting to be tortured also decreases in probability with the number of people involved.
I figure that Solomonoff induction would give a (correct) tiny probability for this hypothesis being correct.
Your problem is actually not with Solomonoff induction—despite what you say—I figure. Rather you are complaining about some decision theory application of Solomonoff induction—involving the concept of “utility”.
Because Solomonoff induction bases its priors on minimum message length, and it’s possible to encode enormous numbers like 3^^^3 in a message of length much less than 3^^^3.
Sure—but the claim that there are large numbers of people waiting to be tortured also decreases in probability with the number of people involved.
What does this have to do with my point?
I figure that Solomonoff induction would give a (correct) tiny probability for this hypothesis being correct.
It does, just not tiny enough to override the 3^^^3 utility difference.
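In rough numbers (the 2^-1000 prior is an arbitrary stand-in for the hypothesis’s complexity penalty), working in log space because the tower overflows any float:

```python
import math

LOG10_3 = math.log10(3)

def log10_tower(n):
    """log10 of 3^^n, a tower of n threes (valid while it fits a float)."""
    if n == 1:
        return LOG10_3
    return 10 ** log10_tower(n - 1) * LOG10_3  # log10(3^x) = x * log10(3)

log10_prior = -1000 * math.log10(2)  # 2^-1000 is about 10^-301
for n in (2, 3, 4):
    print(n, log10_tower(n) + log10_prior)  # log10(probability * utility)
# n=2: ~ -300, n=3: ~ -288, n=4: ~ +3.6e12. Each extra level of the
# tower costs a symbol or two of description length but multiplies the
# stakes past any fixed complexity penalty.
```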
Your problem is actually not with Solomonoff induction—despite what you say—I figure. Rather you are complaining about some decision theory application of Solomonoff induction—involving the concept of “utility”.
I don’t have a problem with anything, I’m just trying to correct misconceptions about Pascal’s mugging.
I’m just trying to correct misconceptions about Pascal’s mugging.
Well, your claim was that “Solomonoff induction itself can’t handle Pascal’s mugging”—which appears to be unsubstantiated nonsense. Solomonoff induction will give the correct answer based on Occamian priors and its past experience—which is the best that anyone could reasonably expect from it.
Hold on. What does “extraordinary claim” mean? I see two possible meanings: (1) a claim that triggers the “absurdity heuristic”, or (2) a claim that is incompatible with many things that are already believed. The examples you gave trigger the absurdity heuristic, because they introduce large, weird structures into an area of concept space that does not normally receive updates. However, I don’t see any actual incompatibilities between them and my pre-existing beliefs.
It becomes extraordinary at the point where the expected utility of the associated logical implications demands taking actions that might lead to inappropriately high risks. Here “inappropriately” is measured relative to the original evidence that led you to infer those implications. If the evidence is insufficient, then discount some of the associated utility, where “insufficient” is measured intuitively. In conclusion: act according to your best formal theories, but don’t factor out your intuition.
It becomes extraordinary at the point where the expected utility of the associated logical implications demands taking actions that might lead to inappropriately high risks.
So if I’m driving, and someone says “look out for that deer in the road!”, that’s an extraordinary claim because swerving is a large risk? Or did you push the question over into the word “inappropriately”?

Claims are only extraordinary with respect to theories.