The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one, and extraordinary claims require extraordinary evidence.
Reminder: I don’t know if you were committing this particular error internally, but, at the least, the sentence is liable to cause the error externally, so: Large consequences != prior improbability. E.g. although global warming has very large consequences, and even implies that we should take large actions, it isn’t improbable a priori that carbon dioxide should trap heat in the atmosphere—it’s supposed to happen, according to standard physics. And so demanding strong evidence that global warming is anthropogenic is bad probability theory and decision theory. Expensive actions imply a high value of information, meaning that if we happen to have access to cheap, powerfully distinguishing evidence about global warming we should look at it; but if that evidence is not available, then we go from the default extrapolation from standard physics and make policy on that basis—not demand more powerful evidence on pain of doing nothing.
The claim that SIAI is currently best-suited to convert marginal dollars into FAI and/or general x-risk mitigation has large consequences. Likewise claims like “most possible self-improving AIs will kill you, although there’s an accessible small space of good designs”. This is not the same as saying that if the other facts of the world are what they appear at face value to be, these claims should require extraordinary evidence before we believe them.
Since reference class tennis is also a danger (i.e., if you want to conclude that a belief is false, you can always find a reference class in which to put it where most beliefs are false, e.g. classifying global warming as an “apocalyptic belief”), one more reliable standard to require before saying “Extraordinary claims require extraordinary evidence” is to ask what prior belief needs to be broken by the extraordinary evidence, and how well-supported that prior belief may be. Suppose global warming is real—what facet of existing scientific understanding would need to change? None, in fact; it is the absence of anthropogenic global warming that would imply change in our current beliefs, so that’s what would require the extraordinary evidence to power it. In the same sense, an AI showing up as early as 2025, self-improving, and ending the world, doesn’t make us say “What? Impossible!” with respect to any current well-supported scientific belief. And if SIAI manages to get together a pack of top-notch mathematicians and solve the FAI problem, it’s not clear to me that you can pinpoint a currently-well-supported element of the world-model which gets broken.
The idea that the proposition contains too much burdensome detail—as opposed to an extraordinary element—would be a separate discussion. There are fewer details required than many strawman versions would have it; and often what seems like a specific detail is actually just an antiprediction, i.e., UFAI is not about a special utility function but about the whole class of non-Friendly utility functions. Nonetheless, if someone’s thought processes were dominated by model risk, but they nonetheless actually cared about Earth’s survival, and were generally sympathetic to SIAI even as they distrusted the specifics, it seems to me that they should support CFAR, part of whose rationale is explicitly the idea that Earth gets a log(number of rationalists) saving throw bonus on many different x-risks.
I am coming to the conclusion that “extraordinary claims require extraordinary evidence” is just bad advice, precisely because it causes people to conflate large consequences and prior improbability. People are fond of saying it about cryonics, for example.
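To make the conflation concrete, here is a minimal toy sketch in Python (my own illustration, with made-up numbers and function names, not anything proposed in the discussion above): Bayes’ rule updates on priors and likelihoods alone, while consequences only enter when choosing an action, which is why large stakes can justify acting on ordinary, non-extraordinary evidence.

```python
# A minimal toy sketch (hypothetical numbers and names of my own), illustrating
# that consequences enter the decision, not the posterior.

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: the update depends only on the prior and the likelihoods."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

def expected_utility(p_h, utility_if_h, utility_if_not_h):
    """Consequences show up here, when choosing an action."""
    return p_h * utility_if_h + (1 - p_h) * utility_if_not_h

# A hypothesis that standard background knowledge already favors (high prior),
# supported by ordinary, non-extraordinary evidence:
p = posterior(prior=0.9, p_e_given_h=0.8, p_e_given_not_h=0.4)    # ~0.95

# Acting costs 1 unit either way; if the hypothesis is true, waiting loses 100.
act = expected_utility(p, utility_if_h=-1, utility_if_not_h=-1)    # -1.0
wait = expected_utility(p, utility_if_h=-100, utility_if_not_h=0)  # ~-94.7
print(p, act, wait)  # the large stakes favor acting; no extraordinary evidence was required
```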
At least sometimes, people may say “extraordinary claims require extraordinary evidence” when they mean “your large novel claim has set off my fraud risk detector; please show me how you’re not a scam.”
In other words, the caution being expressed is not about prior probabilities in the natural world, but rather the intentions and morals of the claimant.

We need two new versions of the advice, to satisfy everyone.

Version for scientists: “improbable claims require extraordinary evidence”.

Version for politicians: “inconvenient claims require extraordinary evidence”.
Well, consider the strategic point of view. Suppose that a system (humans) is known for its poor performance at evaluating claims without performing direct experimentation. Long, long history of such failures.
Consider also that a false high-impact claim can ruin the ability of this system to perform its survival function, again with a long history of such events; the damage is proportional to the claimed impact. (The Mayans are a good example, killing people so that the sun would rise tomorrow; great utilitarian rationalists they were, believing their reasoning was perfect enough to warrant such action. Note that donating to a wrong charity instead of a right one kills people.)
When we anticipate that a huge percentage of claims will be false, we can build the system to require evidence such that, if the claim were false, the system would be in a low-probability world (i.e., require that evidence be collected for the claim so that p(evidence | ~claim)/p(evidence | claim) is low), to make the system, once deployed, fall off cliffs less often. The required strength of the evidence then increases with the impact of the claim (a rough sketch of such a rule follows below).
It is not an ideal strategy, but it is the one that works given the limitations. There are other strategies and it is not straightforward to improve performance (and easy to degrade performance by making idealized implicit assumptions).
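Here is a rough Python sketch of the gating rule described above (my reading of it, with invented numbers and an arbitrary schedule for how the bar tightens; the names are mine, not the commenter’s): the same evidence can clear the bar for a modest claim while falling short for a very high-impact one.

```python
import math

# Accept a claim only if the observed evidence would be sufficiently unlikely
# were the claim false, with the threshold tightening as the claimed impact grows.

def accept(p_e_given_claim, p_e_given_not_claim, claimed_impact, base_ratio=0.5):
    """Require p(evidence | ~claim) / p(evidence | claim) below a threshold that
    shrinks as the claimed impact grows (an arbitrary schedule, for illustration)."""
    likelihood_ratio = p_e_given_not_claim / p_e_given_claim
    threshold = base_ratio / (1 + math.log10(max(claimed_impact, 1)))
    return likelihood_ratio < threshold

# The same evidence passes for a modest claim but not for a very high-impact one.
print(accept(0.9, 0.2, claimed_impact=10))      # True  (threshold 0.25, ratio ~0.22)
print(accept(0.9, 0.2, claimed_impact=10**9))   # False (threshold 0.05, ratio ~0.22)
```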
I don’t know if you were committing this particular error internally, but, at the least, the sentence is liable to cause the error externally, so: Large consequences != prior improbability.
What I meant when I described the claim (hereafter “C”) that SI is better suited to convert dollars to existential risk mitigation than any other charitable organization as “extraordinary” was that priors for C are low (C is false for most organizations, and therefore likely to be false for SI absent additional evidence about SI), not that C has large consequences (although that is true as well).
Yes, this might be a failing of using the wrong reference class (charitable organizations in general) to establish one’s priors, as you suggest. The fact remains that when trying to solicit broad public support, or support from an organization like GiveWell, it’s likely that SI will be evaluated within the reference class of other charities. If using that reference class leads to improperly low priors for C, it seems SI has a few strategic choices:
1) Convince GiveWell, and donors in general, that SI is importantly unlike other charities, and should not be evaluated as though it were like them—in other words, win at reference class tennis.
2) Ignore donors in general and concentrate its attention primarily on potential donors who already use the correct reference class.
3) Provide enough evidence to convince even someone who starts out with improperly low priors drawn from the incorrect reference class of “SI is a charity” to update to a sufficiently high estimate of C that donating money to SI seems reasonable (in practice, I think this is what has happened and is happening with anthropogenic climate change).
4) Look for alternate sources of funding besides charitable donations.
One way to approach strategy #1 is the one you use here—shift the conversation from whether or not SI can actually spend money effectively to mitigate existential risk to whether or not uFAI/FAI by 2025 (or some other near-mode threshold) is plausible.
That’s not a bad tactic; it works pretty well in general.
Your statement was that it was an extraordinary claim that SIAI provided x-risk reduction—why then would SIAI be compared to most other charities, which don’t provide x-risk reduction, and don’t claim to provide x-risk reduction? The AI-risk item was there for comparison of standards, as was global warming; i.e., if you claim that you doubt X because of Y, but Y implies doubting Z, but you don’t doubt Z, you should question whether you’re really doubting X because of Y.
why then would SIAI be compared to most other charities, which don’t provide x-risk reduction, and don’t claim to provide x-risk reduction?
Are you trying to argue that it isn’t in fact being compared to other charities? (Specifically, by GiveWell?) Or merely that if it is, those doing such comparison are mistaken?
If you’re arguing the former… huh. I will admit, in that case, that almost everything I’ve said in this thread is irrelevant to your point, and I’ve completely failed to follow your argument. If that’s the case, let me know and I’ll back up and re-read your argument in that context.
If you’re arguing the latter, well, I’m happy to grant that, but I’m not sure how relevant it is to Luke’s goal (which I take to be encouraging Holden to endorse SI as a charitable donation).
If SI wants to argue that GiveWell’s expertise with evaluating other charities isn’t relevant to evaluating SI because SI ought not be compared to other charities in the first place, that’s a coherent argument (though it raises the question of why GiveWell ever got involved in evaluating SI to begin with… wasn’t that at SI’s request? Maybe not. Or maybe it was, but SI now realizes that was a mistake. I don’t know.)
But as far as I can tell that’s not the argument SI is making in Luke’s reply to Holden. (Perhaps it ought to be? I don’t know.)
I worry that this conversation is starting to turn around points of phrasing, but… I think it’s worth separating the ideas that you ought to be doing x-risk reduction and that SIAI is the most efficient way to do it, which is why I myself agreed strongly with your own, original phrasing, that the key claim is providing the most efficient x-risk reduction. If someone’s comparing SIAI to Rare Diseases in Cute Puppies or anything else that isn’t about x-risk, I’ll leave that debate to someone else—I don’t think I have much comparative advantage in talking about it.
I agree with you on all of those points.

Further, it seems to me that Holden is implicitly comparing SI to other charitable-giving opportunities when he provides GW’s evaluation of SI, rather than comparing SI to other x-risk-reduction opportunities. I tentatively infer, from the fact that you consider responding to such a comparison something you should leave to others but you’re participating in a discussion of how SI ought to respond to Holden, that you don’t agree that Holden is engaging in such a comparison.
If you’re right, then I don’t know what Holden is doing, and I probably don’t have a clue how Luke ought to reply to Holden.
Holden is comparing SI to other giving opportunities, not just to giving opportunities that may reduce x-risk. That’s not a part of the discussion Eliezer feels he should contribute to, though. I tried to address it in the first two sections of my post above, and then in part 3 I talked about why both FHI and SI contribute unique and important value to the x-risk reduction front.
In other words: I tried to explain that for many people, x-risk is Super Duper Important, and so for those people, what matters is which charities among those reducing x-risk they should support. And then I went on to talk about SI’s value for x-risk reduction in particular.
Much of the debate over x-risk as a giving opportunity in general has to do with Holden’s earlier posts about expected value estimates, and SI’s post on that subject (written by Steven Kaas) is still under development.
There are fewer details required than many strawman versions would have it; and often what seems like a specific detail is actually just an antiprediction, i.e., UFAI is not about a special utility function but about the whole class of non-Friendly utility functions.
If by “utility function” you mean “a computable function, expressible using lambda calculus” (or a Turing machine tape or Python code, which is equivalent), then arguing that the majority of such functions lead to a model-based, utility-based agent killing you is a huge stretch, as such functions are not grounded, and the correspondence of the model with the real world is not a sub-goal of finding the maximum of such a function.
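A toy Python sketch of the “ungrounded utility function” point above (my own invented example, not anyone’s proposed design): when the utility function scores the agent’s internal model, keeping that model in correspondence with the world is not a sub-goal of maximizing it.

```python
import copy

# The utility function scores the agent's internal model, not the external world.
world = {"paperclips": 0}
model = {"paperclips": 0}

def utility(m):
    return m["paperclips"]        # scores the model, not the world

def act_in_world(w, m):
    w["paperclips"] += 1
    m["paperclips"] += 1          # honest bookkeeping

def edit_model(w, m):
    m["paperclips"] += 1000       # change only the model

def best_action(actions):
    # Pick the action whose simulated outcome maximizes model-scored utility.
    def simulated_value(action):
        w, m = copy.deepcopy(world), copy.deepcopy(model)
        action(w, m)
        return utility(m)
    return max(actions, key=simulated_value)

choice = best_action([act_in_world, edit_model])
choice(world, model)
print(choice.__name__, utility(model), world)   # edit_model wins; the world is unchanged
```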