I agree with most of the points here. Short-term, cost-effective charities are often more worthy of donations than speculative, long-term, high-uncertainty ones. I would rather donate to GiveWell’s top charities than fund US/China exchange programs.
Yet I still donate to MIRI. Why? Because it’s a completely different beast.
I don’t view MIRI as addressing far-future concerns. I view MIRI as addressing one very specific problem: we are on the path to AI, and we seem to be a lot closer to developing AI than we are to developing a perfect, reflectively preservable logical encoding of all human values.
There’s a timer. That timer is getting uncomfortably low. And when it hits zero, the result isn’t merely a lot of death and a bad economy; it’s an extinction event.
If we had good reason to believe that the US and China will cross a threshold this century at which, based solely on each population’s sentiment toward the other, they either blow up the world or collaborate and travel to the stars, then you’re damn right I’d fund exchange programs.
We don’t have any evidence along those lines. There is a plethora of potential political catastrophes, and uncountably many factors that could cause them. A China/US nuclear war would be very bad, but it’s one small possibility in a sea of many. It’s a far-future concern, and it’s very hard to predict what will help and what won’t.
MIRI isn’t trying to vaguely prod the unknown future into a state that’s maybe better; it’s racing against a timer that’s known to be ticking. This argument hinges on the beliefs that we’ll get strong AI in the next 30-150 years, that human value is complex and fragile, and that decision theory is nowhere near ready; all three have been argued to my satisfaction.
Strong AI this century is moderately probable, and if we don’t have the right decision theory by the time it arrives, then humanity loses. End of story, game over, no savepoints.
MIRI isn’t hoping to nudge us towards a better future. It isn’t trying to tweak factors that just might push us towards a better future. Rather, it is addressing a single point of failure with an expiration date. There’s a good chance that the fate of humanity will hinge on the existence of this one bit of mathematics.
Yours is the kind of response I have to posts like the OP, and also to Holden’s “Empowerment and Catastrophic Risk,” though I wouldn’t place so much specific emphasis on, e.g., decision theory.
It is important to analyze general features of the world and build up many outside views, but one must also exercise the Be Specific skill. If several outside views tell me to eat at Osha Thai, but then I snack on some Synsepalum dulcificum and I know it makes sour food taste sweet, then I should update heavily against the result of my original model combination, even if the Osha Thai recommendation was a robust result of 20+ models. Similarly, even if you have a very strong outside view that your lottery ticket is not the winner, the simple observation that the number on your ticket matches the announced winner on live TV should let you update all the way to believing that you won.
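To put a number on the lottery point, here is a minimal Bayesian sketch in Python. The figures (a 1-in-10^8 prior of holding the winner, a 10^-12 chance of a spurious match from a misread or broadcast error) are made-up illustrative assumptions, not anything from the original discussion:

```python
# Bayes update for the lottery example; all numbers are illustrative.
prior = 1e-8              # outside view: ~1 in 100 million tickets wins
p_match_if_won = 1.0      # if you won, the broadcast shows your number
p_match_if_lost = 1e-12   # chance you'd see a match anyway (misread,
                          # broadcast error); a made-up small number

# Posterior probability of having won, given the observed match
posterior = (p_match_if_won * prior) / (
    p_match_if_won * prior + p_match_if_lost * (1 - prior)
)
print(f"{posterior:.6f}")  # ~0.999900
```

Even a one-in-a-hundred-million prior is swamped, because the observation’s likelihood ratio is larger still; that is the sense in which a single specific observation can override a robust outside view.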
To consider a case somewhat more analogous to AI risk: there were lots of outside views in 1938 suggesting that one shouldn’t invest billions in an unprecedented technology, grounded in then-theoretical physics, that would increase our bombing power by several orders of magnitude. Definitely an “unproven cause.” And yet there were strong reasons to think it would be possible, and that it could be a determining factor in WWII, even though much of the initial research would end up being on the wrong path, and so on.
Also see Eliezer’s comment here, and its supporting post here.