It seems to me that there are basically two approaches to preventing a UFAI intelligence explosion: a) making sure that the first intelligence explosion is an FAI instead; b) making sure that an intelligence explosion never occurs. The first one involves solving (with no margin for error) the philosophical/ethical/logical/mathematical problem of defining FAI, and in addition the sociological/political problem of doing it “in time”, convincing everyone else, and ensuring that the first intelligence explosion occurs according to this resolution. The second one involves just the sociological/political problem of convincing everyone of the risks and banning/discouraging AI research “in time” to avoid an intelligence explosion.
Naively, it seems to me that the second approach is more viable—it seems comparable in scale to something between stopping the use of CFCs (fairly easy) and stopping global warming (very difficult, but it is premature to say impossible). At any rate, it sounds easier than solving (over a few years/decades) so many hard philosophical and mathematical problems, with no margin for error and under time pressure to do it ahead of UFAI developing.
However, it seems (from what I read on LW and found quickly browsing the MIRI website; I am not particularly well informed, hence writing this in the Stupid Questions thread) that most of MIRI’s efforts are on the first approach. Has there been a formal argument for why it is preferable, or are there efforts on the second approach I am unaware of? The only discussion I found was Carl Shulman’s “Arms Control and Intelligence Explosions” paper, but it is brief and nothing like a formal analysis comparing the benefits of each strategy. I am worried the situation might be biased by the LW/MIRI kind of people being more interested in (and finding more fun) the progress on the timeless philosophical problems necessary for (a) than the political coalition building and propaganda campaigns necessary for (b).
We discuss this proposal in Responses to Catastrophic AGI Risk, under the sections “Regulate research” and “Relinquish technology”. I recommend reading both of those sections if you’re interested, but here are a few relevant excerpts:
Large-scale surveillance efforts are ethically problematic and face major political resistance, and it seems unlikely that current political opinion would support the creation of a far-reaching surveillance network for the sake of AGI risk alone. The extent to which such extremes would be necessary depends on exactly how easy it would be to develop AGI in secret. Although several authors make the point that AGI is much easier to develop unnoticed than something like nuclear weapons (McGinnis 2010; Miller 2012), cutting edge high-tech research does tend to require major investments which might plausibly be detected even by less elaborate surveillance efforts. [...]

Even under such conditions, there is no clear way to define what counts as dangerous AGI. Goertzel and Pitt (2012) point out that there is no clear division between narrow AI and AGI, and attempts to establish such criteria have failed. They argue that since AGI has a nebulous definition, obvious wide-ranging economic benefits, and potentially rich penetration into multiple industry sectors, it is unlikely to be regulated due to speculative long-term risks.

AGI regulation requires global cooperation, as the noncooperation of even a single nation might lead to catastrophe. Historically, achieving global cooperation on tasks such as nuclear disarmament and climate change has been very difficult. As with nuclear weapons, AGI could give an immense economic and military advantage to the country that develops it first, in which case limiting AGI research might even give other countries an incentive to develop AGI faster (Cade 1966; de Garis 2005; McGinnis 2010; Miller 2012). [...]

To be effective, regulation also needs to enjoy support among those being regulated. If developers working in AGI-related fields only follow the letter of the law, while privately considering all regulations as annoying hindrances and fears about AGI overblown, the regulations may prove ineffective. Thus, it might not be enough to convince governments of the need for regulation; the much larger group of people working in the appropriate fields may also need to be convinced.

While Shulman (2009) argues that the unprecedentedly destabilizing effect of AGI could be a cause for world leaders to cooperate more than usual, the opposite argument can be made as well. Gubrud (1997) argues that increased automation could make countries more self-reliant, and international cooperation considerably more difficult. AGI technology is also much harder to detect than, for example, nuclear technology is—nuclear weapons require a substantial infrastructure to develop, while AGI needs much less (McGinnis 2010; Miller 2012). [...]

Goertzel and Pitt (2012) suggest that for regulation to be enacted, there might need to be an “AGI Sputnik”—a technological achievement that makes the possibility of AGI evident to the public and policy makers. They note that after such a moment, it might not take very long for full human-level AGI to be developed, while the negotiations required to enact new kinds of arms control treaties would take considerably longer. [...]

“Regulate research” proposals: Our view

Although there seem to be great difficulties involved with regulation, there also remains the fact that many technologies have been successfully subjected to international regulation. Even if one were skeptical about the chances of effective regulation, an AGI arms race seems to be one of the worst possible scenarios, one which should be avoided if at all possible. We are therefore generally supportive of regulation, though the most effective regulatory approach remains unclear. [...]

Not everyone believes that the risks involved in creating AGIs are acceptable. Relinquishment involves the abandonment of technological development that could lead to AGI. This is possibly the earliest proposed approach, with Butler (1863) writing that “war to the death should be instantly proclaimed” upon machines, for otherwise they would end up destroying humans entirely. In a much-discussed article, Joy (2000) suggests that it might be necessary to relinquish at least some aspects of AGI research, as well as nanotechnology and genetics research.

AGI relinquishment is criticized by Hughes (2001), with Kurzweil (2005) criticizing broad relinquishment while being supportive of the possibility of “fine-grained relinquishment,” banning some dangerous aspects of technologies while allowing general work on them to proceed. In general, most writers reject proposals for broad relinquishment. [...]

McKibben (2003), writing mainly in the context of genetic engineering, suggests that AGI research should be stopped. He brings up the historical examples of China renouncing seafaring in the 1400s and Japan relinquishing firearms in the 1600s, as well as the more recent decisions of abandoning DDT, CFCs, and genetically modified crops in Western countries. However, it should also be noted that Japan participated in World War II, that China now has a navy, that there are reasonable alternatives for DDT and CFCs, which probably do not exist for AGI, and that genetically modified crops are in wide use in the United States.

Hughes (2001) argues that attempts to outlaw a technology will only make the technology move to other countries. He also considers the historical relinquishment of biological weapons to be a bad example, for no country has relinquished peaceful biotechnological research such as the development of vaccines, nor would it be desirable to do so. With AGI, there would be no clear dividing line between safe and dangerous research. [...]

Relinquishment proposals suffer from many of the same problems as regulation proposals, only worse. There is no historical precedent of general, multi-use technology similar to AGI being successfully relinquished for good, nor do there seem to be any theoretical reasons for believing that relinquishment proposals would work in the future. Therefore we do not consider them to be a viable class of proposals.
I think it’s easier to get a tiny fraction of the planet to do a complex right thing than to get 99.9% of a planet to do a simpler right thing, especially if 99.9% compliance may not be enough and 99.999% compliance may be required instead.
This calls for a calculation. How hard would creating an FAI have to be for this inequality to be reversed?
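A minimal back-of-the-envelope sketch of that comparison, just to show the shape of the calculation; every number in it (the FAI success probability, the per-actor compliance rate, the actor count, the timescale) is an invented placeholder, not anyone's actual estimate:

```python
# Toy comparison of the two strategies. All numbers are invented placeholders.

p_fai_success = 0.10   # hypothetical: chance the "solve FAI first" path succeeds
p_comply = 0.999       # hypothetical: chance a given capable actor abstains in a given year
n_actors = 50          # hypothetical number of actors capable of serious AGI work
n_years = 30           # hypothetical time a ban would have to hold

# Strategy (a): one organization has to succeed once.
p_a = p_fai_success

# Strategy (b): every actor must abstain every year, so compliance compounds.
p_b = p_comply ** (n_actors * n_years)

print(f"(a) FAI first:     {p_a:.3f}")   # 0.100
print(f"(b) universal ban: {p_b:.3f}")   # 0.999**1500 is roughly 0.22

# With these placeholders the ban still comes out ahead, but raise n_actors to
# 500 (individual labs rather than states) and p_b falls to about 3e-7; the
# inequality flips depending almost entirely on how many actors must comply,
# and for how long.
```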
When I see proposals that involve convincing everyone on the planet to do something, I write them off as loony-eyed idealism and move on. So, creating FAI would have to be hard enough that I considered it too “impossible” to be attempted (with this fact putatively being known to me given already-achieved knowledge), and then I would swap to human intelligence enhancement or something because, obviously, you’re not going to persuade everyone on the planet to agree with you.
But is it really necessary to persuade everyone, or 99.9% of the planet? If gwern’s analysis is correct (I have no idea if it is), then it might suffice to convince the policymakers of a few countries like the USA and China.
I see. So you do have an upper bound in mind for the FAI problem difficulty, then, and it’s lower than other alternatives. It’s not simply “shut up and do the impossible”.
Given enough time for ideas to develop, any smart kid in a basement could build an AI, and every organization in the world has a massive incentive to do so. Only omnipresent surveillance could prevent everyone from writing a particular computer program.
Once you have enough power flying around to actually prevent AI, you are dealing with AI-level threats already (a not-necessarily friendly singleton).
So FAI is actually the easiest way to prevent UFAI.
The other reason is that a Friendly Singleton would be totally awesome. Like so totally awesome that it would be worth it to try for the awesomeness alone.
Your tone reminded me of super-religious folk who are convinced that, say, “Jesus is coming back soon!” and that it’ll be “totally awesome”.
That’s nice.
Your comment reminds me of those internet atheists who are so afraid of being religious that they refuse to imagine how much better the world could be.
I do imagine how much better the world could be. I actually do want MIRI to succeed. But currently I have low confidence in their future success, so I don’t feel “bliss” (if that’s the right word).
BTW I’m actually slightly agnostic because of the simulation argument.
Enthusiasm? Excitement? Hope?
Yep. I don’t take it too seriously, but it’s at least coherent to imagine beings outside the universe who could reach in and poke at us.
But, in the current situation (or even a few years from now) would it be possible for a smart kid in a basement to build an AI from scratch? Isn’t it something that still requires lots of progress to build on? See my reply to Qiaochu.
So will progress just stop for as long as we want it to?
The question is whether it would be possible to ban further research and stop progress (open, universally accessible, buildable-upon progress) while AGI is still far enough away that an isolated group in a basement would have no chance of achieving it on its own.
If by “basement” you mean “anywhere, working in the interests of any organization that wants to gain a technology advantage over the rest of the world,” then sure, I agree that this is a good question. So what do you think the answer is?
I have no idea! I am not a specialist of any kind in AI development. That is why I posted in the Stupid Questions thread asking “has MIRI considered this and made a careful analysis?” instead of making a top-level post saying “MIRI should be doing this”. It may seem that in the subthread I am actively arguing for strategy (b), but what I am doing is pushing back against what I see as insufficient answers on such an important question.
So… what do you think the answer is?
If you want my answers, you’ll need to humor me.
Uh, apparently my awesome is very different from your awesome. What scares me is this “Singleton” thing, not the friendly part.
Hmmm. What is it going to do that is bad, given that it has the power to do the right thing, and is Friendly?
We have inherited some anti-authoritarian propaganda memes from a cultural war that is no longer relevant, and those taint the evaluation of a Singleton, even though they really don’t apply. At least that’s how it felt to me when I thought through it.
Upvoted.
I’m not sure why more people around here are not concerned about the singleton thing. It almost feels like yearning for a god on some people’s part.
I had no idea that Herbert’s Butlerian Jihad might be a historical reference.
Wow, I’ve read Dune several times, but didn’t actually get that before you pointed it out.
It turns out that there’s a Wikipedia page.
There’s a third alternative, though it’s quite unattractive: damaging civilization to the point that AI is impossible.
Given that, out of billions of people, a few with extremely weird brains are likely to see evil-AI risk as nigh, one of them is bound to push the red button way before I or anyone else would reach for it. So I hope red technology-resetting buttons don’t become widely available. This suggests a principle: I have a duty to be conservative in my own destroy-the-world-to-save-it projects :)
And there are, in fact, several people proposing this as a solution to other anthropogenic existential risks. Here’s one example.
I would like to think that people whose FAQ opens the answer to each question in a new tab aren’t competent enough to do much of anything. This is probably wishful thinking.
Just to mention a science fiction treatment: John Barnes’ Daybreak books. I think the books are uneven, but the presentation of civilization-wrecking as an internet artifact is rather plausible.
Not only that: they’re talking about damaging civilization, and yet they have an online FAQ in the first place. They fail Slytherin forever.
My model of humans says that some people will read their page, become impressed, and join them. I don’t know how many, but I think the only thing that stops millions of people from joining them is that there are already thousands of crazy ideas out there competing with each other, so the crazy people remain divided.
Also, the website connects destroying civilization with many successful applause lights. (Actually, it seems to me like a coherent extrapolation of them; although that could be just my mindkilling speaking.) That should make it easier to get dedicated followers.
Destroying civilization is too big a goal for them, but they could do some serious local damage.
And my model of the government says this has negative expected value overall.
An example of this: the Earth Liberation Front and Animal Liberation Front got mostly dismantled by CIA infiltrators as soon as they started getting media attention.
Our standards for Slytherin may be too high.
I don’t get it. None of us here set the standards, unless certain donors have way more connections than I think they do.
You’re the one who mentioned failing Slytherin forever.
My actual belief is that people don’t have to be ideally meticulous to do a lot of damage.
Isn’t that already covered by option b?
My impression of Eliezer’s model of the intelligence explosion is that he believes b) is much harder than it looks. If you make developing strong AI illegal then the only people who end up developing it will be criminals, which is arguably worse, and it only takes one successful criminal organization developing strong AI to cause an unfriendly intelligence explosion. The general problem is that a) requires that one organization do one thing (namely, solving friendly AI) but b) requires that literally all organizations abstain from doing one thing (namely, building unfriendly AI).
CFCs and global warming don’t seem analogous to me. A better analogy to me is nuclear disarmament: it only takes one nuke to cause bad things to happen, and governments have a strong incentive to hold onto their nukes for military applications.
What would a law against developing strong AI look like?
I’ve suggested in the past that it would look something like a ban on chips more powerful than X teraflops/$.
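Not endorsing any particular numbers, but to make the shape of such a rule concrete, here is a minimal sketch of the kind of check that criterion implies; the cap value and the example chip figures below are purely hypothetical:

```python
# Hypothetical sketch of a price-performance cap on chips.
# The cap and the example figures are invented for illustration only.

TERAFLOPS_PER_DOLLAR_CAP = 0.01  # hypothetical legal limit ("X teraflops/$")

def exceeds_cap(teraflops: float, price_dollars: float) -> bool:
    """Return True if a chip's teraflops-per-dollar exceeds the hypothetical cap."""
    return teraflops / price_dollars > TERAFLOPS_PER_DOLLAR_CAP

# Purely illustrative: a 5 TFLOPS chip sold for $400 gives 0.0125 TFLOPS/$,
# which would exceed a 0.01 TFLOPS/$ cap.
print(exceeds_cap(5.0, 400.0))  # True
```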
How close are we to illicit chip manufacturing? On second thought, it might be easier to steal the chips.
Cutting-edge chip manufacturing of the necessary sort? I believe we are lightyears away and things like 3D printing are irrelevant, and that it’s a little like asking how close we are to people running Manhattan Projects in their garage*; see my essay for details.
* Literally. The estimated budget for an upcoming Taiwanese chip fab is equal to some inflation-adjusted estimates of the Manhattan Project.
My notion of nanotech may have some fantasy elements—I think of nanotech as ultimately being able to put every atom where you want it, so long as the desired location is compatible with the atoms that are already there.
I realize that chip fabs keep getting more expensive, but is there any reason to think this can’t reverse?
It’s hard to say what nanotech will ultimately pan out to be.
But in the absence of nanoassemblers, it’d be a very bad idea to bet against Moore’s second law.
Right, I see your point. But it depends on how close you think we are to AGI. Assuming we are still quite far away, then if you manage to ban AI research early enough, it seems unlikely that a rogue group would manage to make all the remaining progress by itself, cut off from the broader scientific and engineering community.
The difference is that AI is relatively easy to do in secret. CFCs and nukes are much harder to hide.
Also, only AGI research is dangerous (or, more exactly, self-improving AI), but the other kinds are very useful. Since it’s hard to tell how far away the danger is (and many don’t believe there’s a big danger), you’ll get a reaction similar to emission-control proposals (i.e., some will refuse to stop, and it’s hard to convince a democratic country’s population to start a war over that; not to mention that a war risks making the AI danger moot by killing us all).
I agree that all kinds of AI research that are even close to AGI would have to be banned or strictly regulated, and that convincing all nations to ensure this is a hugely complicated political problem. (I don’t think it is more difficult than controlling carbon emissions, because of status quo bias: it is easier to convince someone not to do something new that sounds good than to get them to stop doing something they view as good. But it is still hugely difficult, no question about that.) It just seems to me even more difficult (and risky) to aim to solve flawlessly all the problems of FAI.
Note that the problem is not convincing countries not to do AI, the problem is convincing countries to police their population to prevent them from doing AI.
It’s much harder to hide a factory or a nuclear laboratory than to hide a bunch of geeks in a basement filled with computers. Note how bio-weapons are really scary not (just) because countries might be (or already are) developing them, but because it’s soon becoming easy enough for someone to do it in their kitchen.
The approach of Leverage Research is more like (b).
There have been some efforts along those lines. It doesn’t look as easy as all that.