The shield AI bothers me. One would need to be very careful to specify how a shield AI would function so that it would a) not try to halt human research and development generally in its attempt to prevent the development of AIs, and b) allow humans to turn it off once we think we've got friendly AIs even as it refuses to let FOOMing AIs turn it off. Specifying these constraints might be quite difficult. Edit: Sorry, b is only an issue for a slightly different form of shield AI, since you seem to want a shield AI which is actually not given a specific method of being turned off. I'm not sure that's a good idea (especially if the shield AI goes drastically wrong).
Regarding encryption issues: even if encryption turns out to be provably secure (say, a proof that factoring is not in P in some very strong sense), that doesn't mean our implementations are secure. It isn't infrequent for an implementation of an encryption scheme to turn out to have a serious problem, not because of anything in the nature of the scheme itself, but because of errors and oversights in the human programming. This should increase the probability estimate that an AI could take over highly secured systems.
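A toy example of the kind of implementation gap I have in mind (the scenario and key are made up; the flaw is the well-known timing side channel that Python's hmac.compare_digest exists to avoid):

```python
import hashlib
import hmac

SECRET_KEY = b"hypothetical-server-key"  # illustrative only

def naive_verify(message: bytes, tag: bytes) -> bool:
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    # Problem: an ordinary equality check typically stops at the first
    # mismatching byte, so response time can leak how much of a forged
    # tag is correct -- an attacker can chip away at it byte by byte even
    # though HMAC-SHA256 itself is doing exactly what it should.
    return expected == tag

def safer_verify(message: bytes, tag: bytes) -> bool:
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    # Constant-time comparison closes that particular side channel.
    return hmac.compare_digest(expected, tag)
```

Nothing about the underlying primitive has to be wrong for the first version to be exploitable; the weakness is entirely in how a human wired it up.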
The one thing that would make me deeply worried about FOOMing AIs is a non-constructive proof that P=NP. If that happened, most forms of encryption would become potentially highly insecure, and most of the potential ways an Option 3 “software fail” could occur would become much less likely. (I think I may have mentioned this to you earlier.)
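For what it's worth, the reason a P=NP result bears on encryption at all is that key recovery for a scheme like RSA is an NP search problem: a candidate secret (here, a factor) can be checked in polynomial time even though finding one appears to require exponential search. A minimal sketch of that asymmetry (toy code, not an attack):

```python
def has_factor_below(n: int, k: int) -> bool:
    # Decision version of factoring: does n have a nontrivial factor < k?
    # Brute-force search like this is exponential in the bit-length of n.
    return any(n % p == 0 for p in range(2, k))

def check_certificate(n: int, k: int, p: int) -> bool:
    # But verifying a claimed factor p is cheap (polynomial in the bit-length
    # of n), which is what puts the problem in NP. If P = NP, the search
    # itself would admit a polynomial-time algorithm, at least in principle.
    return 1 < p < k and n % p == 0
```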
I’m curious which existential risks other people think are likely under Option 1. I’ll hold off on listing my own so as not to anchor the thinking in any particular direction (although, given earlier comments I’ve made, I suspect Mass Driver can guess the general focus most of mine would take).
One would need to be very careful to specify how a shield AI would function.
Of course! I don’t pretend that it’s easy; it’s just that it may require, say, 6 person-years of work instead of 400 person-years of work, and thus be a project that could actually be completed before unFriendly AIs start launching.
you seem to want a shield AI which is actually not given a specific method of being turned off.
I mean, you could have an off-switch that expires after the first 2 years or so, maybe based on the decay of a radioactive element in a black box, with the quantity in the box chosen before the shield AI is turned on and known only to a very small number of researchers (perhaps just one), who does her calculations on pencil and paper and then shreds and eats them. That way you could get a sense of the AI’s actual goals (since it wouldn’t know when it was safe to ‘cheat’) during whatever little time is left before unfriendly AI launches capable of taking over the off-switch become a serious threat, and, if necessary, abort.
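A minimal sketch of the timing math behind that kind of black-box expiry (half-life, threshold, and quantities are all made up; the only real point is that the expiry date depends on a number the AI never observes):

```python
import math
import random

# All numbers here are invented; the expiry date is just a simple function
# of a quantity the shield AI never gets to see.
HALF_LIFE_DAYS = 100.0      # half-life of the chosen element (assumed)
THRESHOLD_ATOMS = 1.0e20    # sample size below which the off-switch stops working

def expiry_day(initial_atoms: float) -> float:
    # Exponential decay: N(t) = N0 * (1/2)**(t / half_life), so the switch
    # stops working at t = half_life * log2(N0 / threshold).
    return HALF_LIFE_DAYS * math.log2(initial_atoms / THRESHOLD_ATOMS)

# The lone researcher picks a quantity somewhere in a range that gives a
# window of very roughly 1.5 to 2.5 years, then destroys the calculation.
initial_atoms = random.uniform(4.0e21, 4.5e22)
print(f"off-switch usable for ~{expiry_day(initial_atoms):.0f} days")
```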