Here he claims that the default outcome of AI is very likely safe, but that attempts at Friendly AI are very likely deadly if they do anything at all (although I would argue this neglects the correlation between which AI approaches are workable, both in general and for would-be FAI efforts, and which are dangerous; it also assumes some silly behaviors and that competitive pressures aren't severe):
I believe that unleashing an all-powerful “agent AGI” (without the benefit of experimentation) would very likely result in a UFAI-like outcome, no matter how carefully the “agent AGI” was designed to be “Friendly.” I see SI as encouraging (and aiming to take) this approach.
I believe that the standard approach to developing software results in “tools,” not “agents,” and that tools (while dangerous) are much safer than agents. A “tool mode” could facilitate experiment-informed progress toward a safe “agent,” rather than needing to get “Friendliness” theory right without any experimentation.
Therefore, I believe that the approach SI advocates and aims to prepare for is far more dangerous than the standard approach, so if SI’s work on Friendliness theory affects the risk of human extinction one way or the other, it will increase the risk of human extinction. Fortunately I believe SI’s work is far more likely to have no effect one way or the other.
Which mission? The FAI mission? The GiveWell mission? I am confused :(
I don’t suppose he said this somewhere linkable?