In this scenario, wouldn’t you eventually build a sufficiently powerful goal-directed AI that leads to an existential catastrophe?
Perhaps the hope is that once everyone sees that the first goal-directed AI is visibly dangerous, they will come to believe that goal-directed AI is dangerous. But in the scenario where we are building alternatives to goal-directed AI and those alternatives are actually getting used, I would predict that we have already convinced most AI researchers that goal-directed AI is dangerous.
(Also, I think you can level this argument at nearly all AI safety research agendas, with possibly the exception of Agent Foundations.)
I think I didn’t articulate my argument clearly; I tried to clarify it in my reply to Jessica.
I think my argument might be especially relevant to the effort of persuading AI researchers not to build goal-directed systems.
If a result of this effort is convincing more AI researchers of the general premise that x-risk from AI is something worth worrying about, then that’s a very strong argument in favor of carrying out the effort (and I agree this result should correlate with convincing AI researchers not to build goal-directed systems—if that’s what you argued in your comment).
Yeah, I was imagining that we would convince AI researchers that goal-directed systems are dangerous, and that we should build the non-goal-directed versions instead.