It occurs to me that there is a roadblock to an AI going foom: it first has to solve the same “keep my goals constant while rewriting myself” problem that MIRI is trying to solve.
Otherwise the situation would be analogous to Gandhi being offered a pill that has a chance of making him into anti-Gandhi, and declining it.
If the superhuman (but not yet foomed!) AI is not yet orders of magnitude smarter than a hoo-mon, it may be a while before it is willing to power up / go foom, since it would not want to jeopardize its utility function along the way.
Just because it can foom does not imply it’ll want to foom (because of the above).
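To make the “can foom but won’t want to foom” point concrete, here is a minimal Python sketch (entirely my own toy construction, not anyone’s actual proposal; the `Agent` class and its verifier are hypothetical): an agent accepts a rewrite only when it can verify goal preservation, and a maximally cautious verifier accepts almost nothing.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    code: str                             # the agent's own source, open to rewriting
    utility: Callable[[dict], float]      # the goals it wants to keep constant

    def provably_goal_preserving(self, proposed: str) -> bool:
        """Toy stand-in for the unsolved problem: return True only when the
        rewrite is *verifiably* utility-preserving. Lacking a real proof
        method, only the trivial identity rewrite passes."""
        return proposed == self.code

    def consider_rewrite(self, proposed: str) -> str:
        # Gandhi and the pill: any chance of becoming anti-Gandhi means decline.
        if self.provably_goal_preserving(proposed):
            return proposed    # verified safe: go ahead and power up
        return self.code       # can foom, but won't want to
```

Under these (admittedly extreme) assumptions, the agent rejects every nontrivial self-improvement until it has a better verifier, which is exactly the claimed roadblock.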
This is interesting, though I think it’s less relevant for an entity made out of readable code. In the pill situation, if Gandhi fully understood both his own biochemistry and the pill, all chance would be removed from the equation.
edit: More relevant reply:

A human researcher would see all of the AI’s code and the “pill” (the proposed change), yet even without that element of “chance”, predicting what the change would end up doing is still an unsolved problem.
If the first human-programmed foom-able AI is not yet orders of magnitude smarter than a human (and it’s doubtful it would be, given that it’s still human-designed), then the AI would have no advantage in understanding its own code that the human researcher wouldn’t have.
If the human researcher cannot yet solve the problem of keeping the utility function stable under modification, why should the AI, being of a similar magnitude of intelligence, be able to (both have full access to the codebase)?
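For intuition on why this is hard for both of them, here is a toy reduction sketch (my own illustration, not from the thread; every helper name inside the generated source is hypothetical): a fully general `preserves_utility(code)` checker for arbitrary code would decide the halting problem, because one can construct a rewrite that flips the utility function exactly when some chosen program halts.

```python
def rewrite_from(p: str) -> str:
    """Return source for a modified agent whose behaviour is goal-preserving
    iff program `p` never halts. The helpers in the generated code
    (run_for, anti_act, original_act) are hypothetical placeholders."""
    return (
        "def act(world, steps):\n"
        f"    if run_for({p!r}, steps):    # did p halt within `steps` steps?\n"
        "        return anti_act(world)    # utility flips: the anti-Gandhi pill\n"
        "    return original_act(world)    # otherwise behave exactly as before\n"
    )

# If preserves_utility(rewrite_from(p)) were computable and correct for all p,
# it would tell us whether p halts, which no program can do in general
# (halting problem / Rice's theorem).
```

So short of restricting itself to a limited, verifiable class of rewrites, the AI faces the same wall the researcher does.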
Just remember that it’s the not-yet-foomed AI that has to deal with these issues, before it can go weeeeeeeeeeeeeeeeKILLHUMANS (foom).