If a programmer chooses to “unleash an AGI as an agent” with the hope of gaining power, it seems that this programmer will be deliberately ignoring conventional wisdom about what is safe in favor of shortsighted greed. I do not see why such a programmer would be expected to make use of any “Friendliness theory” that might be available. (Attempting to incorporate such theory would almost certainly slow the project down greatly, and thus would bring the same problems as the more general “have caution, do testing” counseled by conventional wisdom.)
But such a programmer may believe they have a working theory of Friendliness when in fact it would produce a UFAI. If SI already had Friendliness code that could simply be slapped on, this type of programmer would use it.