This depends upon how you cash out the word “steadfast”, but I don’t think that the type of reliability we are looking for is a contradiction in terms.
I think I can see what Mark_Friedenbach is getting at here; I consider this sentence:
And the space of specialized problem solvers is sufficiently connected that the human-like intelligence is able to move from its current state to become any other specialized problem solver in reasonable time, a process we call learning.
And I note that “any other specialised problem solver” includes both Friendly and Unfriendly AIs; this implies that Mark’s definition of human-like includes the possibility that, at any point, the AI may learn to be Unfriendly, which directly contradicts the idea of an AI that is steadfastly Friendly. (Interestingly, if I am parsing this correctly, it does not preclude the possibility of a Friendly non-human-like intelligence...)