This is a great article that I would like to see go further with respect to both people and AGI.
With respect to people, it seems to me that, once we assume intent, we build on that error by then assuming the stability of that intent (because people's intents tend to be fairly stable), which then causes us to feel shock when that intent suddenly changes. We might then see this as intentional deceit and wander ever further from the truth: that it was only an unconscious whim in the first place.
Regarding AGI, this is linked to unwarranted anthropomorphism, again leading to unwarranted assumptions of stability. In this case the problem appears to be that we really cannot think like a machine. For an AGI, at least on current understandings, goals are, objectively, more or less stable, but our judgement of that stability is not well founded. For current AI, it does not even make sense to talk about the strength of a "preference" or an "intent" except as an observed statistical phenomenon. From a software point of view, the future values of two possible actions are calculated and one number is bigger than the other. There is no difference, in the decision-making process, between a value gap of 1,000,000 and one of 0.000001: in either case the action with the larger value will be pursued. Unlike a human, an AI will never perform an action halfheartedly.
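To make that point concrete, here is a minimal sketch of value-based action selection. The function name and the numbers are illustrative, not taken from any real system; it simply shows that only the ordering of the estimated values matters to the decision, never the size of the gap between them.

```python
# Hypothetical sketch: an agent's action selection reduced to an argmax.
# Only the ordering of the values matters; the margin is irrelevant.

def choose_action(estimated_values: dict[str, float]) -> str:
    """Return the action with the highest estimated future value."""
    return max(estimated_values, key=estimated_values.get)

# A gap of 1,000,000 and a gap of 0.000001 produce the same decision:
print(choose_action({"act_a": 1_000_000.0, "act_b": 0.0}))  # act_a
print(choose_action({"act_a": 1.000001, "act_b": 1.0}))     # act_a
```

Nothing in that selection step encodes anything like the felt strength of a preference, which is why "halfhearted" has no analogue here.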