I'm also unsure how much people think that's the primary problem. I feel fairly confident that Eliezer thinks (or thought at some recent point) that this was the primary problem. I came into the field thinking of this as the primary problem.
It certainly seems that many people assume that a superintelligent AI system has a utility function. I don’t know their reasons for this assumption.
The standard rebuttal here is that even if a superintelligent AI system is not goal-directed, we should be concerned that it will spontaneously develop goal-directed behavior, because such behavior is instrumentally valuable for whatever it is doing. (And isn't "whatever it is doing" itself a "goal", even if the AI doesn't conceive of it as one, in the same way that a calculator has a "goal" or purpose even though the calculator is unaware of it?) This is of course contingent on it being "superintelligent".
For what it’s worth, this is also the origin, as I recall it, of concerns about paperclip maximizers: you won’t build an AI that sets out to tile the universe with paperclips, but through a series of unfortunate misunderstandings it will, via a subagent or an instrumental action, end up optimizing for paperclips anyway because it seemed like a good idea at the time.
It sounds to me like you’re requiring “superintelligent” to include “has a goal” as part of the definition. If that’s part of the definition, then I would rephrase my point as “why do we have to build something superintelligent? Let’s instead build something that doesn’t have a goal but is still useful, like an AI system that follows norms.”
See also this comment, which answers a related question.