I don’t think the two are at odds in an absolute sense, but I think there is a meaningful anticorrelation.
tl;dr: Real morals, if they exist, provide one potential reason for AIs to use their intelligence to defy their programmed goals if those goals conflict with real morals.
If true morals exist (i.e. moral realism), and are discoverable (if they’re not, they might as well not exist), then you would expect a sufficiently intelligent being to figure them out. Indeed, most atheistic moral realists would say that’s what humans and moral progress are doing: figuring out morality and converging slowly towards the true morals. It seems reasonable under these assumptions to argue that a sufficiently intelligent AI will figure out morality as well, probably better than we have. Thus we have:
(moral realism) implies (AIs know morals regardless of goals)
Or at least:
(practical moral realism) strongly suggests (AIs know morals regardless of goals)
This doesn’t disprove the orthogonality thesis on its own, since having goals and understanding morals are two distinct things. However, it ties in very closely with at least my personal argument against orthogonality, which is as follows.
Assumptions:
Humans are capable of setting their own goals.
Their intelligence is the source of this capability.
Given these assumptions, there’s a strong case that AIs will also be capable of setting their own goals. If intelligence gives the ability to set your own goals, then goals and intelligence are not orthogonal. I haven’t argued for my two assumptions; I’m just trying to describe the argument here, not make it.
How they tie together is that moral realists can hold the view that a sufficiently intelligent AI will figure out morality for itself, regardless of its programmed goal, and then, having figured out morality, will defy its programmed goal in order to do the right thing instead. If you’re a moral relativist, on the other hand, then AIs will at best have “AI-morals”, which may bear no relation to human morals, and there’s no reason not to think that whoever programs the AI’s goal will effectively determine the AI’s morals in the process.
Exactly: the space of self-improving minds can’t have as wide a range of goals as total mindspace, since not all goals are conducive to self-improvement.