Humans haven’t figured out metaethics well enough to show that moral realism is true. So there is a probability, not a certainty, that an AI will realise moral truths. The argument also requires the AI to be motivated by the truths it discovers, and it requires preserving human life to be an objective moral imperative. The latter point isn’t obvious: there’s a standard sci-fi plot in which a powerful AI is tasked with solving the world’s problems and decides humans are the problem. Three uncertain premises have to be true simultaneously, so moral realism is far from a surefire solution to AI safety.
For these reasons, AI safety theorists focus on friendliness, the preservation of humans, as a direct goal, rather than on objective goodness.