Human brains are a priori aligned with human values. Human brains are proof positive that a general intelligence can be aligned with human values. Wetware is an awful computational substrate. Silicon ought to work better.
Arguments by definition don’t work. If by “human values” you mean “whatever humans end up maximizing”, then sure, but we are unstable and can be manipulated, which isn’t what we want in an AI. And if you mean “what humans deeply want or need”, then human actions don’t seem very aligned with that, so we’re back at square one.
Humans aren’t aligned once you break the abstraction of “humans” down. There’s nobody I would trust to be a singleton with absolute power over me (though if I had to take my chances, I’d rather have a human than a random AI).
I like your perspective here, but I don’t think it’s a given that human brains are necessarily aligned with human ‘values’. It entirely depends on how we define human values. Let’s suppose that long-term existential risk reduction is a human value (it ranks highly in nearly all moral theories). Because of cognitive limitations and biases, most human brains aren’t aligned with this value.
Definition implies equality. Equality is symmetric. If “human values” equals “whatever vague cluster of things human brains are pointing at”, then “whatever vague cluster of things human brains are pointing at” equals “human values”.