Good point, Jef—Eliezer is attributing the validity of “the ends don’t justify the means” entirely to human fallibility, and neglecting the part accounted for by the unpredictability of outcomes.
He may have some model of an AI as a perfect Bayesian reasoner that he uses to justify neglecting this. I am immediately suspicious of any argument invoking perfection.
I don’t know what “a model of evolving values increasingly coherent over increasing context, with effect over increasing scope of consequences” means.