Fortunately, before the fundamentally wrongheaded enterprise of Pebblesorter AI gets too far along, a brilliant young AI researcher realizes that by analyzing and extrapolating the common core of Pebblesorter ethical judgments, it can build an AI that implements the computation that leads Pebblesorters to endorse certain piles and reject others.
An AI built to optimize for that computation, it realizes, would be Friendly: that is, it would implement what Pebblesorters want, and they could therefore rely on it to ethically order the world.
A traditionalist skeptic objects that all Pebblesorter ethical arguments, at least for ethical problems up to 1957, have been written down in the Great Book for generations; there’s no need for more.
“That’s true,” replies the researcher, “but that’s just a Not Particularly Large Lookup Table. Sure, such an approach is adequate for all the cases that have ever come up in our entire history, but this coherent extrapolated algorithm could be extended to novel ethical questions like ‘300007’ and still be provably correct.”
“But how do we know that’s the right thing?” retorts a Heap Relativist. “Sure, it’s what we want, but why should we privilege that?”
“That’s a wrong question,” replies the researcher. “It presumes there’s some kind of magical Sorter in the sky, or something like that. But there could not possibly be such a thing. Even if there were such a Sorter, we’d have no reason to accept its piles as right unless we judged them to be right based on our computations. ‘Right’ simply means the computation we use to determine whether a pile is correct. What else could it possibly mean?”
“But non-Pebblesorters would disagree,” objects the Relativist. “Imagine an alien race… call them humans… who don’t share our computation. They would be just as happy with a pile of, I don’t know, nine pebbles as with a pile of seven.”
“True, but so what? They aren’t right.”
Shortly thereafter, the researcher has the key insight of Factorial Decision Theory, which lets it derive the computation representing Pebblesorter volition; that computation turns out to be surprisingly simple. It then builds an optimizer that implements that volition, thereby moving its entire light cone to an ethically optimal state.
Coincidentally, the Heap Relativist is the last survivor of this process, and looks around itself at the ethically optimized biomass of its species as the AI’s effectors start to convert its body into a pile of 300007 pebbles.
At that moment, it is enlightened.