The AI that understands which details are morally salient is one that doesn’t need the oversight.
That’s quite non-obvious to me. It seems like a fairly arbitrary claim.
You’re basically saying that if an intelligent mind (A for Alice) knows that a person (B for Bob) will care about a certain consequence C, then A will definitely know how much B will care about it.
This isn’t the case for real human minds. If Alice is a human mechanic and tells Bob “I can fix your car, but it’ll cost $200”, then Alice knows that Bob will care about the cost, but doesn’t know how much Bob will care, or whether Bob would rather have a fixed car or keep the $200.
So if your claim doesn’t even hold for human minds, why do you think it applies to non-human minds?
And even if it does hold, what about the case where Alice doesn’t know whether a detail is morally salient, but errs on the side of caution? E.g., Alice the waitress asks Bob the customer, “The chocolate ice cream you asked for also has some crushed peanuts in it. Is that okay?” Bob can respond “Of course, why should I care about that?” or alternatively “It’s not okay, I’m allergic to peanuts!”
In this case Alice the waitress doesn’t know if the detail is salient to Bob, but asks just to make sure.