Once again we run into the general problem: when an Oracle is given a decision (possible outputs: YES; NO; …), every option has consequences, potentially drastic ones.
If we’re just dealing with an Oracle, we can pipe the actual answer through some version of utility indifference (slightly more subtle, as the measure of reduced impact doesn’t look much like a utility function).
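As a toy illustration (my own construction, not anything from the original discussion, and assuming the simple case where impact can be scored as a utility), the indifference idea is to add a compensation term that exactly cancels the utility of each answer's downstream consequences, leaving only the incentive to answer accurately:

```python
# Toy sketch of utility indifference for an Oracle (hypothetical setup;
# the function and utility values below are illustrative, not from the post).
def compensated_utility(answer, truth, consequence_utility):
    """Score an answer by accuracy alone: the compensation term exactly
    cancels the utility of the answer's downstream consequences, so the
    oracle gains nothing by shading its answer for social impact."""
    accuracy = 1.0 if answer == truth else 0.0
    consequences = consequence_utility[answer]
    compensation = -consequence_utility[answer]  # indifference correction
    return accuracy + consequences + compensation

impact = {"YES": 3.7, "NO": -1.2}  # hypothetical downstream utilities
print(compensated_utility("YES", "YES", impact))  # accurate answer: 1.0
print(compensated_utility("NO", "YES", impact))   # inaccurate answer: 0.0
```

In this simplified form the compensation trivially cancels the consequence term; the subtlety flagged above is precisely that a real reduced-impact measure doesn't decompose this cleanly into a utility function.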
For a general agent, though, I think “can this work if we magically assume there are no major social consequences?” is a fair question to ask, and a “yes” would be of great interest. After that, we can drop the assumption and see if the full problem is solvable.