Not necessarily a reason to deprioritize AI x-risk work, given that unaligned AI could be bad from an s-risk perspective as well:
Pain seems to have evolved because it has a functional purpose in guiding behavior: evolution having found it suggests that pain might be the simplest solution for achieving its purpose. A superintelligence which was building subagents, such as worker robots or disembodied cognitive agents, might then also construct them in such a way that they were capable of feeling pain, and thus possibly suffering (Metzinger 2015), if that was the most efficient way of making them behave in a way that achieved the superintelligence’s goals.
Humans have also evolved to experience empathy towards each other, but the evolutionary reasons which cause humans to have empathy (Singer 1981) may not be relevant for a superintelligent singleton which had no game-theoretical reason to empathize with others. In such a case, a superintelligence which had no disincentive to create suffering but did have an incentive to create whatever furthered its goals, could create vast populations of agents which sometimes suffered while carrying out the superintelligence’s goals. Because of the ruling superintelligence’s indifference towards suffering, the amount of suffering experienced by this population could be vastly higher than it would be in e.g. an advanced human civilization, where humans had an interest in helping out their fellow humans. [...]
If attempts to align the superintelligence with human values failed, it might not put any intrinsic value on avoiding suffering, so it might create large numbers of suffering subroutines.
I agree there’s substantial overlap, but there could be cases where “what’s best for reducing x-risk” and “what’s best for reducing s-risk” really come apart. If I saw a clear-cut case for that, I’d be inclined to favor s-risk reduction (modulo, e.g., comparative advantage considerations).
That’s certainly true. To be clear, my argument was not “these types of work are entirely overlapping”, but rather just that “taking s-risk seriously doesn’t necessarily mean no overlap with x-risk prevention”.
A counter-argument to this would be the classic s-risk example of a cosmic ray particle flipping the sign on the utility function of an otherwise Friendly AI, causing it to maximize suffering that would dwarf any accidental suffering caused by a paperclip maximizer.
That seems like a reason to work on AI alignment and figure out ways to avoid that particular failure mode, e.g. hyperexistential separation.
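Since the sign-flip scenario and hyperexistential separation came up, here is a minimal toy sketch of what is at stake. Everything in it, including the outcome scale, the names, and the `separated_utility` floor, is an illustrative assumption rather than anything from the quoted sources: negating a utility function turns a maximizer into an agent that strictly prefers the worst outcome, whereas a utility whose floor coincides with a neutral “nothing exists” outcome at least removes that strict preference, which is roughly the intuition behind keeping the “worst” region of the utility function away from the sign-flipped optimum.

```python
# Toy sketch of the sign-flip failure mode and (roughly) the intuition behind
# hyperexistential separation. All outcome values, names, and functions here
# are illustrative assumptions, not anything from the quoted sources.

# Outcome scale: -1.0 = astronomical suffering, 0.0 = nothing exists, 1.0 = flourishing.
OUTCOMES = {"suffering": -1.0, "nonexistence": 0.0, "flourishing": 1.0}

def naive_utility(outcome: float) -> float:
    # Utility tracks the outcome directly, so its minimum coincides with
    # maximal suffering.
    return outcome

def separated_utility(outcome: float) -> float:
    # "Separated" variant: everything at or below nonexistence is floored at
    # the neutral value, so no outcome scores worse than "nothing exists".
    return max(outcome, 0.0)

def preferred_outcomes(utility, sign: float = 1.0) -> list[str]:
    # The agent picks whatever maximizes sign * utility(outcome);
    # sign = -1.0 models a corrupted (sign-flipped) objective.
    scores = {name: sign * utility(value) for name, value in OUTCOMES.items()}
    best = max(scores.values())
    return [name for name, score in scores.items() if score == best]

print(preferred_outcomes(naive_utility, sign=-1.0))
# ['suffering']: the flipped agent strictly prefers maximal suffering.
print(preferred_outcomes(separated_utility, sign=-1.0))
# ['suffering', 'nonexistence']: suffering no longer scores better than doing
# nothing; an actual design would go further and make the flipped optimum
# something benign, such as shutdown.
```

The toy only shows that the choice of the utility function’s floor changes what a sign-flipped optimizer ends up preferring; it is not meant as a faithful implementation of any proposed alignment scheme.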