I’m still convinced this is a red herring. Focus on making sure the FAI’s utility is aligned with whatever values you have, and it will be no more nor less susceptible to extortion/trade than is rational to maximize those values.
If some EAI credibly threatens to destroy the universe, I WANT my FAI to stop that outcome, which likely means cooperating.
Yeah I think I agree with this. Although I can sort of see the intuition this is coming from (“If you don’t do this, I’ll torture you” --> just make it impossible to feel pain beyond a certain magnitude), I think this causes some hefty problems in other areas, such as “If you don’t do this, I’ll kill you.” If an agent is trying to maximize the sum of its utility over it’s entire lifetime, murdering it would result in the loss of that entire sum of utility, and the agent has every incentive to preserve itself. That might mean giving in to the extortion. Whatever loss was incurred by giving into the extortion might be made up for by not being killed, and preserving it’s ability to continue performing actions, if that were the situation.
I’m still convinced this is a red herring. Focus on making sure the FAI’s utility is aligned with whatever values you have, and it will be no more nor less susceptible to extortion/trade than is rational to maximize those values.
If some EAI credibly threatens to destroy the universe, I WANT my FAI to stop that outcome, which likely means cooperating.
Yeah I think I agree with this. Although I can sort of see the intuition this is coming from (“If you don’t do this, I’ll torture you” --> just make it impossible to feel pain beyond a certain magnitude), I think this causes some hefty problems in other areas, such as “If you don’t do this, I’ll kill you.” If an agent is trying to maximize the sum of its utility over it’s entire lifetime, murdering it would result in the loss of that entire sum of utility, and the agent has every incentive to preserve itself. That might mean giving in to the extortion. Whatever loss was incurred by giving into the extortion might be made up for by not being killed, and preserving it’s ability to continue performing actions, if that were the situation.