Yeah, I would be very nervous about making an exception to my assistant’s corrigibility. Ultimately, it would be prudent to be able to make some hard commitments after thinking very long and carefully about how to do that. In the meantime, here are a couple corrigibility-preserving commitment mechanisms off the top of my head:
Escrow: Put resources in a dumb incorrigible box that releases them under certain conditions.
The AI can make very short-lived incorrigible commitments within atomic actions (like completing a purchase).
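The escrow idea can be sketched as a tiny state machine: resources go into a "dumb box" whose release condition is frozen at creation and can't be renegotiated afterwards. This is a minimal, hypothetical Python sketch (the `Escrow` class and its condition callback are illustrative, not any real API), assuming "resources" are just a numeric amount:

```python
import time


class Escrow:
    """A 'dumb incorrigible box': holds an amount and releases it only
    when a condition fixed at creation time holds. The rigidity is the
    point -- no one (including the AI) can change the condition later."""

    def __init__(self, amount, release_condition):
        self._amount = amount
        self._release_condition = release_condition  # frozen at creation
        self._released = False

    def try_release(self):
        """Release the held amount if the condition holds, once only."""
        if not self._released and self._release_condition():
            self._released = True
            return self._amount
        return None


# Illustrative use: funds locked until a deadline has passed.
deadline = time.time() - 1  # already in the past, so release succeeds
escrow = Escrow(100, release_condition=lambda: time.time() > deadline)
escrow.try_release()  # releases 100
escrow.try_release()  # subsequent calls release nothing
```

The commitment force comes entirely from the box being outside the AI's (and your) control once created, which is why it preserves the AI's own corrigibility.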
Are these enough to maintain competitiveness?
This seems like a role for the law. Something like corrigibility, except where it would mean breaking the law. That seems reasonable at first glance, but I know too little about law in different countries to judge how uncompetitive it would make the AIs.
(There’s also a risk of giving too much power to the legislative authority in your country, if you’re worried about that kind of thing)
Although I could imagine something like a modern-day VPN that makes your AI believe it's in another country, so you can get it to do something that's illegal where you actually are. That's bad in a country with useful laws and good in a country under an authoritarian regime.